Python interface to the SuiteSparse Matrix Collection

This notebook walks you through some of the features of the ssgetpy package, which provides a search and download interface for the SuiteSparse Matrix Collection.

The simplest way to install ssgetpy is via:

pip install ssgetpy

This installs both the ssgetpy Python module and the ssgetpy command-line script.

This notebook only covers the library version of ssgetpy. To get more information on the command-line script, run:

$ ssgetpy --help

Before proceeding with the rest of this notebook, please install ssgetpy into your environment. If you are running this notebook under Binder, ssgetpy will already be installed for you. If you are running this notebook in Google Colaboratory, the following cell will install ssgetpy:

In [1]:
ipy = get_ipython()
if 'google.colab' in str(ipy):
  import sys
  ipy.run_cell('!{sys.executable} -m pip install ssgetpy')

First import ssgetpy via:

In [2]:
import ssgetpy

Basic query interface

The primary interface to ssgetpy is via ssgetpy.search. Running search without any arguments returns the first 10 matrices in the collection:

In [3]:
ssgetpy.search()
Out[3]:
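| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HB | 1138_bus | 1138 | 1138 | 4054 | real | No | Yes | 1.0 | 1.0 | power network problem | (image) |
| 2 | HB | 494_bus | 494 | 494 | 1666 | real | No | Yes | 1.0 | 1.0 | power network problem | (image) |
| 3 | HB | 662_bus | 662 | 662 | 2474 | real | No | Yes | 1.0 | 1.0 | power network problem | (image) |
| 4 | HB | 685_bus | 685 | 685 | 3249 | real | No | Yes | 1.0 | 1.0 | power network problem | (image) |
| 5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |
| 6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | (image) |
| 7 | HB | ash219 | 219 | 85 | 438 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |
| 8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | (image) |
| 9 | HB | ash331 | 331 | 104 | 662 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |
| 10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |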

Notice that the search result comes with minimal Jupyter integration: it shows some metadata along with a spy plot of the distribution of the non-zero values. Click on the group or name link to go to a web page in the SuiteSparse Matrix Collection that has much more information about the group or the matrix, respectively.

Query filters

You can add more filters via keyword arguments as follows:

| Argument | Description | Type | Default | Notes |
|---|---|---|---|---|
| rowbounds | Number of rows | tuple: (min_value, max_value) | (None, None) | min_value or max_value can be None, which implies "don't care" |
| colbounds | Number of columns | tuple: (min_value, max_value) | (None, None) | |
| nzbounds | Number of non-zeros | tuple: (min_value, max_value) | (None, None) | |
| isspd | SPD? | bool or None | None | None implies "don't care" |
| is2d3d | 2D/3D Discretization? | bool or None | None | |
| dtype | Non-zero data type | real, complex, binary or None | None | |
| group | Matrix group | str or None | None | Supports partial matches; None implies "don't care" |
| kind | Problem domain | str or None | None | Supports partial matches; None implies "don't care" |
| limit | Max number of results | int | 10 | |

Note that numerical and pattern symmetry filters are not yet supported.
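
To make the "don't care" semantics of the bounds tuples concrete, here is a minimal pure-Python sketch. It does not call ssgetpy and is not how the library is implemented; the helper `within` is hypothetical, and the inclusive comparison is an assumption based on the $\leq$ notation used in this notebook. The records are copied from the listing of the first 10 matrices.

```python
def within(value, bounds):
    """Return True if value lies in the inclusive (min, max) range.

    A bound of None means "don't care" on that side.
    """
    lo, hi = bounds
    return (lo is None or value >= lo) and (hi is None or value <= hi)


# (name, rows, cols, nnz) records copied from the default search output.
records = [
    ("1138_bus", 1138, 1138, 4054),
    ("494_bus", 494, 494, 1666),
    ("ash219", 219, 85, 438),
    ("ash608", 608, 188, 1216),
]

# Same spirit as rowbounds=(500, None): at least 500 rows, no upper limit.
tall = [name for name, rows, cols, nnz in records if within(rows, (500, None))]
print(tall)
```

With these four records, only 1138_bus and ash608 have 500 or more rows, so those are the two names printed.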

As an example of using the above filters, here is a query that returns five non-SPD matrices with $1000\leq \text{NNZ} \leq 10000$:

In [4]:
ssgetpy.search(nzbounds=(1000,10000), isspd=False, limit=5)
Out[4]:
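| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |
| 6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | (image) |
| 8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | (image) |
| 10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |
| 12 | HB | ash958 | 958 | 292 | 1916 | binary | No | No | 0.0 | 0.0 | least squares problem | (image) |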

Working with search results

The result of a search query is a collection of Matrix objects. The collection can be sliced using the same syntax as ordinary Python lists, as shown below:

In [5]:
result = ssgetpy.search(kind='structural', nzbounds=(1000,10000))
result[:4]
Out[5]:
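| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | (image) |
| 26 | HB | bcsstk04 | 132 | 132 | 3648 | real | Yes | Yes | 1.0 | 1.0 | structural problem | (image) |
| 27 | HB | bcsstk05 | 153 | 153 | 2423 | real | Yes | Yes | 1.0 | 1.0 | structural problem | (image) |
| 28 | HB | bcsstk06 | 420 | 420 | 7860 | real | Yes | Yes | 1.0 | 1.0 | structural problem | (image) |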

An individual element in the collection can be used as follows:

In [6]:
small_matrix = result[0]
small_matrix
Out[6]:
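| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | (image) |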
In [7]:
small_matrix.nnz
Out[7]:
4356
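
Per-matrix metadata such as nnz can be combined into derived quantities. The sketch below computes the fill density of the four structural matrices listed earlier. To keep it runnable without network access it uses a hypothetical stand-in `MatrixInfo` rather than real ssgetpy objects; only the `nnz` attribute is demonstrated in this notebook, so `name`, `rows` and `cols` are assumed attribute names.

```python
from collections import namedtuple

# Hypothetical stand-in for an ssgetpy Matrix object. Only `nnz` is shown in
# this notebook; `name`, `rows` and `cols` are assumed attribute names.
MatrixInfo = namedtuple("MatrixInfo", ["name", "rows", "cols", "nnz"])


def density(m):
    """Fraction of the rows x cols entries that are stored non-zeros."""
    return m.nnz / (m.rows * m.cols)


# Values copied from the structural-problem query output.
matrices = [
    MatrixInfo("bcsstk02", 66, 66, 4356),
    MatrixInfo("bcsstk04", 132, 132, 3648),
    MatrixInfo("bcsstk05", 153, 153, 2423),
    MatrixInfo("bcsstk06", 420, 420, 7860),
]

# Print the matrices from densest to sparsest.
for m in sorted(matrices, key=density, reverse=True):
    print(f"{m.name}: {density(m):.3f}")
```

Note that bcsstk02 has 66 x 66 = 4356 entries and 4356 non-zeros, so despite living in a sparse matrix collection it is completely dense.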

We can download a matrix locally using the download method:

In [8]:
small_matrix.download()
Out[8]:
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz',
 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')

The download method supports the following arguments:

| Argument | Description | Data type | Default value | Notes |
|---|---|---|---|---|
| format | Sparse matrix storage format | One of ('MM', 'RB', 'MAT') | MM | MM is Matrix Market, RB is Rutherford-Boeing and MAT is MATLAB MAT-file format |
| destpath | Path to download | str | ~/.ssgetpy on Unix, %APPDATA%\ssgetpy on Windows | The full filename for the matrix is obtained via os.path.join(destpath, format, group_name, matrix_name + extension), where extension is .tar.gz for MM and RB and .mat for MAT |
| extract | Extract TGZ archive? | bool | False | Only applicable to MM and RB formats |
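
The path rule in the destpath notes can be sketched as a small helper. This only mirrors the documented rule for illustration: `expected_local_path` and `EXTENSIONS` are hypothetical names and not part of ssgetpy, and the base directory used here is the documented Unix default.

```python
import os

# Extensions as described in the table: archives for MM and RB, a plain file for MAT.
EXTENSIONS = {"MM": ".tar.gz", "RB": ".tar.gz", "MAT": ".mat"}


def expected_local_path(destpath, fmt, group, name):
    """Hypothetical helper mirroring the documented rule; not part of ssgetpy."""
    return os.path.join(destpath, fmt, group, name + EXTENSIONS[fmt])


# Documented default location on Unix; on Windows the default is under %APPDATA%.
base = os.path.join(os.path.expanduser("~"), ".ssgetpy")
print(expected_local_path(base, "MM", "HB", "bcsstk02"))
print(expected_local_path(base, "MAT", "HB", "bcsstk02"))
```

A helper like this can be handy for checking whether a matrix is already present locally before deciding to download it.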

The return value is a two-element tuple of local paths. Without extraction, both elements point to the downloaded archive. With extract=True, the outputs in this notebook show the extracted location as the first element and the archive as the second.

Note that download does not actually download the file again if it already exists at the destination path.

In [9]:
small_matrix.download()
Out[9]:
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz',
 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')
In [10]:
small_matrix.download(extract=True)
Out[10]:
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02',
 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')
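
If an archive was downloaded without extract=True, it can also be unpacked by hand with the standard library. The following is a minimal sketch, not ssgetpy's own extraction code: `extract_archive` is a hypothetical helper, and the archive layout (a directory containing a .mtx file) is an assumption. To stay self-contained, the demo builds a tiny stand-in archive instead of downloading one.

```python
import os
import tarfile
import tempfile


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive and return paths of the .mtx files it contained."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
        names = tar.getnames()
    return [os.path.join(dest_dir, n) for n in names if n.endswith(".mtx")]


# Build a tiny stand-in archive (assumed layout: <name>/<name>.mtx).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "demo.mtx")
with open(src, "w") as f:
    f.write("%%MatrixMarket matrix coordinate real general\n2 2 1\n1 1 1.0\n")
archive = os.path.join(workdir, "demo.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(src, arcname="demo/demo.mtx")

# Unpack it and list the Matrix Market files that were found.
mtx_files = extract_archive(archive, os.path.join(workdir, "out"))
print(mtx_files)
```

The returned .mtx paths can then be handed to any Matrix Market reader of your choice.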

Finally, download also works directly on the output of search, so you don't have to download one matrix at a time. For example, to download the first five matrices in the previous query, you could use:

In [11]:
result[:5].download()
