This notebook walks you through some of the features of the ssgetpy package, which provides a search and download interface for the SuiteSparse Matrix Collection.
The simplest way to install ssgetpy is via:
pip install ssgetpy
This installs both the ssgetpy Python module and the ssgetpy command-line script.
This notebook only covers the library version of ssgetpy. To get more information on the command-line script, run:
$ ssgetpy --help
Before proceeding with the rest of this notebook, please install ssgetpy into your environment. If you are running this notebook under Binder, ssgetpy will already be installed for you. If you are running this notebook in Google Colaboratory, the following cell will install ssgetpy:
ipy = get_ipython()
# Install ssgetpy only when running inside Google Colaboratory
if 'google.colab' in str(ipy):
    import sys
    ipy.run_cell('!{sys.executable} -m pip install ssgetpy')
First import ssgetpy via:
import ssgetpy
The primary interface to ssgetpy is via ssgetpy.search. Running search without any arguments returns the first 10 matrices in the collection:
ssgetpy.search()
Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | HB | 1138_bus | 1138 | 1138 | 4054 | real | No | Yes | 1.0 | 1.0 | power network problem | |
2 | HB | 494_bus | 494 | 494 | 1666 | real | No | Yes | 1.0 | 1.0 | power network problem | |
3 | HB | 662_bus | 662 | 662 | 2474 | real | No | Yes | 1.0 | 1.0 | power network problem | |
4 | HB | 685_bus | 685 | 685 | 3249 | real | No | Yes | 1.0 | 1.0 | power network problem | |
5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
7 | HB | ash219 | 219 | 85 | 438 | binary | No | No | 0.0 | 0.0 | least squares problem | |
8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
9 | HB | ash331 | 331 | 104 | 662 | binary | No | No | 0.0 | 0.0 | least squares problem | |
10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem |
Notice that the search result comes with minimal Jupyter integration that shows some metadata along with the distribution of the non-zero values. Click on the group or name link to go to a web page in the SuiteSparse Matrix Collection that has much more information about the group or the matrix, respectively.
You can add more filters via keyword arguments as follows:
Argument | Description | Type | Default | Notes |
---|---|---|---|---|
rowbounds | Number of rows | tuple: (min_value, max_value) | (None, None) | min_value or max_value can be None, which implies "don't care" |
colbounds | Number of columns | tuple: (min_value, max_value) | (None, None) | |
nzbounds | Number of non-zeros | tuple: (min_value, max_value) | (None, None) | |
isspd | SPD? | bool or None | None | None implies "don't care" |
is2d3d | 2D/3D Discretization? | bool or None | None | |
dtype | Non-zero data type | real, complex, binary or None | None | |
group | Matrix group | str or None | None | Supports partial matches; None implies "don't care" |
kind | Problem domain | str or None | None | Supports partial matches; None implies "don't care" |
limit | Max number of results | int | 10 | |
Note that numerical and pattern symmetry filters are not yet supported.
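The bounds arguments can be read as ranges in which None leaves one side open. The following is a minimal sketch of that reading in plain Python. The helper in_bounds is hypothetical and written only for illustration; it is not part of ssgetpy, and treating both ends as inclusive is an assumption based on how this notebook describes nzbounds.

```python
def in_bounds(value, bounds=(None, None)):
    """Return True if value lies within (min_value, max_value), inclusive.

    A None on either side means "don't care" for that side.
    Hypothetical helper for illustration; not part of ssgetpy.
    """
    min_value, max_value = bounds
    if min_value is not None and value < min_value:
        return False
    if max_value is not None and value > max_value:
        return False
    return True


# NNZ values copied from the first search result shown earlier.
nnz_by_name = {"1138_bus": 4054, "494_bus": 1666, "ash219": 438, "ash331": 662}

# Keep matrices with at least 1000 non-zeros and no upper limit.
selected = [name for name, nnz in nnz_by_name.items() if in_bounds(nnz, (1000, None))]
print(selected)
```

With the default (None, None), every value passes, which matches the "don't care" behaviour described in the table.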
As an example of using the above filters, here is a query that returns five non-SPD matrices with $1000\leq \text{NNZ} \leq 10000$:
ssgetpy.search(nzbounds=(1000,10000), isspd=False, limit=5)
Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
---|---|---|---|---|---|---|---|---|---|---|---|---|
5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | |
12 | HB | ash958 | 958 | 292 | 1916 | binary | No | No | 0.0 | 0.0 | least squares problem |
The result of a search query is a collection of Matrix objects. The collection can be sliced using the same syntax as for vanilla Python lists, as shown below:
result = ssgetpy.search(kind='structural', nzbounds=(1000,10000))
result[:4]
Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
---|---|---|---|---|---|---|---|---|---|---|---|---|
24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
26 | HB | bcsstk04 | 132 | 132 | 3648 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
27 | HB | bcsstk05 | 153 | 153 | 2423 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
28 | HB | bcsstk06 | 420 | 420 | 7860 | real | Yes | Yes | 1.0 | 1.0 | structural problem |
An individual element in the collection can be used as follows:
small_matrix = result[0]
small_matrix
small_matrix.nnz
4356
We can download a matrix locally using the download method:
small_matrix.download()
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz', 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')
The download method supports the following arguments:
Argument | Description | Data type | Default value | Notes |
---|---|---|---|---|
format | Sparse matrix storage format | One of ('MM', 'RB', 'MAT') | MM | MM is Matrix Market, RB is Rutherford-Boeing and MAT is MATLAB MAT-file format |
destpath | Path to download | str | ~/.ssgetpy on Unix, %APPDATA%\ssgetpy on Windows | The full filename for the matrix is obtained via os.path.join(destpath, format, group_name, matrix_name + extension), where extension is .tar.gz for MM and RB and .mat for MAT |
extract | Extract TGZ archive? | bool | False | Only applicable to MM and RB formats |
The return value is a two-element tuple containing the local path where the matrix was downloaded to along with the path for the extracted file, if applicable.
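The filename rule given for destpath in the table above can be sketched in a few lines. The helper expected_path and the EXTENSIONS mapping are hypothetical names introduced here for illustration; they only restate the rule as the table describes it and are not ssgetpy functions.

```python
import os

# File extension per storage format, as described in the table above.
EXTENSIONS = {"MM": ".tar.gz", "RB": ".tar.gz", "MAT": ".mat"}


def expected_path(destpath, fmt, group_name, matrix_name):
    """Build the download location following the rule stated in the table.

    Hypothetical helper for illustration; not part of ssgetpy.
    """
    return os.path.join(destpath, fmt, group_name, matrix_name + EXTENSIONS[fmt])


# "data" is an arbitrary example directory, not a library default.
print(expected_path("data", "MM", "HB", "bcsstk02"))
print(expected_path("data", "MAT", "HB", "bcsstk02"))
```

The separators in the printed paths depend on the operating system, which is why the outputs in this notebook show backslashes.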
Note that download does not actually download the file again if it already exists in the path.
small_matrix.download()
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz', 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')
small_matrix.download(extract=True)
('C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02', 'C:\\Users\\drdar\\AppData\\Roaming\\ssgetpy\\MM\\HB\\bcsstk02.tar.gz')
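The skip-if-present behaviour described above is a common caching pattern. Below is a minimal sketch of that pattern using only the standard library. The names fetch_once and fake_fetch are hypothetical, the network request is replaced by a stand-in function, and nothing here should be read as ssgetpy's actual implementation.

```python
import os
import tempfile


def fetch_once(path, fetch):
    """Call fetch() and write its bytes to path only if path does not exist.

    Returns True when a fetch happened, False when the existing file was reused.
    Hypothetical helper for illustration; not ssgetpy's implementation.
    """
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(fetch())
    return True


calls = []


def fake_fetch():
    # Stand-in for a network request; records each invocation.
    calls.append(1)
    return b"placeholder bytes"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "MM", "HB", "bcsstk02.tar.gz")
    first = fetch_once(target, fake_fetch)   # file is missing, so it is fetched
    second = fetch_once(target, fake_fetch)  # file exists, so the fetch is skipped

print(first, second, len(calls))
```

One consequence of this pattern is that a stale or partially written file is also reused, so deleting the local copy is the simplest way to force a fresh fetch in a sketch like this.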
Finally, download also works directly on the output of search, so you don't have to download one matrix at a time. For example, to download the first five matrices in the previous query, you could use:
result[:5].download()