TACO, which stands for Tensor Algebra Compiler, is a library for performing sparse and dense linear algebra and tensor algebra computations.
The computations can range from relatively simple ones like sparse matrix-vector multiplication to more complex ones like matricized tensor times Khatri-Rao product. All these computations can be performed on any mix of dense and sparse tensors. Under the hood, TACO automatically generates efficient code to perform these computations.
This notebook provides a brief introduction to the TACO Python library. For a more comprehensive overview, please see the TACO documentation; we will also link to relevant pages as we progress.
First, let's import TACO. Press `Shift + Enter` to run the code below.
import pytaco as pt
from pytaco import dense, compressed
In the above, `dense` and `compressed` are mode (dimension) formats. We can think of tensors as multi-dimensional arrays, and the mode formats allow us to specify how we would like to store the data in each dimension:

- If a dimension's format is `dense`, then all of the elements in that dimension are stored.
- If a dimension's format is `compressed`, then only the nonzeros are stored.

For example, we can declare a $512 \times 64 \times 2048$ tensor whose first dimension is dense and second and third dimensions are compressed:
T = pt.tensor([512, 64, 2048], pt.format([dense, compressed, compressed]))
We can initialize $T$ by calling its `insert` method to add a nonzero element to the tensor. The `insert` method takes two arguments: a list of coordinates and the value to be inserted at those coordinates:
# Set T(0, 1, 0) = 42.0
T.insert([0, 1, 0], 42.0)
If multiple elements are inserted at the same coordinates, they are summed together:
# Set T(0, 0, 1) = 12.0 + 24.0 = 36.0
T.insert([0, 0, 1], 12.0)
T.insert([0, 0, 1], 24.0)
We can then iterate over the nonzero elements of the tensor as follows:
for coordinates, value in T:
    print("Coordinates: {}, Value: {}".format(coordinates, value))
Consider a matrix $M$ (aka a two-dimensional tensor) containing the following values:
$$M = \begin{bmatrix} 6 & \cdot & 9 & 8 \\ \cdot & \cdot & \cdot & \cdot \\ 5 & \cdot & \cdot & 7 \end{bmatrix}$$

Denote the rows and columns as dimensions $d_1$ and $d_2$, respectively.
Let's look at how $M$ is represented in different formats. For convenience, we first define a helper function to initialize $M$.
def make_example_matrix(format):
    M = pt.tensor([3, 4], format)
    M.insert([0, 0], 6)
    M.insert([0, 2], 9)
    M.insert([0, 3], 8)
    M.insert([2, 0], 5)
    M.insert([2, 3], 7)
    return M
Note that passing in `dense` makes all of the dimensions dense; this is equivalent to `pt.format([dense, dense])`.
make_example_matrix(dense)
For this example, we focus on the last line of the output, the `vals` array: since all values are stored, it is a flattened $3 \times 4$ matrix stored in row-major order.
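Working it out by hand from the definition of $M$, we would expect the `vals` array to be:

[6, 0, 9, 8, 0, 0, 0, 0, 5, 0, 0, 7]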
Next, let's store $d_1$ as dense and $d_2$ as compressed; this is known as the compressed sparse row (CSR) format.
csr = pt.format([dense, compressed])
make_example_matrix(csr)
Since $d_1$ is dense, we need only store the `size` of the dimension; values, both zero and nonzero, are stored for every coordinate in dimension $d_1$. Since $d_2$ is compressed, we store a `pos` array (`[0, 3, 3, 5]`) and an `idx` array (`[0, 2, 3, 0, 3]`); together these form a segmented vector with one segment per entry in the previous dimension. The `idx` array stores all the indices with nonzero values in the dimension, while the `pos` array stores the location in the `idx` array where each segment begins. In particular, segment $i$ is stored in locations `pos[i]:pos[i+1]` of the `idx` array.
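To make the segment arithmetic concrete, here is a small plain-Python sketch (independent of pytaco) that reads off the nonzeros of one row of $M$ from the arrays above:

pos = [0, 3, 3, 5]
idx = [0, 2, 3, 0, 3]
vals = [6, 9, 8, 5, 7]  # nonzeros of M in row-major order

row = 0
# Segment `row` of the idx array spans pos[row]:pos[row + 1].
for k in range(pos[row], pos[row + 1]):
    print("M({}, {}) = {}".format(row, idx[k], vals[k]))
# Prints: M(0, 0) = 6, M(0, 2) = 9, M(0, 3) = 8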
The below animation visualizes the format. Hover over any non-empty entry of the matrix on the left to see how the value is stored.
%run ./animation/animation_2.py
We can switch the order of the dimensions by passing in `[1, 0]` for the optional parameter `mode_ordering`. This results in a column-major (rather than row-major) format called compressed sparse column (CSC).
csc = pt.format([dense, compressed], [1, 0])
make_example_matrix(csc)
In this format, $d_2$ stores only its `size`, while $d_1$ has the `pos` and `idx` arrays.
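Working out the CSC-level arrays for $M$ by hand (column 0 holds rows 0 and 2, column 1 is empty, column 2 holds row 0, and column 3 holds rows 0 and 2), we would expect:

pos  = [0, 2, 2, 3, 5]
idx  = [0, 2, 0, 0, 2]
vals = [6, 5, 9, 8, 7]  # nonzeros of M in column-major order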
%run ./animation/animation_3.py
To more clearly visualize the compressed sparse fiber (CSF) format, where all dimensions are sparse, we move to three dimensions. The tensor $N$ defined below is a $2 \times 3 \times 4$ tensor.
As above, passing in `compressed` makes all of the dimensions compressed; for a three-dimensional tensor, this is equivalent to `pt.format([compressed, compressed, compressed])`.
N = pt.tensor([2, 3, 4], compressed)
# First layer.
N.insert([0, 0, 0], 6)
N.insert([0, 0, 2], 9)
N.insert([0, 0, 3], 8)
N.insert([0, 2, 0], 5)
N.insert([0, 2, 3], 7)
# Second layer.
N.insert([1, 0, 1], 2)
N.insert([1, 1, 0], 3)
N.insert([1, 1, 3], 1)
N.insert([1, 2, 2], 6)
N
This animation represents $N$ as two $3 \times 4$ matrices.
%run ./animation/animation_4.py
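Working out the CSF-level arrays for $N$ by hand, with one `pos`/`idx` pair per dimension (the names `pos1`, `idx1`, etc. below are just labels for exposition), we would expect:

# Dimension d1: both layers are nonempty.
pos1 = [0, 2];     idx1 = [0, 1]
# Dimension d2: layer 0 holds rows 0 and 2; layer 1 holds rows 0, 1, and 2.
pos2 = [0, 2, 5];  idx2 = [0, 2, 0, 1, 2]
# Dimension d3: one segment per (layer, row) pair above.
pos3 = [0, 3, 5, 6, 8, 9]
idx3 = [0, 2, 3, 0, 3, 1, 0, 3, 2]
vals = [6, 9, 8, 5, 7, 2, 3, 1, 6]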
We can also initialize tensors with NumPy arrays or SciPy sparse (CSR or CSC) matrices. Let's start by importing the packages we need.
import numpy as np
import scipy as sp
import scipy.sparse  # ensure the sparse submodule is loaded
Given a NumPy array such as the one randomly generated below, we can convert it to a TACO tensor using the `pytaco.from_array` function, which creates a dense tensor.
array = np.random.uniform(size=10)
tensor = pt.from_array(array)
tensor
We can also export TACO tensors to NumPy arrays.
tensor.to_array()
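As a quick sanity check, the round trip should reproduce the original data (a sketch using NumPy's `allclose`):

np.allclose(tensor.to_array(), array)  # expected: True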
Similarly, given a SciPy sparse matrix, we can convert it to a TACO tensor. For a CSR matrix like the one below, we use the `pytaco.from_sp_csr` function, which creates a CSR tensor.
size = 100
density = 0.1
sparse_matrix = sp.sparse.rand(size, size, density, format='csr')
A = pt.from_sp_csr(sparse_matrix)
And we can export the tensor to a SciPy sparse matrix.
A.to_sp_csr()
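Again we can sanity-check the round trip; a standard SciPy idiom for comparing sparse matrices is to check that the elementwise difference has no nonzeros:

(A.to_sp_csr() != sparse_matrix).nnz == 0  # expected: True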
The following example demonstrates how computations on tensors are performed using TACO.
Sparse matrix-vector multiplication (SpMV) is a bottleneck in many scientific and engineering applications. Mathematically, SpMV can be expressed as $$y = Ax + z,$$
where $A$ is a sparse matrix and $x$, $y$, and $z$ are dense vectors. The computation can also be expressed in index notation as
$$y_i = A_{ij} \cdot x_j + z_i.$$

Starting with the $A$ generated above, we can view its `shape` attribute, which we expect to be `[size, size]`:
A.shape
Examining the formula, we see that we need a vector $x$ whose length is the number of columns of $A$, and a vector $z$ whose length is the number of rows. We generate $x$ and $z$ randomly with NumPy.
x = pt.from_array(np.random.uniform(size=A.shape[1]))
z = pt.from_array(np.random.uniform(size=A.shape[0]))
We can express the result $y$ as a dense vector.
y = pt.tensor([A.shape[0]], dense)
The syntax for TACO computations closely mirrors index notation, with the caveat that we also have to explicitly declare the index variables beforehand:
i, j = pt.get_index_vars(2)
y[i] = A[i, j] * x[j] + z[i]
Once a tensor algebra computation has been defined, we can simply invoke the result tensor's `evaluate` method to perform the actual computation.
Under the hood, TACO will first invoke the result tensor's `compile` method to generate code that performs the computation. It will then perform the actual computation by invoking `assemble` to compute the sparsity structure of the result, followed by `compute` to compute the values of the result's nonzero elements.
y.evaluate()
If we define a computation and then access the result without first manually invoking `evaluate` or `compile`/`assemble`/`compute`, TACO will automatically perform the computation immediately before the result is accessed.
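For finer control, we could invoke the three steps ourselves instead of `evaluate`; a minimal sketch, assuming the methods behave as described above:

y.compile()   # generate code for the computation
y.assemble()  # compute the sparsity structure of y
y.compute()   # compute the values of y's nonzero elements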
Finally, we can display the result.
y
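As a final sanity check (a sketch, not part of the pipeline above), we can compare $y$ against an equivalent SciPy/NumPy computation:

expected = sparse_matrix.dot(x.to_array()) + z.to_array()
np.allclose(y.to_array(), expected)  # expected: True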