This tutorial explains the objects that represent state preparation, measurement, gate, and layer operations in pyGSTi. These objects form the essential components of Model objects in pyGSTi, and are therefore an important topic to understand if you're creating your own models or just need to know how to extract specific information from a Model. We use the term operator generically for all such objects, even when gate or layer operators act on vectorized density matrices and are therefore super-operators.
State preparations and POVM effects are represented as vectors in pyGSTi. For $n$ qubits, these can be either length-$2^n$ complex vectors representing pure states/projections or length-$4^n$ real vectors representing mixed states (in the Liouville picture, where we vectorize a $2^n\times 2^n$ density matrix into a column or row vector). Gate and layer operations are represented as linear maps on the space of state vectors. As such these can be viewed as $2^n\times 2^n$ complex matrices (in the pure-state case) or $4^n \times 4^n$ real matrices (in the mixed-state case).
State and effect vectors are subclasses of SPAMVec in pyGSTi. In both cases the vector is stored as a column vector, even though effect (co-)vectors are perhaps more properly row vectors (this improves code reuse). Gate and layer operator objects are subclasses of LinearOperator. Together SPAMVec and LinearOperator, which are both derived from ModelMember, form the base for all of pyGSTi's model components. All ModelMember objects have a dimension, given by their dim attribute, which for $n$ qubits is $2^n$ or $4^n$ (depending on whether pure or mixed state evolution is being considered).
Let's begin with some familiar imports.
import pygsti
import pygsti.objects as po
import numpy as np
Before getting into the pyGSTi objects, let's generate some example SPAM vectors and a gate matrix. These are just NumPy arrays. We use the stdmx_to_ppvec function to convert a standard $2^n \times 2^n$ complex Hermitian density matrix to a length-$4^n$ "SPAM vector" of real numbers giving the decomposition of this density matrix in the Pauli basis. The gate_mx array describes how a 1-qubit $X(\pi/2)$ rotation transforms a state vector in the Pauli basis.
gate_mx = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, -1],
                    [0, 0, 1, 0]],'d')
density_mx0 = np.array([[1, 0],
                        [0, 0]], complex)
density_mx1 = np.array([[0, 0],
                        [0, 1]], complex)
spam_vec0 = pygsti.tools.stdmx_to_ppvec(density_mx0)
spam_vec1 = pygsti.tools.stdmx_to_ppvec(density_mx1)
print(spam_vec0) # just a numpy column vector
print(spam_vec0.dtype) # of *real* numbers
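As a sanity check on what stdmx_to_ppvec produces, here is a self-contained NumPy sketch (independent of pyGSTi) that performs the same Pauli-basis decomposition by hand, using the normalized Pauli matrices $B_i = P_i/\sqrt{2}$:

```python
import numpy as np

# Normalized Pauli basis B_i = P_i / sqrt(2), so that Tr(B_i B_j) = delta_ij
paulis = [np.eye(2), np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]])]
basis = [P / np.sqrt(2) for P in paulis]

rho0 = np.array([[1, 0], [0, 0]], complex)  # |0><0|

# Pauli-basis ("pp") vectorization: v_i = Tr(B_i^dagger rho)
vec = np.array([np.trace(B.conj().T @ rho0).real for B in basis])
print(vec)  # matches spam_vec0 above: [1/sqrt(2), 0, 0, 1/sqrt(2)]
```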
The simplest kinds of operators look very similar to NumPy arrays (like those we created above) in which some of the elements are read-only. These operators derive from DenseOperator or DenseSPAMVec and hold a dense representation, meaning that a dense vector or matrix is stored in memory. SPAM, gate, and layer operators have parameters which describe how they can be varied - essentially the "knobs" you can turn. Model objects also have parameters that are essentially inherited from their contained operators. How an operator is parameterized is particularly relevant for protocols which optimize a Model over its parameter space (e.g. Gate Set Tomography). Three common parameterizations are:

- static: the operator has no parameters; its elements are fixed and cannot be varied.
- full: every element of the dense vector or matrix is an independent real parameter.
- TP (trace-preserving): like full, except the elements fixed by trace preservation (e.g. the first row of a gate matrix, or the first element of a state-preparation vector) are not parameters.
Here's a 1-qubit example of creating dense-operator objects:
#Prep vectors
tpSV = po.TPSPAMVec(spam_vec0)
staticSV = po.StaticSPAMVec(spam_vec0, typ="prep")
#Operations
staticOp = po.StaticDenseOp(gate_mx)
fullOp = po.FullDenseOp(gate_mx)
tpOp = po.TPDenseOp(gate_mx)
#Effect vectors
staticEV = po.StaticSPAMVec(spam_vec0, typ="effect")
fullEV = po.FullSPAMVec(spam_vec0, typ="effect")
#POVMs
povm = po.UnconstrainedPOVM( {'outcomeA': staticEV, 'outcomeB':fullEV} )
tppovm = po.TPPOVM( {'0': spam_vec0, '1': spam_vec1} )
for op in (tpSV,staticSV,staticOp,fullOp,tpOp,staticEV,fullEV,povm,tppovm):
    print("%s object has %d parameters" % (str(type(op)), op.num_params()))
Although there are certain exceptions, the usual way you set the value of a SPAMVec or LinearOperator object is by setting the values of its parameters. Parameters must be real-valued and are typically allowed to range over all real numbers, so updating an operator's parameter values is accomplished by passing a real-valued NumPy array of parameter values - a parameter vector - to the operator's from_vector method. Note that the length of the parameter vector must match the operator's number of parameters (returned by num_params, as demonstrated above).
We'll now set new parameter values for several of the operators we created above. Since for dense operators there's a direct correspondence between parameters and matrix or vector elements, the parameter vector may be a flattened version of a 2d array of the parameterized element values.
new_vec = np.array([1/np.sqrt(2),0,0],'d')
tpSV.from_vector(new_vec)
print("params = ",tpSV.to_vector())
print(tpSV)
new_mx = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0,-0.9],
                   [0, 0, 0.9, 0]],'d')
fullOp.from_vector(new_mx.flatten())
print("params = ",fullOp.to_vector())
print(fullOp)
new_mx = np.array([[0, 1, 0, 0],
                   [0, 0, 0,-0.9],
                   [0, 0, 0.9, 0]],'d')
tpOp.from_vector(new_mx.flatten())
print("params = ",tpOp.to_vector())
print(tpOp)
new_vec = np.array([1/np.sqrt(2),1/np.sqrt(2),0,0],'d')
fullEV.from_vector(new_vec)
print("params = ",fullEV.to_vector())
print(fullEV)
new_effect = np.array([1/np.sqrt(2),0.9*1/np.sqrt(2),0,0],'d')
tppovm.from_vector(new_effect)
print("params = ",tppovm.to_vector())
print(tppovm)
That a gate or layer operation is completely positive and trace-preserving (CPTP) can be guaranteed if the operation is given by $\hat{O} = \exp{\mathcal{L}}$ where $\mathcal{L}$ takes the Lindblad form: $$\mathcal{L}: \rho \rightarrow \sum_i -i\lambda_i[B_i,\rho] + \sum_{ij} \eta_{ij} \left( B_i \rho B_j^\dagger - \frac{1}{2}\left\{ B_i^\dagger B_j, \rho \right\} \right) $$ where the $B_i$ range over the non-identity elements of the ($n$-qubit) Pauli basis, the $\lambda_i$ are real, and $\eta \ge 0$ (i.e. the matrix $\eta_{ij}$ is Hermitian and positive semidefinite). We call the $\lambda_i$ terms Hamiltonian error terms, and the (real) $\lambda_i$s the error rates or error coefficients. Likewise, the $\eta_{ij}$ terms are referred to generically as non-Hamiltonian error terms. In the special case where the $\eta$ matrix is diagonal, the terms are called Pauli-stochastic error terms and the (real) $\eta_{ii} \ge 0$ are error rates. Technical note: While all maps of the above form ($\hat{O}$) are CPTP, not all CPTP maps are of this form; $\hat{O}$ is the form of all infinitesimally-generated CPTP maps.
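As a quick numerical illustration that generators of this form are trace-preserving, the sketch below (plain NumPy, using the unnormalized Pauli $X$ for simplicity, so $B_i^\dagger B_i = I$ and the anticommutator term reduces to $-\rho$) builds the superoperator for $\mathcal{L}: \rho \rightarrow X\rho X - \rho$ and checks that $\langle\langle I |\mathcal{L} = 0$:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], complex)

# Column-stacking vectorization: vec(A @ rho @ B) = (B.T kron A) @ vec(rho),
# so the superoperator of rho -> X rho X - rho is:
L = np.kron(X.T, X) - np.eye(4)

# Trace preservation: Tr(L(rho)) = 0 for every rho, i.e. <<I| L = 0
vec_I = np.eye(2).flatten(order='F')  # vectorized identity, [1, 0, 0, 1]
print(np.allclose(vec_I @ L, 0))  # True
```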
The LindbladOp class represents an operation $e^{\mathcal{L}} U_0$, where $U_0$ is a unitary (super-)operator and $\mathcal{L}$ takes the Lindblad form given above. LindbladOp objects contain a LindbladErrorgen object that encapsulates the Lindbladian exponent $\mathcal{L}$. Lindblad operators are among the most complicated of all the operators in pyGSTi, so bear with us as we try to present things in an organized and comprehensible way.
Let's start by making a LindbladOp from a dense gate matrix:
cptpOp = po.LindbladOp.from_operation_matrix(gate_mx)
A LindbladOp does not hold a dense representation of its process matrix (it's not a DenseOperator), and so you cannot access it like a NumPy array. If you want a dense representation, you can either use a LindbladDenseOp, which is essentially a dense version of a LindbladOp, or you can call the todense() method (which works on dense operators too!):
print(cptpOp)
print("dense representation = ")
pygsti.tools.print_mx(cptpOp.todense()) # see this equals `gate_mx`
Now let's look at the parameters of cptpOp. By default, the $\mathcal{L}$ of a LindbladErrorgen is parameterized such that $\eta \ge 0$ and the LindbladOp map is CPTP. There are several other ways $\mathcal{L}$ can be parameterized, and these are specified by the values of the nonham_mode and param_mode arguments of construction functions like the from_operation_matrix we used above. Here's a quick rundown of these options:
The value of nonham_mode dictates which elements of $\eta$ are allowed to be nonzero. Allowed values are:

- "all": all the $\eta_{ij}$ elements can vary, possibly with constraints (see the value of param_mode).
- "diagonal": $\eta$ is diagonal; $\eta_{ij} = 0$ when $i \ne j$.
- "diag_affine": a special mode where we keep track of the diagonal elements of $\eta$ and a set of affine-error coefficients.

The value of param_mode determines how the nonzero parts of $\eta$ relate to the parameters of the LindbladErrorgen object. Allowed values are:

- "cptp": the default, which constrains $\mathcal{L}$ so that $e^{\mathcal{L}}$ is guaranteed to be CPTP. When nonham_mode is set to "all", the parameters consist of 1) the $\lambda_i$ and 2) the real and imaginary parts of $M$, the Cholesky decomposition of $\eta$ ($\eta = M M^\dagger$). The reason we can't just use the real and imaginary parts of $\eta_{ij}$ as parameters is that varying them (without bounds, as many optimizers require) would not constrain $\eta \ge 0$. If nonham_mode is "diagonal" or "diag_affine" then the mapping is simpler: each $\eta_{ii}$ is the square of a parameter (and the affine coefficients, in the "diag_affine" case, are unconstrained).
- "depol": works only when nonham_mode is "diagonal" or "diag_affine", and associates all of the $\eta_{ii}$ with a single parameter whose square is the common depolarization rate.
- "unconstrained": the parameters are exactly the $\lambda_i$ and $\eta_{ij}$ (separated into real and imaginary parts for off-diagonal $\eta_{ij}$). This places no constraints on the positivity of $\eta$.

If that didn't all make sense, don't worry. We'll be working with just the default case, where nonham_mode = "all" and param_mode = "cptp". This gives our single-qubit cptpOp operator $12$ parameters: $3$ "Hamiltonian" (X, Y, and Z) + $9$ "non-Hamiltonian" (the real and imaginary parts of the lower-triangular $3 \times 3$ matrix $M$).
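To see why the Cholesky-factor trick makes "cptp" compatible with unconstrained optimizers, here is a small NumPy sketch (illustrative only, not pyGSTi's actual implementation) showing that any assignment of the 12 real parameters yields a valid $\eta \ge 0$:

```python
import numpy as np

rng = np.random.default_rng(0)

# 12 unconstrained real parameters for one qubit: 3 Hamiltonian coefficients
# plus the 9 real/imaginary parts of a lower-triangular 3x3 matrix M
ham_coeffs = rng.normal(size=3)
M = np.tril(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)), -1)
M += np.diag(rng.normal(size=3))  # real diagonal, as in a Cholesky factor

eta = M @ M.conj().T  # Hermitian and positive semidefinite by construction
print(np.all(np.linalg.eigvalsh(eta) >= -1e-12))  # True, for any parameter values
```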
Let's get these parameters using cptpOp.to_vector() and print them:
print("params (%d) = " % cptpOp.num_params(),cptpOp.to_vector(),'\n')
All 12 parameters are essentially 0 because gate_mx represents a (unitary) $X(\pi/2)$ rotation and $U_0$ is automatically set to this unitary, so that $\exp\mathcal{L} = \mathbb{1}$. This means that all the error coefficients are zero, and this translates into all the parameters being zero. Note, however, that error coefficients are not always the same as parameters. The get_errgen_coeffs method retrieves the error coefficients, which is often more useful than the raw parameter values:
import pprint
coeff_dict, basis = cptpOp.get_errgen_coeffs(return_basis=True)
print("Coefficients in (<type>,<basis_labels>) : value form:"); pprint.pprint(coeff_dict)
print("\nBasis containing elements:"); pprint.pprint(basis.labels)
get_errgen_coeffs returns a dictionary and a basis (if return_basis=True). The dictionary maps a shorthand description of each error term to the value of the term's coefficient (rate). This shorthand description is a tuple starting with "H", "S", or "A" to indicate the type of error term: Hamiltonian, non-Hamiltonian/stochastic, or affine. Additional elements in the tuple are basis-element labels (often strings of I, X, Y, and Z), which reference basis matrices in the Basis that is returned when return_basis=True. Hamiltonian ("H") errors are described by a single basis element (the single index of $\lambda_i$). The non-Hamiltonian ("S") errors in the Lindblad expansion can be described either by a single basis element (indicating a diagonal element $\eta_{ii}$, also referred to as a stochastic error rate) or by two basis elements (the two indices of $\eta_{ij}$). The affine ("A") errors correspond to particular linear combinations of the non-Hamiltonian errors, and can only be used in conjunction with diagonal non-Hamiltonian errors.
If we write the above expansion for $\mathcal{L}$ compactly as $\mathcal{L} = \sum_i \lambda_i F_i + \sum_{ij} \eta_{ij}F_{ij}$, then the elements of the error-coefficient dictionary returned from get_errgen_coeffs correspond to:

- ("H","X")-type elements specify $\lambda_i$, the coefficients of $$F_i : \rho \rightarrow -i[B_i,\rho]$$
- ("S","X")-type elements specify $\eta_{ii}$, the coefficients of $$F_{ii} : \rho \rightarrow B_i \rho B_i - \rho$$
- ("S","X","Y")-type elements specify $\eta_{ij}$, the potentially complex coefficients of $$F_{ij} : \rho \rightarrow B_i \rho B_j^\dagger - \frac{1}{2}\left\{ B_i^\dagger B_j, \rho \right\}$$
- ("A","X")-type elements specify the coefficients of $$A_i : \rho \rightarrow \mathrm{Tr}(\rho_{target})B_i \otimes \rho_{non-target}.$$ Here the target vs. non-target parts of $\rho$ refer to the qubits on which $B_i$ is nontrivial and trivial, respectively (e.g. in ("A","IXI") the second qubit is the "target" space and qubits 1 and 3 are the "non-target" space). This $A_i$ can be written as a linear combination of the $F_{ij}$: because the map $\rho \rightarrow I$ can be written (up to normalization) as $\rho \rightarrow \frac{1}{d^2} \sum_i B_i \rho B_i$, where the sum runs over all the Paulis (including the identity), a map like $\rho \rightarrow B_k$ can be written as $\rho \rightarrow \frac{1}{d^2} \sum_i B_i B_k \rho B_i B_k$, which can be expressed as a sum of the $F_{ij}$ terms.

The set_errgen_coeffs function does the reverse of get_errgen_coeffs: it sets the values of a LindbladOp's parameters based on a description of the errors in terms of the above-described dictionary and a Basis object that can resolve the basis labels.
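The completely-depolarizing identity used in the affine-error discussion is easy to check numerically; with the normalization used here, the uniform Pauli twirl sends any $\rho$ to $\mathrm{Tr}(\rho)\,I/d$:

```python
import numpy as np

paulis = [np.eye(2), np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]])]
d = 2

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])  # an arbitrary state
twirled = sum(P @ rho @ P.conj().T for P in paulis) / d**2

# The uniform Pauli twirl completely depolarizes rho:
print(np.allclose(twirled, np.trace(rho) * np.eye(2) / d))  # True
```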
Finally, we can also initialize a LindbladErrorgen using a dictionary in this format. Below we construct a LindbladErrorgen with
$$\mathcal{L} = 0.1 H_X + 0.1 S_X$$
where $H_X: \rho \rightarrow -i[X,\rho]$ and $S_X: \rho \rightarrow X\rho X - \rho$ are Hamiltonian and Pauli-stochastic errors, respectively. We then use this error generator to initialize a LindbladOp corresponding to $e^{\mathcal{L}}U_0$, where $U_0$ is an $X(\pi/2)$ rotation.
sigmax = 1/np.sqrt(2) * np.array([[0, 1],
                                  [1, 0]], 'd')
errorgen = po.LindbladErrorgen(dim=4,
                               Ltermdict={('H','X'): 0.1, ('S','X','X'): 0.1},
                               basis=pygsti.obj.Basis.cast('pp',4))
cptpOp2 = po.LindbladOp(staticOp, errorgen)
print(cptpOp2)
pygsti.tools.print_mx(cptpOp2.todense())
We can check that this LindbladOp has the right error-generator coefficients. This time we do things slightly differently, by accessing the errorgen member of the LindbladOp:
cptpOp2.errorgen.get_coeffs() # same as cptpOp2.get_errgen_coeffs()
An inconvenience arises because an error generator is exponentiated to form a map. This means that the coefficients of the stochastic generators don't exactly correspond to error rates of the final map as they're often thought of. Take a simple example where we want to construct a depolarizing map with a process fidelity of 90%. One might think that this would achieve that:
test_depol_op = po.LindbladOp(po.StaticDenseOp(np.identity(4)),
                              po.LindbladErrorgen(dim=4,
                                                  Ltermdict={('S','X'): 0.1/3, ('S','Y'): 0.1/3, ('S','Z'): 0.1/3},
                                                  basis=pygsti.obj.Basis.cast('pp',4), nonham_mode="diagonal"))
pygsti.tools.entanglement_fidelity(test_depol_op.todense(), np.identity(4))
But as we see, the fidelity isn't quite $0.9$. This is because the error rate of a map whose error generator has all stochastic-term coefficients equal to $C$ is $(1 - e^{-d^2 C}) / d^2$, not just $C$. This transformation is accounted for in the get_error_rates and set_error_rates Lindblad-operator methods, which behave just like get_errgen_coeffs and set_errgen_coeffs except that they internally transform the non-Hamiltonian values between coefficients and (map) error rates. Our simple example is fixed by using set_error_rates:
test_depol_op.set_error_rates({('S','X'): 0.1/3, ('S','Y'): 0.1/3, ('S','Z'): 0.1/3})
pygsti.tools.entanglement_fidelity(test_depol_op.todense(), np.identity(4))
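The coefficient-to-rate transformation quoted above is easy to apply by hand. Here is a small helper sketch (the function names are ours, not pyGSTi's) implementing $r = (1 - e^{-d^2 C})/d^2$ and its inverse:

```python
import numpy as np

def rate_from_coeff(C, d2=4):
    """Error rate of the exponentiated map, given generator coefficient C (d2 = d^2)."""
    return (1 - np.exp(-d2 * C)) / d2

def coeff_from_rate(r, d2=4):
    """Inverse: generator coefficient needed to realize map error rate r."""
    return -np.log(1 - d2 * r) / d2

C = 0.1 / 3
print(rate_from_coeff(C))                                   # slightly below 0.1/3
print(np.isclose(coeff_from_rate(rate_from_coeff(C)), C))   # True: round trip recovers C
```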
It is possible to create "Lindblad" state preparations and POVMs using the LindbladSPAMVec and LindbladPOVM classes. These simply compose an $e^{\mathcal{L}}$ factor (from a LindbladErrorgen or LindbladOp) with an existing "base" state preparation or POVM. That is, state preparations are $e^{\mathcal{L}} |\rho_0\rangle\rangle$, where $|\rho_0\rangle\rangle$ represents a "base" pure state, and effect vectors are $\langle\langle E_i | e^{\mathcal{L}}$ where $\langle\langle E_i|$ are the effects of a "base" POVM.
#Spam vectors and POVM
cptpSpamVec = po.LindbladSPAMVec(staticSV, errorgen, 'prep') # staticSV is the "base" state preparation
cptpPOVM = po.LindbladPOVM(po.LindbladOp(None, errorgen)) # by default uses the computational-basis POVM
PyGSTi makes it possible to build up "large" (e.g. complex or many-qubit) operators from other "smaller" ones. We have already seen a modest example of this above, when a LindbladOp was constructed from a LindbladErrorgen object. Two common and useful actions for building large operators are:
Composition: A ComposedOp composes zero or more other operators, so its action is the sequential application of each of its factors. The dense representation of a ComposedOp is equal to the product (in reversed order!) of the dense representations of its factors. Note that a ComposedOp does not, however, require that its factors have dense representations - they can be any LinearOperator objects. Note, finally, that there exists a dense version of ComposedOp called ComposedDenseOp. The dense versions of operators can sometimes result in faster calculations when the system size (qubit number) is small.
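To make the reversed-order point concrete, here is a NumPy sketch composing two 1-qubit superoperators (the $X(\pi/2)$ matrix from earlier, and a $Z(\pi/2)$ rotation we write down here purely for illustration):

```python
import numpy as np

# Pauli-basis superoperators for X(pi/2) (from earlier) and Z(pi/2) rotations
x_half = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, -1],
                   [0, 0, 1, 0]], 'd')
z_half = np.array([[1, 0, 0, 0],
                   [0, 0, -1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 1]], 'd')

# "Apply x_half first, then z_half" is the *reversed-order* matrix product:
composed = z_half @ x_half

print(np.allclose(composed, x_half @ z_half))  # False: the order matters
```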
Embedding: An EmbeddedOp maps an operator on a subsystem of a state space to the full state space. For example, it could take a 1-qubit $X(\pi/2)$ rotation and make a 3-qubit operation in which this operation is applied to the 2nd qubit. Embedded operators are very useful for constructing layer operations in multi-qubit models, where we naturally prefer to work with the lower-dimensional (typically 1- and 2-qubit) operations and need to build up $n$-qubit layer operations.
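For dense Pauli-basis superoperators, this kind of embedding can be sketched directly with Kronecker products, since the $n$-qubit Pauli basis is a tensor product of 1-qubit Pauli bases (assuming the convention that Q0 is the leftmost tensor factor):

```python
import numpy as np

gate_mx = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, -1],
                    [0, 0, 1, 0]], 'd')  # 1-qubit X(pi/2) superoperator

# Embed on the middle qubit of three: identity superoperators on Q0 and Q2
embedded = np.kron(np.eye(4), np.kron(gate_mx, np.eye(4)))

print(embedded.shape)  # (64, 64) -- a 3-qubit superoperator
print(np.allclose(np.linalg.matrix_power(embedded, 4), np.eye(64)))  # True: X(pi/2)^4 = identity
```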
We'll begin by creating an operation that composes several of the dense operations we made earlier:
composedOp = po.ComposedOp((staticOp,tpOp,fullOp))
print(composedOp)
print("Before interacting w/Model:",composedOp.num_params(),"params")
This all looks good, except we may have expected that composedOp.num_params() would be $0+12+16=28$ (the sum of the parameter counts of the factors). What's going on? To get the ComposedOp to "realize" how many parameters it has, we need to add it to a Model and call the model's .num_params() method to refresh the model's number of parameters.
Technical note: Model objects are actually seen as owning the parameters, and more advanced operators like ComposedOp and EmbeddedOp must interact with a model before their parameters are allocated.
We'll create a dummy Model object named mdl, and after we add composedOp to mdl and call mdl.num_params(), things work like we expect them to:
mdl = po.ExplicitOpModel(['Q0']) # create a single-qubit model
mdl.operations['Gcomposed'] = composedOp
mdl.num_params()
print("After: interacting w/Model:",composedOp.num_params(),"params")
for i,op in enumerate(composedOp.factorops):
    print("Factor %d (%s) has %d params" % (i,op.__class__.__name__,op.num_params()))
Here's how to embed a single-qubit operator (fullOp, created above) into a 3-qubit state space, so that fullOp acts on the second qubit (labelled "Q1"). Note that the parameters of an EmbeddedOp are just those of the underlying operator (the one that has been embedded).
embeddedOp = po.EmbeddedOp(['Q0','Q1','Q2'],['Q1'],fullOp)
print(embeddedOp)
print("Dimension =",embeddedOp.dim, "(%d qubits!)" % (np.log2(embeddedOp.dim)/2))
print("Number of parameters =",embeddedOp.num_params())
We can design even more complex operations using combinations of composed and embedded objects. For example, here's a 3-qubit operation that performs three separate 1-qubit operations (staticOp, fullOp, and tpOp) on each of the three qubits. (These three operations happen to all be $X(\pi/2)$ gates because we were lazy and didn't bother to use different gate_mx values in our examples above, but they could be entirely different.) The resulting combinedOp might represent a layer in which all three gates occur simultaneously.
# use together
mdl_3Q = po.ExplicitOpModel(['Q0','Q1','Q2'])
combinedOp = po.ComposedOp( (po.EmbeddedOp(['Q0','Q1','Q2'],['Q0'],staticOp),
                             po.EmbeddedOp(['Q0','Q1','Q2'],['Q1'],fullOp),
                             po.EmbeddedOp(['Q0','Q1','Q2'],['Q2'],tpOp)) )
mdl_3Q.operations[(('Gstatic','Q0'),('Gfull','Q1'),('Gtp','Q2'))] = combinedOp
mdl_3Q.num_params() # to recompute & allocate the model's parameters
print(combinedOp)
print("Number of parameters =",combinedOp.num_params())
While this tutorial covers the main ones, there are even more model-building-related objects that we haven't had time to cover here. We plan to make this tutorial more comprehensive in future versions of pyGSTi.