This tutorial covers methods of analysing data without running GST. So far there is only one such method, called "Data Set Comparison", which checks for consistency between two (or more) datasets.
This method declares two or more DataSets to be "inconsistent" if the observed counts for the same operation sequences across the data sets are implausible under the hypothesis that they were generated by the same underlying model. This protocol can be used to test for, among other things, drift and crosstalk. It can also be used to compare an experimental dataset to an "ideal" dataset. The methods in this tutorial are presented in "Probing context-dependent errors in quantum processors" by Rudinger et al.
Here we demonstrate the tool on simulated data from models with Markovian errors, but the protocol can be used regardless of the underlying error type.
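Before running the pyGSTi tool, it may help to see the core idea in isolation. The per-sequence check is a two-sample log-likelihood-ratio (LLR) test on the outcome counts for one circuit; the sketch below (a minimal standalone version using only numpy/scipy, not pyGSTi's own implementation) shows the statistic for a single circuit's counts.

```python
import numpy as np
from scipy.stats import chi2

def llr_pvalue(counts_a, counts_b):
    """p-value of a two-sample log-likelihood-ratio test that two
    outcome-count vectors (for the same circuit) share one underlying
    probability distribution."""
    counts_a = np.asarray(counts_a, dtype=float)
    counts_b = np.asarray(counts_b, dtype=float)

    def loglike(counts):
        # Maximized multinomial log-likelihood, with 0*log(0) := 0.
        nz = counts > 0
        return np.sum(counts[nz] * np.log(counts[nz] / counts.sum()))

    # 2*[llh(separate fits) - llh(pooled fit)] is asymptotically
    # chi^2-distributed with (num_datasets-1)*(num_outcomes-1) dof.
    llr = 2 * (loglike(counts_a) + loglike(counts_b)
               - loglike(counts_a + counts_b))
    dof = (2 - 1) * (len(counts_a) - 1)
    return chi2.sf(llr, dof)
```

For example, `llr_pvalue([60, 40], [40, 60])` is well below 0.05 (the two samples look different), while identical counts give a p-value of 1.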
from __future__ import division, print_function
import pygsti
import numpy as np
import scipy
from scipy import stats
from pygsti.construction import std1Q_XYI
Let's first compare two DataSet objects whose underlying models are the same. The data sets we'll use will be GST datasets (which allows us to do some nice visualization), but arbitrary datasets work in general, provided that the operation sequences across the datasets are the same.
#Let's make our underlying model have a little bit of random unitary noise.
mdl_exp_0 = std1Q_XYI.target_model()
mdl_exp_0 = mdl_exp_0.randomize_with_unitary(.01,seed=0)
germs = std1Q_XYI.germs
fiducials = std1Q_XYI.fiducials
max_lengths = [1,2,4,8,16,32,64,128,256]
gate_sequences = pygsti.construction.make_lsgst_experiment_list(std1Q_XYI.gates,fiducials,fiducials,germs,max_lengths)
#Generate the data for the two datasets, using the same model, with 100 repetitions of each sequence.
N=100
DS_0 = pygsti.construction.generate_fake_data(mdl_exp_0,gate_sequences,N,'binomial',seed=10)
DS_1 = pygsti.construction.generate_fake_data(mdl_exp_0,gate_sequences,N,'binomial',seed=20)
#Let's compare the two datasets.
comparator_0_1 = pygsti.objects.DataComparator([DS_0,DS_1])
comparator_0_1.implement(significance=0.05)
Statistical hypothesis tests did NOT find inconsistency between the datasets at 5.00% significance.
#Create a workspace to show plots
w = pygsti.report.Workspace()
w.init_notebook_mode(connected=False, autodisplay=True)
#As we expect, the datasets are consistent!
#We can also visualize this in a few ways:
#This will show a histogram of the p-values associated with the different strings.
#If the null hypothesis (that the underlying models are the same) is true,
#then we expect the distribution to roughly follow the dotted green line.
w.DatasetComparisonHistogramPlot(comparator_0_1, log=True, display='pvalue')
<pygsti.report.workspaceplots.DatasetComparisonHistogramPlot at 0x11b0a99e8>
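The "roughly uniform under the null" behavior of the p-values can be checked with a quick standalone simulation (this sketch uses scipy's generic contingency-table chi-squared test as a stand-in for pyGSTi's per-sequence test):

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.RandomState(0)
N, p_true, trials = 100, 0.3, 2000

# Draw pairs of binomial counts from the SAME underlying probability
# (the null hypothesis) and test each pair for consistency.
pvals = []
for _ in range(trials):
    a = rng.binomial(N, p_true)
    b = rng.binomial(N, p_true)
    table = [[a, N - a], [b, N - b]]
    pvals.append(chi2_contingency(table, correction=False)[1])
pvals = np.array(pvals)

# Under the null the p-values are approximately uniform on [0, 1]:
# their mean is near 0.5 and only ~5% fall below 0.05.
```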
#Color box plot comparing two datasets from same model
gssList = pygsti.construction.make_lsgst_structs(std1Q_XYI.gates, fiducials, fiducials, germs, max_lengths)
w.ColorBoxPlot('dscmp', gssList[-1], None, None, dscomparator=comparator_0_1)
#A lack of green boxes indicates consistency between datasets!
<pygsti.report.workspaceplots.ColorBoxPlot at 0x11c928978>
#Now let's generate data from a similar but not identical model and see if our tests can detect the difference.
mdl_exp_1 = std1Q_XYI.target_model()
mdl_exp_1 = mdl_exp_1.randomize_with_unitary(.01,seed=1)
DS_2 = pygsti.construction.generate_fake_data(mdl_exp_1,gate_sequences,N,'binomial',seed=30)
#Let's make the comparator and get the report.
comparator_1_2 = pygsti.objects.DataComparator([DS_1,DS_2])
comparator_1_2.implement(significance=0.05)
The datasets are INCONSISTENT at 5.00% significance.
 - Details:
 - The aggregate log-likelihood ratio test is significant at 629.10 standard deviations.
 - The aggregate log-likelihood ratio test standard deviations signficance threshold is 1.98
 - The number of sequences with data that is inconsistent is 718
 - The maximum SSTVD over all sequences is 1.00
 - The maximum SSTVD was observed for Qubit * ---|Gx|-|Gx|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|---
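The report expresses the aggregate LLR as a number of standard deviations. One standard way to produce such a figure (an assumption about the convention, not a confirmed reading of pyGSTi's source) is to standardize the chi-squared statistic by its mean and variance: a chi-squared variable with k degrees of freedom has mean k and variance 2k.

```python
import numpy as np

def llr_to_nsigma(llr, dof):
    # Express an aggregate chi^2-distributed LLR as the number of
    # standard deviations it sits above the null-hypothesis mean:
    # chi^2_k has mean k and variance 2k.
    return (llr - dof) / np.sqrt(2 * dof)
```

An LLR equal to its expected value maps to 0 standard deviations; larger values grow roughly linearly in the excess.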
#The datasets are significantly inconsistent! Let's see what the distribution of p-values looks like now:
w.DatasetComparisonHistogramPlot(comparator_1_2)
<pygsti.report.workspaceplots.DatasetComparisonHistogramPlot at 0x11b0a99b0>
w.ColorBoxPlot('dscmp', gssList[-1], None, None, dscomparator=comparator_1_2)
#The colored boxes indicate inconsistency between datasets!
<pygsti.report.workspaceplots.ColorBoxPlot at 0x11cfeb5f8>
If you'd like to extract the various quantities calculated by the DataComparator, use its .get_ methods (e.g., .get_JSD). There are methods for extracting all of the quantities discussed in "Probing context-dependent errors in quantum processors" by Rudinger et al. For example, below we extract the Jensen-Shannon divergence for a particular circuit:
opstr = list(DS_1.keys())[2000]
print(opstr.display_str(80))
print(comparator_1_2.get_JSD(opstr))
Qubit * ---|Gx|-|Gx|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|
 >>> -|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|
 >>> -|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|
 >>> -|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|
 >>> -|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gi|-|Gi|-|Gx|-|Gx|-|Gx|---
0.19430446226785072
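For reference, the Jensen-Shannon divergence between the two empirical outcome distributions for a circuit can also be computed by hand. The sketch below uses natural-log entropies via scipy (the base-e convention is an assumption here and may differ from pyGSTi's):

```python
import numpy as np
from scipy.stats import entropy

def jsd(counts_a, counts_b):
    """Jensen-Shannon divergence between two empirical outcome
    distributions, given raw outcome counts for the same circuit."""
    p = np.asarray(counts_a, dtype=float)
    q = np.asarray(counts_b, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    # JSD = H(m) - (H(p) + H(q)) / 2, with H the Shannon entropy.
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))
```

Identical count vectors give a divergence of 0, and completely disjoint outcomes give the maximum value of ln 2 (in nats).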