https://www.datacamp.com/community/blog/jupyter-notebook-r#gs.CkjgxoE
NB: conda install -c r rpy2=2.8.6 -f
rpy2 needs to be pinned to this specific version to work in the IPython notebook.
The following cell runs the ABC cross-validation directly in R; it's useful for debugging.
%%R -i PRIOR,prior_args
## Cross-validate ABC rejection sampling directly in R.
## NOTE(review): cv4abc's first argument is `param` and the second is `sumstat`;
## here the summary-stat columns (prior_args) are passed first and column 7 of
## PRIOR second, so each eq.* column is treated as a parameter — confirm intended.
cv.res.reg <- cv4abc(data.frame(eq=PRIOR[prior_args]), PRIOR[,7], nval=2, tols=c(.5), method="rejection")
print(summary(cv.res.reg))
#plot(cv.res.reg)
print(cv.res.reg)
%matplotlib inline
%load_ext rpy2.ipython
import os
GIMME_DIR = "/home/isaac/gimmeSAD/"
os.chdir(GIMME_DIR)
import gimmeSAD
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ipyparallel as ipp
import itertools
import random
import glob
import scipy
import rpy2.interactive as r
import rpy2.interactive.packages
import datetime
import time
from ipyrad.assemble.util import progressbar
from sklearn.metrics import mean_squared_error
from math import sqrt
from scipy.stats import linregress
## Working directory for all simulation inputs/outputs
GIMME_DIR = "/home/isaac/gimmeSAD/"
os.chdir(GIMME_DIR)
## Directories holding the 1-dimensional and 2-dimensional simulation outputs
ONED_DIR = GIMME_DIR + "1d_sims/"
TWOD_DIR = GIMME_DIR + "2d_sims/"
## Set up the link to ipcluster and create a loadbalanced view
## Need to run this externally, and wait for a little bit.
## `ipcluster start -n 40 --cluster-id gimmeCV`
ipyclient = ipp.Client(cluster_id="gimmeCV")
print(len(ipyclient), 'cores')
thview = ipyclient.load_balanced_view()
The rpy2.ipython extension is already loaded. To reload it, use: %reload_ext rpy2.ipython (40, 'cores')
## A function to return RMSE and R
def get_stats(tr, est):
    """Return (RMSE, Pearson's r) for true values vs. ABC estimates.

    Parameters
    ----------
    tr : array-like
        True parameter values (any shape; flattened internally).
    est : array-like
        Estimated parameter values, same length as the flattened ``tr``.

    Returns
    -------
    tuple of (float, float)
        Root mean squared error and Pearson correlation coefficient.
    """
    tr = np.asarray(tr, dtype=float).flatten()
    ## Flatten est too so both inputs tolerate nested/2-d shapes
    est = np.asarray(est, dtype=float).flatten()
    ## MSE computed directly with numpy — identical to
    ## sklearn.metrics.mean_squared_error, without the extra dependency
    rms = sqrt(np.mean((tr - est) ** 2))
    ## Off-diagonal entry of the 2x2 correlation matrix is Pearson's r
    r = np.corrcoef(np.vstack((tr, est)))[1, 0]
    return rms, r
def call_abc(REF_TABLE, model, param, model_args, NCV=10):
    """Run cross-validation ABC (rejection method) for one model/param pair.

    Parameters
    ----------
    REF_TABLE : str
        Path to a tab-separated reference table of priors + summary stats.
    model : str
        Model key used to select the summary-statistic columns.
    param : str
        Column name of the parameter to estimate.
    model_args : dict
        Mapping of model name -> list of summary-statistic column names.
    NCV : int
        Number of cross-validation replicates (``nval`` in cv4abc).

    Returns
    -------
    An rpy2 object wrapping the R ``cv4abc`` result.
    """
    ## Imports are inside the function so it can be shipped to remote
    ## ipyparallel engines, which don't share this module's namespace.
    import pandas as pd
    import rpy2.interactive as r
    import rpy2.interactive.packages
    from rpy2 import robjects
    from rpy2.robjects import pandas2ri
    ## Enable automatic pandas <-> R data.frame conversion before calling R
    pandas2ri.activate()
    r.packages.importr("abc")
    abc = r.packages.packages.abc
    ## Read in the prior
    ## Return the prior as a huge df
    PRIOR = pd.read_csv(REF_TABLE, sep="\t")
    ## cv4abc(param, sumstat, ...): leave-NCV-out CV by rejection sampling
    ## evaluated at two tolerances (1% and 5%)
    cvresults = abc.cv4abc(PRIOR[param], PRIOR[model_args[model]], nval=NCV, tols=robjects.r["c"](0.01, 0.05), method="rejection")
    ## results_dict[model][param] = cvresults
    return cvresults
## Three options
## ONEPER - Randomly select only one line per file (one sample from the
## entire temporal history of the community).
## REGULARIZE - Normalize all pi/dxy bins so they are proportional and not counts.
## This normalization routine makes loading the file much slower.
## DROPEQUILIBSIMS - Remove all the extra simulations at equilibrium because they bias the prior.
def gather_simouts(SIMOUT_DIR, oneper=True, regularize=False, subset=False):
    """Aggregate per-simulation sumstats.txt files into one ABC reference table.

    Parameters
    ----------
    SIMOUT_DIR : str
        Directory containing per-simulation subdirectories, each holding a
        ``sumstats.txt`` file. Must end with a path separator ("/"), since
        the glob pattern is built by string concatenation.
    oneper : bool
        If True, randomly keep only one line (one temporal sample) per file.
    regularize : bool
        If True, normalize the histogram bins (columns 8+) to proportions
        instead of raw counts. Slower; writes "normed_priors.txt".
    subset : bool
        If True, stop after ~20000 simulations (small dataset for testing).

    Returns
    -------
    str
        Path of the reference table written inside SIMOUT_DIR.
    """
    ## Simulations at equilibrium (column 4 == "1") are always dropped
    ## because they are over-represented and bias the prior.
    DROPEQUILIBSIMS = True

    filename = "normed_priors.txt" if regularize else "priors.txt"
    REF_TABLE = os.path.join(SIMOUT_DIR, filename)

    files = glob.glob(SIMOUT_DIR + "*/sumstats.txt")
    numsims = 0
    ## `with` blocks close every file handle (the old version leaked them)
    with open(REF_TABLE, "w") as outfile:
        ## All sumstats files share a header; copy it from the first one
        with open(files[0]) as infile:
            outfile.write(infile.readlines()[0])
        for f in files:
            with open(f) as infile:
                lines = infile.readlines()[1:]
            new_lines = []
            try:
                for line in lines:
                    ## Drop equilibrium sims (see note above)
                    if line.split()[3] == "1" and DROPEQUILIBSIMS:
                        pass
                    else:
                        new_lines.append(line)
            except Exception:
                ## Narrowed from a bare `except:` so Ctrl-C still works
                print("Error in file {}".format(f))
            ## Only choose one step per simulation?
            if lines and oneper:
                try:
                    new_lines = [random.choice(new_lines)]
                    new_lines = [x.strip() + "\n" for x in new_lines]
                except IndexError:
                    ## File was all sims at equilibrium, so new_lines is empty
                    continue
            ## Do or don't regularize the histograms
            if regularize:
                tmp_lines = []
                for line in new_lines:
                    try:
                        hist = np.array([int(x) for x in line.split()[7:]])
                        if np.sum(hist):
                            tot = np.sum(hist)
                            hist = hist / float(tot)
                        ## List comprehension, not `list + map(...)`, so this
                        ## behaves the same on py2 and py3
                        tmp = line.split()[:7] + [str(x) for x in hist]
                        line = "\t".join(tmp) + "\n"
                        if "NaN" in line:
                            ## Sometimes something screws up so just drop it
                            continue
                        else:
                            tmp_lines.append(line)
                    except Exception:
                        continue
                new_lines = tmp_lines
            ## Don't write blanks
            if new_lines:
                numsims += len(new_lines)
                outfile.write("".join(new_lines))
            if subset and numsims > 20000:
                break
    print("Processed {} simulations.".format(numsims))
    return REF_TABLE
## Build reference tables for both simulation sets: histogram bins kept as
## raw counts (INTS) and normalized to proportions (NORMED)
ONED_PRIOR_INTS = gather_simouts(ONED_DIR, oneper=True, regularize=False, subset=False)
TWOD_PRIOR_INTS = gather_simouts(TWOD_DIR, oneper=True, regularize=False, subset=False)
ONED_PRIOR_NORMED = gather_simouts(ONED_DIR, oneper=True, regularize=True, subset=False)
TWOD_PRIOR_NORMED = gather_simouts(TWOD_DIR, oneper=True, regularize=True, subset=False)
Processed 14402 simulations. Processed 14400 simulations. Processed 14402 simulations. Processed 14400 simulations.
## For the impatient, do them all at once forking w/ ipcluster
## Toggle between the normalized (proportions) and raw-count reference tables
DO_NORMED = False
if DO_NORMED:
    ONED_PRIOR = ONED_PRIOR_NORMED
    TWOD_PRIOR = TWOD_PRIOR_NORMED
else:
    ONED_PRIOR = ONED_PRIOR_INTS
    TWOD_PRIOR = TWOD_PRIOR_INTS
## Model names and, per model, the summary-statistic columns each one uses.
## 1-d models draw on the 10 "bin_N" columns; 2-d (metacommunity) models use
## the 10x10 "bin_N_M" grid; "a"-models additionally include shannon (H').
models = ["Ma", "Mi", "Mai", "Mmi", "Mami"]
_oned_bins = ["bin_%d" % i for i in range(10)]
_twod_bins = ["bin_{}_{}".format(i, j) for i in range(10) for j in range(10)]
model_args = {
    "Ma": ["shannon"],
    "Mi": list(_oned_bins),
    "Mai": ["shannon"] + _oned_bins,
    "Mmi": list(_twod_bins),
    "Mami": ["shannon"] + _twod_bins,
}
## Parameters we attempt to estimate via ABC
params = ["K", "c", "%equil", "colrate", "extrate", "shannon"]
## Create the dict for holding results
cv_jobs = {}
alljobs = []
for model in models:
    ## Select the right prior for each model
    if "m" in model:
        PRIOR = TWOD_PRIOR
    else:
        PRIOR = ONED_PRIOR
    cv_jobs[model] = {}
    for param in params:
        ## H' (shannon) is already a summary statistic for the "a" models
        ## (see model_args), so estimating it would be circular — skip it.
        if "a" in model and param == "shannon":
            print("Skipping H' estimation for model - {}".format(model))
            continue
        ## print("Doing {} {}".format(model, param))
        ## Dispatch each CV run (100 replicates) to the load-balanced view
        cv_jobs[model][param] = thview.apply(call_abc, *[PRIOR, model, param, model_args, 100])
        alljobs.append(cv_jobs[model][param])
## Wait for all jobs to finish
start = time.time()
while 1:
    total = len(alljobs)
    ## One ready-flag per submitted AsyncResult
    ready = [i.ready() for i in alljobs]
    printstr = ' waiting | {} |'
    elapsed = datetime.timedelta(seconds=int(time.time()-start))
    ## Update the ipyrad progress bar with elapsed wall-clock time
    progressbar(total, sum(ready), printstr.format(elapsed))
    time.sleep(0.1)
    if all(ready):
        break
print("done")
Skipping H' estimation for model - Ma Skipping H' estimation for model - Mai Skipping H' estimation for model - Mami [####################] 100% waiting | 0:09:08 | done
## Collect results for each model x param combination submitted above
results_dict = {}
for model in models:
    per_param = {}
    for param in params:
        try:
            job = cv_jobs[model][param]
            res = job.result()
        except Exception as inst:
            ## Skipped combinations (shannon for "a" models) were never
            ## submitted, so the cv_jobs lookup raises KeyError here
            print("{} {} {}".format(model, param, inst))
        else:
            per_param[param] = res
            print("{}\t{}\t{}\t{}".format(job.elapsed, model, param, type(res)))
    results_dict[model] = per_param
28.44034 Ma K <class 'rpy2.robjects.vectors.ListVector'> 29.750973 Ma c <class 'rpy2.robjects.vectors.ListVector'> 31.639767 Ma %equil <class 'rpy2.robjects.vectors.ListVector'> 28.602762 Ma colrate <class 'rpy2.robjects.vectors.ListVector'> 35.279198 Ma extrate <class 'rpy2.robjects.vectors.ListVector'> Ma shannon 'shannon' 131.953651 Mi K <class 'rpy2.robjects.vectors.ListVector'> 124.031002 Mi c <class 'rpy2.robjects.vectors.ListVector'> 128.322833 Mi %equil <class 'rpy2.robjects.vectors.ListVector'> 136.009594 Mi colrate <class 'rpy2.robjects.vectors.ListVector'> 131.722548 Mi extrate <class 'rpy2.robjects.vectors.ListVector'> 132.030218 Mi shannon <class 'rpy2.robjects.vectors.ListVector'> 138.849965 Mai K <class 'rpy2.robjects.vectors.ListVector'> 138.695742 Mai c <class 'rpy2.robjects.vectors.ListVector'> 140.473577 Mai %equil <class 'rpy2.robjects.vectors.ListVector'> 140.725536 Mai colrate <class 'rpy2.robjects.vectors.ListVector'> 136.733815 Mai extrate <class 'rpy2.robjects.vectors.ListVector'> Mai shannon 'shannon' 541.832835 Mmi K <class 'rpy2.robjects.vectors.ListVector'> 538.43853 Mmi c <class 'rpy2.robjects.vectors.ListVector'> 535.762794 Mmi %equil <class 'rpy2.robjects.vectors.ListVector'> 537.607167 Mmi colrate <class 'rpy2.robjects.vectors.ListVector'> 542.322924 Mmi extrate <class 'rpy2.robjects.vectors.ListVector'> 541.016852 Mmi shannon <class 'rpy2.robjects.vectors.ListVector'> 547.183046 Mami K <class 'rpy2.robjects.vectors.ListVector'> 543.46154 Mami c <class 'rpy2.robjects.vectors.ListVector'> 547.542364 Mami %equil <class 'rpy2.robjects.vectors.ListVector'> 548.866815 Mami colrate <class 'rpy2.robjects.vectors.ListVector'> 546.11972 Mami extrate <class 'rpy2.robjects.vectors.ListVector'> Mami shannon 'shannon'
from rpy2.robjects import r, pandas2ri
pandas2ri.activate()
## Convert the R cv4abc results into plain numpy arrays of true vs estimated
cv_dict = {}
for model in models:
    cv_dict[model] = {}
    for param in params:
        ## These combinations were never run (see submission loop)
        if "a" in model and param == "shannon":
            print("Skipping H' estimation for model - {}".format(model))
            continue
        cv_dict[model][param] = {}
        cvresults = results_dict[model][param]
        ## Pull the 'true' and 'estim' components out of the R ListVector by name
        df_true = pandas2ri.ri2py(cvresults[cvresults.names.index('true')])
        df_est = pandas2ri.ri2py(cvresults[cvresults.names.index('estim')])
        #df_est = pandas2ri.ri2py(cvresults[cvresults.names.index('')])
        tru = np.array(df_true).flatten()
        ## 'estim' holds one vector per tolerance; keep only the first 100
        ## values (one per CV replicate) — presumably the first tolerance.
        est = np.array(df_est).flatten()[:100]
        cv_dict[model][param]["true"] = tru
        cv_dict[model][param]["estim"] = est
        ## NOTE(review): this rebinds `r`, shadowing rpy2.robjects.r imported above
        rms, r = get_stats(tru, est)
        cv_dict[model][param]["rms"] = rms
        cv_dict[model][param]["r"] = r
Skipping H' estimation for model - Ma Skipping H' estimation for model - Mai Skipping H' estimation for model - Mami
## Print the RMSE and R^2 for each model x param combination.
## Skipped combinations (shannon for the "a" models) are simply absent from
## cv_dict, so the lookup raises KeyError — the only miss we expect; the old
## bare `except:` silently hid every other error too.
print("Model\tParam\tRMS\t\tR")
for model in models:
    for param in params:
        try:
            rms = cv_dict[model][param]["rms"]
            r2 = cv_dict[model][param]["r"] ** 2
        except KeyError:
            continue
        print("{}\t{}\t{}\t{}".format(model, param, format(rms, ".3f"), format(r2, ".3f")))
    print("")
Model Param RMS R Ma K 2017.611 0.481 Ma c 0.015 0.003 Ma %equil 0.176 0.577 Ma colrate 0.014 0.009 Ma extrate 0.003 0.327 Mi K 1073.387 0.855 Mi c 0.005 0.872 Mi %equil 0.161 0.590 Mi colrate 0.005 0.858 Mi extrate 0.002 0.812 Mi shannon 0.222 0.875 Mai K 1177.680 0.797 Mai c 0.005 0.880 Mai %equil 0.138 0.754 Mai colrate 0.006 0.832 Mai extrate 0.002 0.800 Mmi K 1320.739 0.771 Mmi c 0.006 0.869 Mmi %equil 0.230 0.382 Mmi colrate 0.005 0.908 Mmi extrate 0.002 0.704 Mmi shannon 0.323 0.758 Mami K 1148.946 0.815 Mami c 0.005 0.914 Mami %equil 0.184 0.692 Mami colrate 0.006 0.831 Mami extrate 0.002 0.776
## Human-friendly display names: unicode symbols for parameters, matplotlib
## mathtext for model names
pretty_params = {"K":"K", "c":"c", "%equil":u"Λ", "colrate":"c'", "extrate":u"†", "shannon":"H'"}
pretty_models = {"Ma":"$\mathregular{M_a}$", "Mi":"$\mathregular{M_i}$", "Mai":"$\mathregular{M_{ai}}$",\
"Mmi":"$\mathregular{M_{mi}}$", "Mami":"$\mathregular{M_{ami}}$"}
import matplotlib
## Use matplotlib's built-in mathtext renderer rather than a full LaTeX install
matplotlib.rc('text', usetex = False)
def plot_cv(model, cv_dict):
    """Plot true vs. estimated values for every parameter of one model.

    Draws a 2x3 grid of scatterplots (one per entry in ``params``), each with
    a red least-squares regression line and a dashed identity line, then
    saves the figure as ABC-CV-<model>.svg.

    Parameters
    ----------
    model : str
        Model key into cv_dict (e.g. "Ma", "Mami").
    cv_dict : dict
        Nested dict: cv_dict[model][param] -> {"true", "estim", "rms", "r"}.
    """
    f, axarr = plt.subplots(2, 3, figsize=(8, 4), dpi=300)
    ## Flatten the 2x3 array of axes into a simple list
    axarr = [a for b in axarr for a in b]
    for param, ax in zip(params, axarr):
        pname = pretty_params[param]
        ## H' was never estimated for the "a" models, so there's nothing to plot
        if "a" in model and param == "shannon":
            continue
        print(model, param),
        x = cv_dict[model][param]["true"]
        y = cv_dict[model][param]["estim"]
        xmin = ymin = min(x)
        xlim = ylim = max(x)
        ## polyfit/poly1d give the fit function for drawing the regression
        ## line. (An unused linregress call and an unused scipy.sqrt RMSE
        ## computation were removed here — RMSE/R^2 come from cv_dict.)
        fit = np.polyfit(x, y, 1)
        fit_fn = np.poly1d(fit)
        RMSE = "RMSE {0:.4f}".format(cv_dict[model][param]["rms"])
        R2 = "R^2 {0:.4f}".format(cv_dict[model][param]["r"]**2)
        print(RMSE, R2)
        ## Plot the data and the regression line
        ax.scatter(x, y, marker=".", color='w', edgecolors='black', s=10)
        pts = np.linspace(xmin, xlim)
        ax.plot(pts, fit_fn(pts), '-r')
        ## Axis labels (no per-panel titles — they'd be redundant)
        ax.set_xlabel(u"True {}".format(pname), fontsize=12)
        ax.set_ylabel(u"Estimated {}".format(pname), fontsize=12)
        ## Identity line drawn corner-to-corner of the current axes limits
        ax.plot(ax.get_xlim(), ax.get_ylim(), ls="--", c=".3")
        plt.gca().set_facecolor('white')
        ax.grid(False)
        ## Remove the tick marks
        ax.tick_params(axis=u'both', which=u'both', length=0)
    ## Give a little space between plots
    plt.subplots_adjust(wspace=0.6, hspace=0.6)
    plt.suptitle(pretty_models[model], fontsize=13)
    plt.savefig("/home/isaac/gimmeSAD/ABC-CV-{}.svg".format(model))
## Useful matplotlib style gallery
## https://tonysyu.github.io/raw_content/matplotlib-style-gallery/gallery.html
## https://matplotlib.org/examples/style_sheets/style_sheets_reference.html
import matplotlib.style as style
## Generate and save the CV scatterplot grid for every model
for model in models:
    #for model in ["Ma"]:
    plot_cv(model, cv_dict)
    print("")
('Ma', 'K') ('RMSE 2017.6106', 'R^2 0.4811') ('Ma', 'c') ('RMSE 0.0147', 'R^2 0.0027') ('Ma', '%equil') ('RMSE 0.1762', 'R^2 0.5766') ('Ma', 'colrate') ('RMSE 0.0143', 'R^2 0.0088') ('Ma', 'extrate') ('RMSE 0.0034', 'R^2 0.3267') ('Mi', 'K') ('RMSE 1073.3866', 'R^2 0.8547') ('Mi', 'c') ('RMSE 0.0051', 'R^2 0.8718') ('Mi', '%equil') ('RMSE 0.1614', 'R^2 0.5896') ('Mi', 'colrate') ('RMSE 0.0055', 'R^2 0.8579') ('Mi', 'extrate') ('RMSE 0.0025', 'R^2 0.8116') ('Mi', 'shannon') ('RMSE 0.2221', 'R^2 0.8749') ('Mai', 'K') ('RMSE 1177.6802', 'R^2 0.7969') ('Mai', 'c') ('RMSE 0.0049', 'R^2 0.8804') ('Mai', '%equil') ('RMSE 0.1381', 'R^2 0.7542') ('Mai', 'colrate') ('RMSE 0.0058', 'R^2 0.8324') ('Mai', 'extrate') ('RMSE 0.0024', 'R^2 0.7999') ('Mmi', 'K') ('RMSE 1320.7387', 'R^2 0.7711') ('Mmi', 'c') ('RMSE 0.0055', 'R^2 0.8688') ('Mmi', '%equil') ('RMSE 0.2297', 'R^2 0.3816') ('Mmi', 'colrate') ('RMSE 0.0051', 'R^2 0.9078') ('Mmi', 'extrate') ('RMSE 0.0019', 'R^2 0.7038') ('Mmi', 'shannon') ('RMSE 0.3227', 'R^2 0.7585') ('Mami', 'K') ('RMSE 1148.9459', 'R^2 0.8153') ('Mami', 'c') ('RMSE 0.0046', 'R^2 0.9138') ('Mami', '%equil') ('RMSE 0.1844', 'R^2 0.6920') ('Mami', 'colrate') ('RMSE 0.0059', 'R^2 0.8307') ('Mami', 'extrate') ('RMSE 0.0022', 'R^2 0.7757')
## Inspect the type of one stored CV result (the output below shows a pandas
## DataFrame here, from a different run state than the ListVectors above)
type(results_dict["Mai"]["c"])
pandas.core.frame.DataFrame
## Scratch cell: run cv4abc directly on a small slice of the 2-d prior.
## NOTE(review): relies on `abc` and `robjects` already being in the notebook
## namespace from an earlier interactive session — they are only imported
## locally inside call_abc above.
PRIOR = pd.read_csv("/home/isaac/gimmeSAD/2d_sims/priors.txt", sep="\t")[:2000]
print(len(PRIOR))
## First 68 summary-statistic columns of the Mami model
prior_args = model_args["Mami"][0:68]
print(prior_args)
## Target column "c" first (param), summary stats second, local-linear regression
cvresults = abc.cv4abc(PRIOR["c"], PRIOR[prior_args], nval=2, tols=robjects.r["c"](0.5), method="loclinear")
print(cvresults)
2000 ['shannon', 'bin_0_0', 'bin_0_1', 'bin_0_2', 'bin_0_3', 'bin_0_4', 'bin_0_5', 'bin_0_6', 'bin_0_7', 'bin_0_8', 'bin_0_9', 'bin_1_0', 'bin_1_1', 'bin_1_2', 'bin_1_3', 'bin_1_4', 'bin_1_5', 'bin_1_6', 'bin_1_7', 'bin_1_8', 'bin_1_9', 'bin_2_0', 'bin_2_1', 'bin_2_2', 'bin_2_3', 'bin_2_4', 'bin_2_5', 'bin_2_6', 'bin_2_7', 'bin_2_8', 'bin_2_9', 'bin_3_0', 'bin_3_1', 'bin_3_2', 'bin_3_3', 'bin_3_4', 'bin_3_5', 'bin_3_6', 'bin_3_7', 'bin_3_8', 'bin_3_9', 'bin_4_0', 'bin_4_1', 'bin_4_2', 'bin_4_3', 'bin_4_4', 'bin_4_5', 'bin_4_6', 'bin_4_7', 'bin_4_8', 'bin_4_9', 'bin_5_0', 'bin_5_1', 'bin_5_2', 'bin_5_3', 'bin_5_4', 'bin_5_5', 'bin_5_6', 'bin_5_7', 'bin_5_8', 'bin_5_9', 'bin_6_0', 'bin_6_1', 'bin_6_2', 'bin_6_3', 'bin_6_4', 'bin_6_5', 'bin_6_6'] $calls $calls$tol0.5 abc(target = target, param = param, sumstat = sumstat, tol = 0.5, subset = subset, method = "loclinear", hcorr = TRUE, transf = "none", logit.bounds = c(0, 0), kernel = NULL) $cvsamples [1] 994 1870 $tols [1] 0.5 $true P1 1 0.02971246 2 0.04268392 $estim $estim$tol0.5 [1] 0.02594129 0.04911940 $names $names$parameter.names [1] "P1" $names$statistics.names [1] "shannon" "bin_0_0" "bin_0_1" "bin_0_2" "bin_0_3" "bin_0_4" "bin_0_5" [8] "bin_0_6" "bin_0_7" "bin_0_8" "bin_0_9" "bin_1_0" "bin_1_1" "bin_1_2" [15] "bin_1_3" "bin_1_4" "bin_1_5" "bin_1_6" "bin_1_7" "bin_1_8" "bin_1_9" [22] "bin_2_0" "bin_2_1" "bin_2_2" "bin_2_3" "bin_2_4" "bin_2_5" "bin_2_6" [29] "bin_2_7" "bin_2_8" "bin_2_9" "bin_3_0" "bin_3_1" "bin_3_2" "bin_3_3" [36] "bin_3_4" "bin_3_5" "bin_3_6" "bin_3_7" "bin_3_8" "bin_3_9" "bin_4_0" [43] "bin_4_1" "bin_4_2" "bin_4_3" "bin_4_4" "bin_4_5" "bin_4_6" "bin_4_7" [50] "bin_4_8" "bin_4_9" "bin_5_0" "bin_5_1" "bin_5_2" "bin_5_3" "bin_5_4" [57] "bin_5_5" "bin_5_6" "bin_5_7" "bin_5_8" "bin_5_9" "bin_6_0" "bin_6_1" [64] "bin_6_2" "bin_6_3" "bin_6_4" "bin_6_5" "bin_6_6" $seed [1] 403 46 1538292208 1399405195 -543460880 456916791 [7] 1936400845 662470726 -1251068691 397726106 -2048932874 1304404113 [13] 
-1349208930 -1927521575 1132655415 1697121592 -2030624449 -399825752 [19] -1299311804 729006655 131526980 -1443401245 802253985 15549514 [25] 200000801 -1283983122 1342490162 -594236723 1431614026 -323382387 [31] -1264324429 2076900404 -773988437 -1052461996 1651715224 -1915576093 [37] -1594125848 1580262783 1147767285 1591422270 1499890229 -2131980414 [43] 2029051278 1202199785 -1876960426 1056891825 1778915423 1754531936 [49] 1828310775 -654368192 -1264166404 902714983 1093652652 599820219 [55] 2070658393 1731200018 -976139271 1149325606 970771450 -1438928335 [61] 2020179876 -1929334554 892428704 -772525326 236214347 239111177 [67] 721745581 708539435 1741608194 167661976 -126411022 232486924 [73] -1979744263 95509435 -108861737 -1399392451 185661408 -1381116446 [79] 469604948 -2054334962 1450452383 550429573 -1210525551 -1093946937 [85] -565594482 2061616516 1680595430 -1389896136 -1446819507 890542503 [91] 1533470475 1042486889 969547900 454003998 -241809384 1362050698 [97] -891792397 148625153 -1385895563 -950362509 1829038858 -1490835104 [103] 329014074 2108873988 -778320175 -1707362125 -1008111265 -377425867 [109] -191017896 1767911546 626117932 2034269126 221320871 -150142563 [115] -759856775 -180579473 -1564196234 -791725108 1900942926 1488063280 [121] -1954970363 1259454815 1581280563 1038080737 -425507180 1865211446 [127] -1418478096 -1908832862 -371292453 715650617 386005405 -1646479141 [133] -109581230 -1246075192 1991723874 -165351012 -182470967 -975013941 [139] 595389799 -52447923 1735768240 -1568135118 -1635249948 1811634302 [145] -1665322449 240145845 142012129 -1178989161 -2099873410 -855514508 [151] -34022314 -166226008 447031389 -1614791529 117041179 1795151641 [157] 952838124 1460829550 1206342952 -1718016710 -637761725 1502605169 [163] 1092242789 2087251427 193782490 1316839376 1314478762 2130409172 [169] -78708895 1802776579 -631907153 -1569440891 713574696 -1579981366 [175] -1623543492 496296566 -1216470345 -2105529651 -724424183 -1114158657 
[181] -650356634 502913532 -1925833730 1343595232 99113749 -1120039345 [187] 819830787 1281645393 -1939402684 -1012749498 -189453888 -1895387502 [193] 2088470635 -1478285591 -1311641139 -1800722485 -850197278 -1999540616 [199] -891773550 385561580 -782507943 -1782819877 -1784025353 659833629 [205] 486813248 1588724290 1613658740 328889262 -1614375361 -1674488475 [211] 144962417 -885902937 -117981714 1090635108 -1872944954 -354632936 [217] -577011667 1821114823 -1525670613 -692002551 869884572 395032382 [223] -1553900104 -1126293014 1263947923 -778475551 1145788437 -224558893 [229] 1400038698 161999104 1436315290 -292830364 2049435569 250982611 [235] 963230783 -753415531 1342896696 -1965290022 1202150284 -455779098 [241] -1628184185 2138708541 964131673 -1018387441 -505670634 -1005947604 [247] 1018115630 678988560 -221585115 1636689023 155288147 1693789825 [253] -1170075212 414210198 -1126874736 463270850 -1292345349 -909924711 [259] 594155965 -1727903365 -1776111694 1175680296 -1953165694 -89870084 [265] 1706359721 1155263595 154878599 1043774893 -1475981936 968785042 [271] 1419034500 -1355736418 1382264783 1956898965 756079041 1758135287 [277] -26423202 -483463980 1109414966 903768840 1216501949 1525285431 [283] 1566803003 -1068627015 1829772428 1497586190 -1033223348 1610805260 [289] -513109776 523286810 -1337852997 1299933563 1351765583 562596893 [295] 1138249818 33690966 204596478 -1458246644 1510787029 -454208511 [301] 1168947545 -1469633925 -75551360 509240 -827694788 1134176126 [307] -98537233 2081601551 1430365787 -439223623 -1439187402 -300783814 [313] 1402500642 1601397912 -85490879 -1066077747 1689809821 537659919 [319] 1588847284 646773908 834607336 -1822061614 1233594595 -154765373 [325] 2000652199 -1895392539 -183760078 967905278 -164159338 1506341924 [331] 169316413 -1815680663 -247369695 -520300989 -879796488 -1331153232 [337] 286472356 621804646 513630343 1609473607 -523387389 -1501464015 [343] -309445890 -2076302990 -546889654 1141468544 131603257 
-1713711931 [349] 1161080533 88730023 1766640380 -2133652100 1231317760 -1646380054 [355] -1619783701 -1037485973 1205571231 1064054125 -2922742 -474158202 [361] -1336386418 860451516 -489562715 -720285903 -1018669079 2134755147 [367] -1219887792 1984452744 -1462972116 -1683608210 708880159 -1462809249 [373] 23228299 1004752521 -1682893754 -700272534 -1935085326 -937389688 [379] -1516964559 929427005 -66808947 1253573503 139533380 1089312676 [385] 1927446552 398163970 -395853197 -1205683501 1660099223 692761365 [391] 2120538242 2118611662 1160976902 118527988 1083872301 -34820999 [397] 361981905 -1640292749 -1248500696 1797727584 -701611916 -556031818 [403] 375890647 -556088521 -263762669 -1634413663 -1329769682 -542050494 [409] 1797205498 -472443952 813579785 1781974549 -911528219 1993855607 [415] 590800172 -1484969172 -2118517296 -375206470 -482368421 280376539 [421] -934495505 -1869852995 -555103494 -1196680266 -1723013794 -197783508 [427] 1815103861 -99512479 -784155015 -1878245 1366359200 1643595608 [433] 1958741916 1069910878 988063119 -155817361 -827009989 548380057 [439] -1976755690 -1165437798 55310082 -600721864 -748693535 -1524434067 [445] 514196285 -1506343825 -1280007532 932678964 1356329224 -1500947598 [451] -1923582013 128429155 -1258242681 45727813 -718990318 178508764 [457] 729927677 -754998446 617326759 269440918 651244701 -1130870298 [463] 1480984945 162628984 -833109067 -65153914 1702048879 -1917684982 [469] 335715025 -239102298 1552854185 -71617636 -936843651 -1180669654 [475] -545026545 -1370657306 1091672477 -2024422922 -663599119 -1741881496 [481] 900508381 1800281118 -1408508625 -62330054 -370343967 -449938898 [487] 373091553 -847078900 151131133 823307442 1726011799 1837761046 [493] 932223453 -549700218 -1367835855 -1088676440 903977237 -909231290 [499] 1606212047 598437162 702568417 -506830346 -1816665399 -316674500 [505] -242938515 -1117410358 -1319320353 726204278 661297997 -749572442 [511] 373649281 1871839976 1391561917 412152922 
-1931302743 -2015970151 [517] 1198754442 1959373024 -997088839 483179657 1778057744 -2070542142 [523] -1677461915 822747337 -2125403662 -1680057412 1807162709 423551017 [529] -2120887772 337814042 -943749279 -1655581279 2063088578 743226784 [535] -1701913287 627117857 -752810120 120028130 705921941 2009138481 [541] 21469962 44606580 1183070893 395956881 503261844 -1382660726 [547] -959190199 213808697 -1760462054 -830528352 235583977 493074633 [553] -153665696 1948256530 -1547738875 -1527456135 1574998754 -767841204 [559] 299192581 -1567901335 -1815699436 1228566666 712812273 1276091761 [565] 1201640002 -179989920 -589044119 -677451663 -2043341256 1767380946 [571] 1935966325 -203501903 821452170 -988840844 1184991245 1698098721 [577] 1061393092 -1401310822 1025763913 1875216601 -522036438 148734848 [583] 1029738073 1026051081 1919097584 293205794 190689637 -1479807223 [589] 781894610 1992261084 349913525 -877001527 1619915972 1230148474 [595] -1727297311 -1325998399 -424633950 -202126400 1305543001 335710177 [601] 1367053432 363533474 1100849749 -504830127 -1083990838 2058213364 [607] -471327507 -1869589967 1215837780 -336906262 -5316823 1775284409 [613] 171856794 247900928 654071401 1454300681 1006195104 -453177454 [619] -2162939 2074019929 455698722 -74926708 1061124997 1375809865 [625] -1678314475 -1739761456 attr(,"class") [1] "cv4abc"
%%R -i PRIOR,prior_args
## Same cross-validation as the R cell at the top of the notebook.
## NOTE(review): cv4abc's first argument is `param`; the summary-stat columns
## are passed there, so the summary reports prediction error per eq.* column —
## confirm this ordering is intended.
cv.res.reg <- cv4abc(data.frame(eq=PRIOR[prior_args]), PRIOR[,7], nval=2, tols=c(.5), method="rejection")
print(summary(cv.res.reg))
#plot(cv.res.reg)
#print(cv.res.reg)
Prediction error based on a cross-validation sample of 2 eq.shannon eq.bin_0_0 eq.bin_0_1 eq.bin_0_2 eq.bin_0_3 eq.bin_0_4 0.5 0.1710794 0.3125000 eq.bin_0_5 eq.bin_0_6 eq.bin_0_7 eq.bin_0_8 eq.bin_0_9 eq.bin_1_0 0.5 0.2699704 eq.bin_1_1 eq.bin_1_2 eq.bin_1_3 eq.bin_1_4 eq.bin_1_5 eq.bin_1_6 0.5 0.1448137 0.2500000 eq.bin_1_7 eq.bin_1_8 eq.bin_1_9 eq.bin_2_0 eq.bin_2_1 eq.bin_2_2 0.5 0.5000000 0.5918367 1.0000000 eq.bin_2_3 eq.bin_2_4 eq.bin_2_5 eq.bin_2_6 eq.bin_2_7 eq.bin_2_8 0.5 Inf eq.bin_2_9 eq.bin_3_0 eq.bin_3_1 eq.bin_3_2 eq.bin_3_3 eq.bin_3_4 0.5 1.0000000 1.0000000 eq.bin_3_5 eq.bin_3_6 eq.bin_3_7 eq.bin_3_8 eq.bin_3_9 eq.bin_4_0 0.5 eq.bin_4_1 eq.bin_4_2 eq.bin_4_3 eq.bin_4_4 eq.bin_4_5 eq.bin_4_6 0.5 eq.bin_4_7 eq.bin_4_8 eq.bin_4_9 eq.bin_5_0 eq.bin_5_1 eq.bin_5_2 0.5 1.0000000 eq.bin_5_3 eq.bin_5_4 eq.bin_5_5 eq.bin_5_6 eq.bin_5_7 eq.bin_5_8 0.5 eq.bin_5_9 eq.bin_6_0 eq.bin_6_1 eq.bin_6_2 eq.bin_6_3 eq.bin_6_4 0.5 eq.bin_6_5 eq.bin_6_6 0.5
import seaborn as sns
## 2-D kernel density of true vs estimated values for parameter "c".
## NOTE(review): `model` is whatever value was left over from the last
## plotting loop (its final element, "Mami") — set it explicitly if a
## different model is wanted.
x = cv_dict[model]["c"]["true"]
y = cv_dict[model]["c"]["estim"]
# cubehelix
ax = sns.kdeplot(x, y, n_levels=100, shade=True, shade_lowest=True, cmap="viridis")
ax.set_facecolor("black")
ax.grid(False)
## Identity line from corner to corner of the axes limits
ax.plot(ax.get_xlim(), ax.get_ylim(), ls="-", c="white", linewidth=1)
[<matplotlib.lines.Line2D at 0x7fdad2133690>]
## Reconnect to the default ipcluster profile. (The traceback below shows no
## controller was running at the time, so this attempt failed.)
ipyclient = ipp.Client()
print(len(ipyclient), 'cores')
Waiting for connection file: ~/.ipython/profile_default/security/ipcontroller-client.json
IOErrorTraceback (most recent call last) <ipython-input-8-33c416d884f1> in <module>() ----> 1 ipyclient = ipp.Client() 2 print(len(ipyclient), 'cores') /home/isaac/miniconda2/lib/python2.7/site-packages/ipyparallel/client/client.pyc in __init__(self, url_file, profile, profile_dir, ipython_dir, context, debug, sshserver, sshkey, password, paramiko, timeout, cluster_id, **extra_args) 395 no_file_msg, 396 ]) --> 397 raise IOError(msg) 398 if url_file is None: 399 raise IOError(no_file_msg) IOError: Connection file '~/.ipython/profile_default/security/ipcontroller-client.json' not found. You have attempted to connect to an IPython Cluster but no Controller could be found. Please double-check your configuration and ensure that a cluster is running.