This notebook creates a jackknifed PCoA plot based on multiple rarefactions.
from emperor import Emperor, nbinstall
nbinstall()
from skbio.stats.ordination import pcoa
from skbio.diversity import beta_diversity
from biom import load_table
import pandas as pd
import numpy as np
def load_mf(fn, index='#SampleID'):
    """Load a tab-separated mapping file as strings, indexed by sample ID."""
    _df = pd.read_csv(fn, sep='\t', dtype=str, keep_default_na=False, na_values=[])
    _df.set_index(index, inplace=True)
    return _df
We are going to load data from Fierer et al. 2010. The data was retrieved from study 232 in Qiita; you need to be logged in to access the study.
mf = load_mf('keyboard/mapping-file.txt')
bt = load_table('keyboard/otu-table.biom')
Create 5 rarefactions, compute the Jaccard distance matrix for each resulting table, and run PCoA on each matrix.
ordinations = []
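for r in range(5):
    # randomly subsample each sample to 1000 sequences
    rarefied = bt.subsample(1000)
    # build a samples x features count matrix, one row per sample ID
    data = np.array([rarefied.data(i) for i in rarefied.ids()], dtype='int64')
    # Jaccard distances between samples, then PCoA on the distance matrix
    res = pcoa(beta_diversity('jaccard', data, rarefied.ids()))
    ordinations.append(res)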
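To make the two steps in the loop concrete, here is a minimal standard-library sketch of rarefying a count vector and computing a Jaccard distance between two samples. It is an illustration only, not what `biom` or `scikit-bio` do internally: the helper names `rarefy` and `jaccard`, the toy counts, and the choice to treat Jaccard as a presence/absence metric are all assumptions made for this example.

```python
import random


def rarefy(counts, depth, rng):
    """Draw `depth` observations without replacement from a count vector."""
    if sum(counts) < depth:
        raise ValueError("sample has fewer observations than the requested depth")
    # Expand the counts into one entry per observation, labelled by feature index.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    out = [0] * len(counts)
    for i in rng.sample(pool, depth):
        out[i] += 1
    return out


def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    a = {i for i, c in enumerate(u) if c > 0}
    b = {i for i, c in enumerate(v) if c > 0}
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty samples are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


rng = random.Random(0)          # fixed seed so the toy example is repeatable
sample_a = [50, 30, 0, 20, 0]   # hypothetical feature counts
sample_b = [0, 40, 10, 0, 50]

rar_a = rarefy(sample_a, 10, rng)
rar_b = rarefy(sample_b, 10, rng)

print(sum(rar_a), sum(rar_b))       # both rarefied samples have depth 10
print(jaccard(sample_a, sample_b))  # one shared feature out of five
```

Because each rarefaction is a random draw, rare features can appear in one replicate and vanish in another, which is why the distance matrices (and therefore the ordinations) differ between iterations of the loop.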
Jackknifed plots need a master set of coordinates around which the rest of the matrices will be centered. Here the first ordination serves as the master.
master = ordinations[0]
jackknifed = ordinations[1:]
If you want to share your notebook via GitHub, use remote=True and make sure you share the notebook using nbviewer.

Change the jackknifing method to standard deviation:
viz = Emperor(master, mf, jackknifed=jackknifed, remote=False)
viz.jackknifing_method = 'sdev'
viz
Change the jackknifing method to interquartile ranges:
viz.jackknifing_method = 'IQR'
viz
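To give a feel for how the two settings differ, the sketch below computes a standard deviation and an interquartile range over the coordinates of a single sample across replicates. It is not Emperor's implementation: the helper names `iqr` and `sdev`, the made-up coordinate values, and the quantile method are assumptions chosen for illustration.

```python
import statistics


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def sdev(values):
    """Sample standard deviation."""
    return statistics.stdev(values)


# Hypothetical PC1 coordinates of one sample across five rarefied replicates.
pc1 = [1.0, 2.0, 3.0, 4.0, 5.0]
# Same replicates, except one lands far away from the others.
pc1_outlier = [1.0, 2.0, 3.0, 4.0, 50.0]

print(iqr(pc1), sdev(pc1))
print(iqr(pc1_outlier), sdev(pc1_outlier))
```

In this toy case the interquartile range is unchanged by the single distant replicate, while the standard deviation grows considerably. With only a handful of rarefactions, that difference in sensitivity to outlying replicates is the main thing to keep in mind when choosing between the two summaries.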