#!/usr/bin/env python
# coding: utf-8
# Make sure you deleted all old version of `dynamo` and installed the newest one from github. The version I used to generate this tutorial is `0.0+89e5bb9.dirty` (see below). `dynamo` is still under active development, but it will be submitted to `PyPi` in one month or so.
# In[1]:
from IPython.core.display import display, HTML
display(HTML(""))
get_ipython().run_line_magic('matplotlib', 'inline')
import dynamo as dyn
dyn.__version__
# ## Dynamo supports a variety of experimental setups
# | Experiment type | Estimation method | Description (has splicing data) | Description (no splicing data) | Parameters suitable to estimate |
# | --- | --- | --- | --- | --- |
# | Kinetics:
multiple time point labeling;
lasts several hours | stochastic (moment equations) | using first/second moments of $uu, ul, su, sl$ (`unspliced unlabeled`, `unspliced labeled`, `spliced unlabled`, `spliced labeled`) to estimate $\alpha, \beta, \gamma$ (`transcription`, `splicing` and `degradataion rate`) jointly.| using first/second moments of $U (uu + su), L (ul + sl) $ (`Unlabeled` or `labeled`) to estimate $\alpha, \gamma$ jointly.| $\beta, \gamma$ |
# | Kinetics:
multiple time point labeling;
lasts several hours | deterministic (ODEs) | 1. use $uu$ over $t$ (time) to estimate $\beta$
2. use the estimated $\beta$ and $ul$ over $t$ to estimate $\alpha$. 3. use the estimated $\beta$ and $uu, su$ over $t$ to estimate $\gamma$
| 1. use $U$ over $t$ to estimate $\gamma$.
2. use the estimated $\gamma$ and $L$ over $t$ to estimate $\alpha$. | $\beta, \alpha$ |
# | one-shot:
single time point labeling;
short time | deterministic (ODEs) | a. if $\beta, \gamma$ are already estimated by a degradation experiment, $\alpha$ is estimated by $ul$ over $t$ and $\beta$.
b. if $\beta, \gamma$ are not estimated, use $uu$ over $t$ to solve for $\beta$ first and then the solved $\beta$ and the $uu$ over $t$ to solve for $\alpha$. | a. if $\gamma$ are already estimated by a degradation experiment, $\alpha$ is estimated by $L$ over $t$ and the estimated $\gamma$.
b. if $\gamma$ are not estimated, use $U$ over $t$ to solve for $\gamma$ first and then the solved $\gamma$ and the $L$ over $t$ to solve for $\alpha$. | $\alpha$. |
# | degradation:
pulse-chase;
can chase more than 1 day | deterministic (ODEs)| 1. use $ul$ over $t$ to estimate $\beta$, the estimated $\beta$ and $ul, sl$ over $t$ to estimate $\gamma$.
2. use the estimated $\beta$ and $uu$ over $t$ to estimate $\alpha$. | 1. use $L$ over $t$ to estimate $\gamma$
2. use $U$ over $t$ to estimate $\alpha$ | $\gamma$ |
# | no metabolic labeling:
regular scRNA-seq data | steady state assumption: $\frac{ds}{dt} = \beta u - \gamma s = 0$ | assume $\beta$ is 1 and then estimate $\gamma$ by using extreme data points of $u$ (`unspliced mRNA`) and $s$ (`spliced mRNA`). | None | All parameters are relative |
# | cite-seq/reap-seq, etc:
RNA/protein coassay | steady state assumption: $\frac{dp}{dt} = \eta s - \delta s = 0$ | assume $\eta$ is 1 and then estimate $\delta$ by using extreme data points of $s$ (`spliced mRNA`) and $p$ (`protein`). | None | All parameters are relative |
# for details related to moment equations, see the full derivation here: https://github.com/aristoteleo/dynamo-notebooks/blob/master/full_derivation.pdf
# for details related to others strategies, see the https://github.com/aristoteleo/dynamo-notebooks/blob/master/velocity_demo/dynamo_fitting_demo.ipynb
#
# We used Gillespie simulation to simulate the cell fate bifurcation. Details can be refered to `supplementary figure 1` of our preprint (https://www.biorxiv.org/content/10.1101/696724v1.full).
#
#
#
# We first simulate cells to achieve a progenitor steady state. Then we simulate the bifurcation of cell fates into different terminal cell states because of drug treatment or external signals. Next we simulate a metabolic labeling experiment and synthesize kinetics experiment data at each checkpoint. Finally, we stimulate a degradation experiment at the begining or end of the cell differentiation experiment. This setup is typical for a labeling experiment.
#
# In[2]:
# Gillespie simulation of a regular labeling experiment for a cell fate bifurcation process.
adata_dif, adata_dif_nosplicing = dyn.sim.Gillespie(method='differentiation', verbose=False)
# In[3]:
# Now let get the kinetics experiment data at check point t_5, the degradation experiment at the beginning or the end of the differentiation process.
adata_dif_kin, adata_dif_deg_begin, adata_dif_deg_end = adata_dif[adata_dif.obs.experiment_type == 'kin_t_5', :], \
adata_dif[adata_dif.obs.experiment_type == 'deg_beign', :], \
adata_dif[adata_dif.obs.experiment_type == 'deg_end', :]
# In[4]:
# Let us show Dynamo's fitting for the beginning of a degradation experiment
adata_dif_deg_begin = dyn.tl.dynamics(adata_dif_deg_begin, experiment_type='deg', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_deg_begin, vkey=adata_dif_deg_begin.var_names, unit='hours')
# from above, we can see that the fitting for `ul` and `sl` are very good but the fitting for `uu` and `su` are not as good. This is because degradation experiment is ideal for estimating $\gamma$ and $\beta$. Note that in real experiments, degradation experiment will chase for up to one day or more, while $\beta$ is often related to just a few minutes so in order to enable accurate $\beta$ estimation, the degradation experiment may need to include a fast labeling period at the beginning, for example including labeling period that separate by just a few minutes or so.
# In[5]:
# moment model aims to learn all parameters jointly and also tries to estimate the RNA bursting related parameters, $a, b, \alpha_i, \alpha_a$. See https://github.com/aristoteleo/dynamo-notebooks/blob/master/full_derivation.pdf.
adata_dif_kin = dyn.tl.dynamics(adata_dif_kin, mode='moment', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_kin, vkey=adata_dif_kin.var_names, unit='hours')
# In[6]:
# kinetics: the data for this mode is the same as the moment approach but it uses least sqaure fitting of the determinstic ODE solutions for fitting.
adata_dif_kin = dyn.tl.dynamics(adata_dif_kin, experiment_type='kin', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_kin, vkey=adata_dif_kin.var_names, unit='hours')
# From above, we can see that the $uu, ul$ fitting are pretty good. This is because the kinetics experiment is good for estimating parameters $\beta, \alpha$. `su` fitting looks good here but in general kinetics experiment only involves at most a few hours, the degradation of mRNA can be up to one or two days, so in pratice the fitting `su` of certain genes in real data may be less optimal. Finally the `sl` fitting is not great, this is because it is involves all parameters $\alpha, \beta, \gamma$ to fit the data and any uncertainty from those parameters can affect the fitting.
# In[7]:
# one_shot. One shot experiment is great for estimating $\alpha$ at different time points, especially if we combined with a degradation experiment and assume constant $\beta, \gamma$ through the entire period.
tmp = dyn.tl.dynamics(adata_dif_kin[adata_dif_kin.obs.time == 0.8, :], experiment_type='one_shot', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(tmp, vkey=adata_dif_kin.var_names, unit='hours')
# # the following are similar analysis if we don't have splicing data
# This is normally true because spliting U to uu, su and L to ul, sl may be difficult especially for some transcription which already have low expression. The discussion above will be generally applicable below but just avoid the consideration of splicing.
# In[8]:
adata_dif_kin_nosplicing, adata_dif_deg_begin_nosplicing, adata_dif_deg_end_nosplicing = adata_dif_nosplicing[adata_dif_nosplicing.obs.experiment_type == 'kin_t_5', :], \
adata_dif_nosplicing[adata_dif_nosplicing.obs.experiment_type == 'deg_beign', :], \
adata_dif_nosplicing[adata_dif_nosplicing.obs.experiment_type == 'deg_end', :]
# In[9]:
adata_dif_deg_begin_nosplicing
# In[10]:
# Degradation
import dynamo as dyn
adata_dif_deg_begin_nosplicing = dyn.tl.dynamics(adata_dif_deg_begin_nosplicing, experiment_type='deg', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_deg_begin_nosplicing, vkey=adata_dif_deg_begin_nosplicing.var_names, unit='hours')
# In[11]:
# moment
adata_dif_kin_nosplicing = dyn.tl.dynamics(adata_dif_kin_nosplicing, mode='moment', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_kin_nosplicing, vkey=adata_dif_kin_nosplicing.var_names, unit='hours')
# In[12]:
# kinetics
adata_dif_kin_nosplicing = dyn.tl.dynamics(adata_dif_kin_nosplicing, experiment_type='kin', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(adata_dif_kin_nosplicing, vkey=adata_dif_kin_nosplicing.var_names, unit='hours')
# In[13]:
# one_shot
tmp = dyn.tl.dynamics(adata_dif_kin_nosplicing[adata_dif_kin_nosplicing.obs.time == 0.8, :], experiment_type='one_shot', filter_gene_mode='no', tkey='time')
dyn.pl.dynamics(tmp, vkey=adata_dif_kin_nosplicing.var_names, unit='hours')
# In[ ]: