This notebook is complementary to Bergen et al. (2021), *RNA velocity: Current challenges and future perspectives*.
The following examples are provided to discuss potential pitfalls:
In the manuscript, for each example, we provide guidance on how the ensuing challenges may be addressed.
import numpy as np
import pandas as pd
import scvelo as scv
%load_ext autoreload
%autoreload 2
scv.set_figure_params(dpi=100, fontsize=16, facecolor='none') # set dpi=200 to generate high-res figures
scv.settings.verbosity = 1 # set to 3 to see more information
scv.logging.print_version()
Running scvelo 0.2.4.dev56+g12a5e9c (python 3.8.0) on 2021-08-25 08:29.
As described in the seminal works (La Manno et al., 2018; Bergen et al., 2020), some genes show multiple kinetic regimes across subpopulations and lineages (Fig. 2a). These regimes can be governed by variations in kinetic rates and manifest as multiple trajectories in phase space. Additionally, subpopulations frequently appear as a straight line in phase portraits because the expression kinetics are only partially observed. This lack of curvature challenges current models when they must determine whether to fit an up- or a down-regulation. In fact, it reveals another fundamental assumption for extracting dynamic information from spliced and unspliced abundances: the expression kinetics must be sufficiently observed and must occur on a time scale comparable to that of the differentiation process. A time-scale mismatch leads to phase portraits that show a noisy blob or a straight line rather than an interpretable curvature.
Proposed Solution: State-dependent rates, multi-modal omics (see manuscript).
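The partial-observation issue can be illustrated without any data. The following is a minimal sketch, assuming the standard constant-rate splicing model (du/dt = alpha - beta * u, ds/dt = beta * u - gamma * s) during induction from u(0) = s(0) = 0. The rate values, the time windows, and the helper names `splicing_kinetics` and `chord_deviation` are illustrative assumptions and are not taken from the manuscript or from scVelo. The sketch compares how far the phase trajectory bends away from a straight line when the full induction is observed versus only a short window of it.

```python
import numpy as np


def splicing_kinetics(t, alpha=1.0, beta=1.0, gamma=0.5):
    """Analytical induction solution with u(0) = s(0) = 0 (requires beta != gamma)."""
    u = alpha / beta * (1 - np.exp(-beta * t))
    s = (alpha / gamma * (1 - np.exp(-gamma * t))
         + alpha / (gamma - beta) * (np.exp(-gamma * t) - np.exp(-beta * t)))
    return u, s


def chord_deviation(u, s):
    """Largest perpendicular distance from the chord joining the first and
    last point of the trajectory, relative to the chord length.
    Values near 0 mean the phase portrait is close to a straight line."""
    du, ds = u[-1] - u[0], s[-1] - s[0]
    chord = np.hypot(du, ds)
    dist = np.abs((u - u[0]) * ds - (s - s[0]) * du) / chord
    return dist.max() / chord


# Fully observed induction versus a short, partially observed window
# (window boundaries are arbitrary choices for illustration).
full = chord_deviation(*splicing_kinetics(np.linspace(0.0, 10.0, 500)))
partial = chord_deviation(*splicing_kinetics(np.linspace(1.0, 1.5, 500)))

print(f"relative curvature, full window:    {full:.3f}")
print(f"relative curvature, partial window: {partial:.3f}")
```

Under these assumptions the short window yields a much smaller deviation than the full window, so a model has little curvature from which to decide whether an induction or a repression is being observed.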
Reproducing Suppl. Fig. 7 from La Manno et al. (Nature, 2018).
adata = scv.datasets.dentategyrus_lamanno()
scv.pl.scatter(adata, legend_fontsize=12, legend_align_text=True, title='', save='DG_large.png')
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
saving figure to file ./figures/scvelo_DG_large.png
var_names = ['Ntrk2', 'Meg3', 'Syngr1']
scv.pp.filter_and_normalize(adata, min_shared_counts=30, n_top_genes=2000, retain_genes=var_names)
scv.pp.moments(adata, n_neighbors=100)
scv.tl.velocity(adata)
kwargs = dict(linecolor='orange', s=30, linewidth=2, frameon='artist', dpi=80)
scv.pl.scatter(adata, basis=['tsne'] + var_names, vkey='velocity', **kwargs)
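The effect of multiple kinetic regimes on a single per-gene fit can also be sketched on synthetic data. The example below is a simplified illustration and not a reimplementation of `scv.tl.velocity`: it assumes a deterministic steady-state model in which one ratio `gamma_hat` per gene is estimated by least squares through the origin, and velocity is the residual `u - gamma_hat * s`. The two lineages, their rates (0.3 and 0.9), and the cell counts are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 200

# Two hypothetical lineages, both exactly at steady state (u = gamma * s),
# but with different degradation-to-splicing ratios.
s_a = rng.uniform(0.5, 2.0, n_cells)
u_a = 0.3 * s_a
s_b = rng.uniform(0.5, 2.0, n_cells)
u_b = 0.9 * s_b

s = np.concatenate([s_a, s_b])
u = np.concatenate([u_a, u_b])

# One steady-state ratio for the whole gene: least squares through the origin.
gamma_hat = (u @ s) / (s @ s)

# Residuals from the fitted line are interpreted as velocities.
velocity = u - gamma_hat * s

print(f"fitted gamma: {gamma_hat:.3f}")
print(f"mean velocity, lineage A: {velocity[:n_cells].mean():.3f}")
print(f"mean velocity, lineage B: {velocity[n_cells:].mean():.3f}")
```

Because the fitted ratio falls between the two true ratios, every cell of one lineage receives a negative velocity and every cell of the other a positive one, although neither lineage is changing. This is one way in which a single set of rates per gene can mislead when kinetics differ across lineages.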