import episcanpy.api as epi
The first step is to download the test dataset from: https://drive.google.com/open?id=17FoiBCFX5Gv4RQYX1nS53QPEOI-U6AQN
Then set the path to the directory that contains the test dataset.
path_to_play_data = '../ATAC_play_data/'
file_annot_name = "cortex_enhancer.bed"
# List of the BAM files (one per cell) to build a count matrix from
list_cells = ['AGCGATAGAACGAATTCGACTCGTATCACAGGACGT.bam',
              'AGCGATAGAACGAATTCGCCGACTCCAAAGGCGAAG.bam',
              'AGCGATAGAACGAATTCGCATATCCTATGGCTCTGA.bam',
              'AGCGATAGAACGAATTCGACTCGTATCAAGGCGAAG.bam',
              'AGCGATAGAACGAATTCGACCTACGCCAGGCTCTGA.bam'
              ]
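Before building the matrix, it can help to confirm that every listed BAM file is present in the data directory. The following is a minimal sketch using only the standard library; `find_missing_files` is a hypothetical helper introduced for illustration and is not part of episcanpy.

```python
import os

def find_missing_files(directory, file_names):
    """Return the names in file_names that do not exist inside directory.

    Hypothetical helper for illustration only (not an episcanpy function).
    """
    return [name for name in file_names
            if not os.path.isfile(os.path.join(directory, name))]

# Example usage with the variables defined above:
# missing = find_missing_files(path_to_play_data, list_cells)
# if missing:
#     print("Missing BAM files:", missing)
```

An empty list means every file was found, so the count matrix can be built for all listed cells.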
Load the annotation file (peaks or enhancers) with the right set of chromosome names.
enhancers = epi.ct.load_features(file_annot_name)
enhancer_names = epi.ct.name_features(enhancers)
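To make the loading and naming step more concrete, here is a rough standard-library sketch of what reading a BED annotation and naming its features could look like. It is not episcanpy's implementation: the functions `load_bed_features` and `name_bed_features`, the per-chromosome dictionary layout, and the `chrom_start_end` naming scheme are all assumptions made for illustration.

```python
from collections import defaultdict

def load_bed_features(lines):
    """Group BED intervals by chromosome as sorted [start, end] pairs.

    Assumption: each line holds at least three whitespace-separated
    fields (chromosome, start, end); extra columns are ignored.
    """
    features = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        features[chrom].append([int(start), int(end)])
    for chrom in features:
        features[chrom].sort()
    return dict(features)

def name_bed_features(features):
    """Build one 'chrom_start_end' label per feature (hypothetical naming scheme)."""
    return [f"{chrom}_{start}_{end}"
            for chrom in sorted(features)
            for start, end in features[chrom]]

# Toy annotation with three intervals on two chromosomes.
example = ["chr1\t100\t200", "chr2\t50\t80", "chr1\t10\t60"]
toy_features = load_bed_features(example)
toy_names = name_bed_features(toy_features)
```

In this sketch the names follow the same order as the features, which is what allows them to serve as column headers for a count matrix.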
Let's now generate the count matrix.
Important limitation: the function bld_atac_mtx builds only one count matrix per call, whereas for methylation data you can build multiple data matrices at the same time.
epi.ct.bld_atac_mtx(list_bam_files=list_cells,
                    loaded_feat=enhancers,
                    output_file='test_ATAC_mtx.txt',
                    path=path_to_play_data,
                    writing_option='w',
                    header=enhancer_names)
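Conceptually, a count matrix has one row per cell and one column per feature, where each entry reflects how many reads from that cell overlap the feature. The toy sketch below illustrates that idea with plain Python tuples instead of BAM files. `count_reads_in_features` is a hypothetical helper, and whether the real function stores raw counts or binarized values is not something this sketch assumes.

```python
def count_reads_in_features(read_positions, features):
    """Count reads overlapping each feature.

    read_positions: list of (chrom, start, end) tuples for one cell.
    features: dict mapping chromosome -> list of [start, end] intervals.
    Returns one count per feature, ordered by sorted chromosome name.
    """
    counts = []
    for chrom in sorted(features):
        reads = [(s, e) for c, s, e in read_positions if c == chrom]
        for f_start, f_end in features[chrom]:
            # Half-open intervals overlap when each starts before the other ends.
            n = sum(1 for s, e in reads if s < f_end and e > f_start)
            counts.append(n)
    return counts

# Toy data: three features and two cells (all values are made up).
toy_features = {"chr1": [[10, 60], [100, 200]], "chr2": [[50, 80]]}
cells = {
    "cell_A": [("chr1", 20, 70), ("chr1", 150, 200), ("chr2", 0, 40)],
    "cell_B": [("chr2", 60, 110), ("chr2", 70, 120)],
}
toy_matrix = {cell: count_reads_in_features(reads, toy_features)
              for cell, reads in cells.items()}
```

Each value in `toy_matrix` is one row of the count matrix, with columns in the same order as the feature names.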
Convert the count matrix to a sparse matrix (AnnData object).
epi.ct.save_sparse_mtx(initial_matrix='test_ATAC_mtx.txt',
                       output_file='.h5ad',
                       path=path_to_play_data)
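Single-cell ATAC count matrices are mostly zeros, which is why a sparse representation is worthwhile. As a small illustration of the idea (not of how save_sparse_mtx works internally), the sketch below keeps only the non-zero entries of a dense matrix as (row, column, value) triples; `dense_rows_to_coo` is a hypothetical helper and the numbers are made up.

```python
def dense_rows_to_coo(rows):
    """Return (row, col, value) triples for the non-zero entries of a dense matrix."""
    return [(i, j, v)
            for i, row in enumerate(rows)
            for j, v in enumerate(row)
            if v != 0]

# Toy dense matrix: 3 cells x 3 features, mostly zeros.
dense = [[1, 1, 0],
         [0, 0, 2],
         [0, 0, 0]]
triples = dense_rows_to_coo(dense)

# Fraction of entries that actually need to be stored.
density = len(triples) / (len(dense) * len(dense[0]))
```

The lower the density, the more memory and disk space a sparse format saves compared with the dense text file.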