Overview¶

This notebook shows how to access and interact with 2-photon calcium imaging data collected as part of the Allen Institute's Visual Behavior 2P project.

You can learn more about this dataset and the behavioral task, and find other useful tools, on the Overview page and the Allen Brain Atlas.

Specifically, this notebook will show how to load neural data for all imaging planes in one 2-photon imaging session into a single 'tidy' dataframe, make simple event-triggered plots, and do some basic analysis using scikit-learn.

This is designed to demonstrate a simple method for interacting with the Visual Behavior 2P data. Many aspects of the dataset are not explored here.

Set up environment and import packages¶

We have built a package called brain_observatory_utilities, which contains some useful convenience functions. The AllenSDK is a dependency of this package and will be installed automatically when you install brain_observatory_utilities per the instructions below.

We will first install brain_observatory_utilities into our Colab environment by running the commands below. When this cell is complete, click the RESTART RUNTIME button that appears at the end of the output. Note that running this cell will produce a long list of outputs and some error messages. Clicking RESTART RUNTIME at the end will resolve these issues.

You can minimize the cell after you are done to hide the output.

In [ ]:

# @title Install packages
!pip install pip --upgrade --quiet
!pip install brain_observatory_utilities --upgrade --quiet
!pip install pandas --quiet
!pip install seaborn --quiet

Next we will import packages we need later in the notebook¶

In [ ]:

import os
import numpy as np
import pandas as pd
from tqdm import tqdm
import seaborn as sns
import matplotlib.pyplot as plt

from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.manifold import TSNE

import brain_observatory_utilities.datasets.optical_physiology.data_formatting as ophys_formatting
import brain_observatory_utilities.utilities.general_utilities as utilities

from allensdk.brain_observatory.behavior.behavior_project_cache import VisualBehaviorOphysProjectCache

pd.set_option('display.max_columns', 500)
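# The two lines below may be needed if you run into an error in the pandas query function.
# Otherwise, set engine='python' in the queries made throughout the book.
# (The original method is saved first so that the patched version does not call itself recursively.)
# _original_query = pd.DataFrame.query
# pd.DataFrame.query = lambda self, expr, **kwargs: _original_query(self, expr, **{**kwargs, 'engine': 'python'})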

Load the session and experiment summary tables¶

The AllenSDK provides functionality for downloading tables that describe all sessions and experiments (individual imaging planes) in the Visual Behavior 2P dataset. We first download the data cache:

In [ ]:

data_storage_directory = "./temp"  # Note: this path must exist on your local drive
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)

ophys_session_table.csv: 100%|██████████| 247k/247k [00:00<00:00, 2.32MMB/s] 
behavior_session_table.csv: 100%|██████████| 1.59M/1.59M [00:00<00:00, 3.54MMB/s]
ophys_experiment_table.csv: 100%|██████████| 657k/657k [00:00<00:00, 3.22MMB/s] 
ophys_cells_table.csv: 100%|██████████| 4.28M/4.28M [00:00<00:00, 6.32MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/brain_observatory/behavior/behavior_project_cache/behavior_project_cache.py:135: UpdatedStimulusPresentationTableWarning: 
	As of AllenSDK version 2.16.0, the latest Visual Behavior Ophys data has been significantly updated from previous releases. Specifically the user will need to update all processing of the stimulus_presentations tables. These tables now include multiple stimulus types delineated by the columns `stimulus_block` and `stimulus_block_name`.

The data that was available in previous releases are stored in the block name containing 'change_detection' and can be accessed in the pandas table by using: 
	`stimulus_presentations[stimulus_presentations.stimulus_block_name.str.contains('change_detection')]`
  warnings.warn(

Ophys_session_table contains metadata describing imaging sessions. If more than one plane was imaged during a session, one ophys session id will be associated with multiple ophys experiment ids. Each ophys session id will also have a unique behavior session id.
Behavior_session_table contains metadata describing behavioral sessions, which may or may not be during imaging. Behavior session ids that do not have ophys session ids were training sessions.
Ophys_experiment_table contains metadata describing imaging experiments (aka imaging planes). When mesoscope is used, one ophys session may contain up to 8 unique experiments (two visual areas by four imaging depths). Some imaging planes may not be released due to quality control issues, thus each ophys session id is associated with anywhere from one to eight unique experiment ids. Ophys experiment ids are unique and do not repeat across sessions. To find the same imaging plane that was matched across multiple sessions, use the ophys_container_id column that can be found in both ophys_session_table and ophys_experiment_table.

Then we can access the session and experiment tables directly.

Note that a 'session' is a single behavioral session. Sessions that are performed on the mesoscope will have multiple (up to 8) 'experiments' associated with them, where an experiment is a distinct imaging plane.

In [ ]:

session_table = cache.get_ophys_session_table()
experiment_table = cache.get_ophys_experiment_table()

We can then view the contents of the session table. Note that this contains a lot of useful metadata about each session. One of the columns, ophys_experiment_id, provides a list of the experiments (aka imaging planes) that are associated with each session.

In [ ]:

session_table.head()

Out[ ]:

	behavior_session_id	ophys_container_id	mouse_id	indicator	full_genotype	driver_line	cre_line	reporter_line	sex	age_in_days	imaging_plane_group_count	project_code	session_type	session_number	image_set	behavior_type	experience_level	prior_exposures_to_session_type	prior_exposures_to_image_set	prior_exposures_to_omissions	date_of_acquisition	equipment_name	num_depths_per_area	ophys_experiment_id	num_targeted_structures
ophys_session_id
951410079	951520319	[1018028339, 1018028342, 1018028345, 101802835...	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	4	VisualBehaviorMultiscope	OPHYS_1_images_A	1	images_A	active_behavior	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	4	[951980471, 951980473, 951980475, 951980479, 9...	2
952430817	952554548	[1018028339, 1018028345, 1018028354, 1018028357]	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	209	3	VisualBehaviorMultiscope	OPHYS_2_images_A_passive	2	images_A	passive_viewing	Familiar	0	66	1	2019-09-23 08:45:38.490000+00:00	MESO.1	4	[953659743, 953659745, 953659749, 953659752]	2
954954402	953982960	[1018028339, 1018028342, 1018028345, 101802835...	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	210	4	VisualBehaviorMultiscope	OPHYS_3_images_A	3	images_A	active_behavior	Familiar	0	67	2	2019-09-24 09:01:31.582000+00:00	MESO.1	4	[958527464, 958527471, 958527474, 958527479, 9...	2
955775716	956010809	[1018028339, 1018028342, 1018028345]	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	212	2	VisualBehaviorMultiscope	OPHYS_3_images_A	3	images_A	active_behavior	Familiar	1	68	3	2019-09-26 09:22:21.772000+00:00	MESO.1	4	[956941841, 956941844, 956941846]	2
957020350	957032492	[1018028339, 1018028342, 1018028345, 101802835...	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	213	4	VisualBehaviorMultiscope	OPHYS_4_images_B	4	images_B	active_behavior	Novel 1	0	0	4	2019-09-27 08:58:37.005000+00:00	MESO.1	4	[957759562, 957759564, 957759566, 957759570, 9...	2

The experiment table has one row per experiment. Note that the ophys_session_id column links each experiment to its associated session in the session_table.

In [ ]:

experiment_table.head()

Out[ ]:

	behavior_session_id	ophys_session_id	ophys_container_id	mouse_id	indicator	full_genotype	driver_line	cre_line	reporter_line	sex	age_in_days	imaging_depth	targeted_structure	targeted_imaging_depth	imaging_plane_group	project_code	session_type	session_number	image_set	behavior_type	passive	experience_level	prior_exposures_to_session_type	prior_exposures_to_image_set	prior_exposures_to_omissions	date_of_acquisition	equipment_name	published_at	isi_experiment_id	file_id
ophys_experiment_id
951980471	951520319	951410079	1018028342	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	150	VISp	150	0	VisualBehaviorMultiscope	OPHYS_1_images_A	1	A	active_behavior	False	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	2021-03-25	848974280	0
951980473	951520319	951410079	1018028345	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	225	VISp	225	0	VisualBehaviorMultiscope	OPHYS_1_images_A	1	A	active_behavior	False	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	2021-03-25	848974280	1
951980475	951520319	951410079	1018028339	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	75	VISp	75	1	VisualBehaviorMultiscope	OPHYS_1_images_A	1	A	active_behavior	False	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	2021-03-25	848974280	2
951980479	951520319	951410079	1018028354	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	150	VISl	150	2	VisualBehaviorMultiscope	OPHYS_1_images_A	1	A	active_behavior	False	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	2021-03-25	848974280	3
951980481	951520319	951410079	1018028357	457841	GCaMP6f	Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt	[Sst-IRES-Cre]	Sst-IRES-Cre	Ai148(TIT2L-GC6f-ICL-tTA2)	F	206	225	VISl	225	2	VisualBehaviorMultiscope	OPHYS_1_images_A	1	A	active_behavior	False	Familiar	0	65	0	2019-09-20 09:59:38.837000+00:00	MESO.1	2021-03-25	848974280	4
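The link between the two tables can be sketched on toy data. The block below is a minimal illustration, not the AllenSDK itself: the tables and all ID values are made up, and only the column names (ophys_session_id, ophys_experiment_id, ophys_container_id, imaging_depth) are taken from the tables shown above.

```python
import pandas as pd

# Toy stand-ins for the session and experiment tables (all IDs are invented).
toy_session_table = pd.DataFrame(
    {"ophys_experiment_id": [[11, 12, 13], [21, 22]]},
    index=pd.Index([1, 2], name="ophys_session_id"),
)
toy_experiment_table = pd.DataFrame(
    {
        "ophys_session_id": [1, 1, 1, 2, 2],
        "ophys_container_id": [100, 101, 102, 100, 101],
        "imaging_depth": [75, 150, 225, 75, 150],
    },
    index=pd.Index([11, 12, 13, 21, 22], name="ophys_experiment_id"),
)

# Route 1: expand the list-valued column so there is one row per experiment.
exploded = toy_session_table["ophys_experiment_id"].explode().astype(int)
experiments_per_session = exploded.groupby(level=0).size()

# Route 2: group the experiment table by the session ID it carries.
grouped = toy_experiment_table.groupby("ophys_session_id").groups

# The same imaging plane across sessions shares one ophys_container_id.
same_plane = toy_experiment_table[toy_experiment_table["ophys_container_id"] == 100]

print(experiments_per_session.to_dict())
print(sorted(same_plane.index))
```

Both routes should agree on which experiments belong to which session. Selecting on ophys_container_id is the step that follows one imaging plane across days.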

Load one example session¶

We are going to select one session from this table, session 854060305. This session is from an Sst-IRES-Cre mouse, which expresses GCaMP6f in Sst+ inhibitory interneurons. There were 6 simultaneously acquired imaging planes for this session. We can view metadata for this session as follows:

In [ ]:

ophys_session_id = 854060305
session_table.loc[ophys_session_id]

Out[ ]:

behavior_session_id                                                        854283407
ophys_container_id                 [1018028135, 1018028138, 1018028141, 101802814...
mouse_id                                                                      440631
indicator                                                                    GCaMP6f
full_genotype                          Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt
driver_line                                                           [Sst-IRES-Cre]
cre_line                                                                Sst-IRES-Cre
reporter_line                                             Ai148(TIT2L-GC6f-ICL-tTA2)
sex                                                                                M
age_in_days                                                                      129
imaging_plane_group_count                                                          3
project_code                                                VisualBehaviorMultiscope
session_type                                                        OPHYS_6_images_B
session_number                                                                     6
image_set                                                                   images_B
behavior_type                                                        active_behavior
experience_level                                                            Novel >1
prior_exposures_to_session_type                                                    0
prior_exposures_to_image_set                                                       2
prior_exposures_to_omissions                                                       6
date_of_acquisition                                 2019-04-19 09:21:45.638000+00:00
equipment_name                                                                MESO.1
num_depths_per_area                                                                4
ophys_experiment_id                [854759890, 854759894, 854759896, 854759898, 8...
num_targeted_structures                                                            2
Name: 854060305, dtype: object

Download all associated experiments¶

Each session consists of one or more 'experiments', where each experiment is a single imaging plane.

Each mesoscope session has up to 8 experiments associated with it. We will load all experiments for this session into a dictionary with the experiment IDs as the keys.

The first time that this cell is run, the associated NWB files will be downloaded to your local data_storage_directory. Subsequent runs of this cell will be faster since the data will already be cached locally.

In [ ]:

experiments = {}
ophys_experiment_ids = session_table.loc[ophys_session_id]['ophys_experiment_id']
for ophys_experiment_id in ophys_experiment_ids:
    experiments[ophys_experiment_id] = cache.get_behavior_ophys_experiment(ophys_experiment_id)

behavior_ophys_experiment_854759890.nwb: 100%|██████████| 232M/232M [00:08<00:00, 25.8MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)
behavior_ophys_experiment_854759894.nwb: 100%|██████████| 252M/252M [00:09<00:00, 25.7MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)
behavior_ophys_experiment_854759896.nwb: 100%|██████████| 232M/232M [00:07<00:00, 30.1MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)
behavior_ophys_experiment_854759898.nwb: 100%|██████████| 246M/246M [00:08<00:00, 28.7MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)
behavior_ophys_experiment_854759900.nwb: 100%|██████████| 243M/243M [00:08<00:00, 28.1MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)
behavior_ophys_experiment_854759903.nwb: 100%|██████████| 250M/250M [00:08<00:00, 30.0MMB/s]
/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded.
  return func(args[0], **pargs)

View the max projection and one cell ROI for one of the experiments¶

We can view the cell_specimen_table for one experiment, which contains information about each identified cell in that experiment

In [ ]:

experiment = experiments[ophys_experiment_ids[1]]
experiment.cell_specimen_table.head()

Out[ ]:

	cell_roi_id	height	mask_image_plane	max_correction_down	max_correction_left	max_correction_right	max_correction_up	valid_roi	width	x	y	roi_mask
cell_specimen_id
1086557083	1080855636	17	0	5.0	7.0	4.0	7.0	True	19	173	384	[[False, False, False, False, False, False, Fa...
1086557639	1080855643	16	0	5.0	7.0	4.0	7.0	True	14	363	447	[[False, False, False, False, False, False, Fa...
1086559064	1080855660	19	1	5.0	7.0	4.0	7.0	True	17	24	221	[[False, False, False, False, False, False, Fa...
1086558114	1080855673	18	0	5.0	7.0	4.0	7.0	True	13	74	305	[[False, False, False, False, False, False, Fa...
1086558224	1080855678	19	0	5.0	7.0	4.0	7.0	True	18	478	284	[[False, False, False, False, False, False, Fa...

We can then visualize the max projection and one of the identified ROIs

In [ ]:

fig, ax = plt.subplots(1, 2, figsize=(15, 8), sharex=True, sharey=True)
ax[0].imshow(experiment.max_projection, cmap='gray')
ax[0].set_title('max projection')

cell_specimen_id = experiment.cell_specimen_table.index[2]
ax[1].imshow(experiment.cell_specimen_table.loc[cell_specimen_id]['roi_mask'])
ax[1].set_title('ROI mask for cell_specimen_id = {}'.format(cell_specimen_id))
fig.show()
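Because each roi_mask is a 2D boolean array, basic ROI properties can be computed directly with NumPy. The sketch below uses a small made-up mask rather than real data. How these values relate to the x, y, width and height columns of cell_specimen_table is not verified here.

```python
import numpy as np

# Toy boolean mask standing in for one entry of the roi_mask column.
toy_mask = np.zeros((8, 10), dtype=bool)
toy_mask[2:5, 3:7] = True  # a 3 x 4 pixel rectangle

# Area is the number of True pixels.
area_pixels = int(toy_mask.sum())

# Row and column coordinates of every pixel inside the ROI.
rows, cols = np.nonzero(toy_mask)
centroid_row, centroid_col = rows.mean(), cols.mean()

# Bounding box extent in pixels.
roi_height = int(rows.max() - rows.min() + 1)
roi_width = int(cols.max() - cols.min() + 1)

print(area_pixels, (centroid_row, centroid_col), (roi_height, roi_width))
```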

Load neural data into memory¶

The cell below will load the neural data into memory in the pandas 'tidy' format by iterating over each of the 6 experiments and using some helpful tools from the brain_observatory_utilities package (the data_formatting module that was imported above as ophys_formatting).

It will also include a subset of metadata from ophys_experiment_table to facilitate splitting by depth, structure (aka cortical area), cre line (aka cell class), etc.

Note that 'tidy' data means that each row represents only one observation. Observations are stacked vertically. Thus, the values in the timestamps column will repeat for every cell in the dataset.
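To make the tidy layout concrete, here is a minimal sketch on made-up numbers. It converts a 'wide' table (one column per cell) into tidy form with pandas melt, then pivots it back. The cell IDs 101 and 102 and the dff values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Wide layout: one row per timestamp, one column per (made-up) cell ID.
timestamps = np.array([0.0, 0.1, 0.2])
wide = pd.DataFrame(
    {101: [0.1, 0.2, 0.3], 102: [1.0, 1.1, 1.2]},
    index=pd.Index(timestamps, name="timestamps"),
)

# Tidy layout: one row per (timestamp, cell) observation.
tidy = wide.reset_index().melt(
    id_vars="timestamps", var_name="cell_specimen_id", value_name="dff"
)

# The tidy table can be pivoted back to the wide layout when a matrix is needed.
back_to_wide = tidy.pivot(index="timestamps", columns="cell_specimen_id", values="dff")

print(tidy)
```

With 3 timestamps and 2 cells, the tidy table has 6 rows, and the three timestamps appear once per cell.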

In [ ]:

neural_data = []
for ophys_experiment_id in tqdm(experiments.keys()): #tqdm is a package that shows progress bars for items that are iterated over
    this_experiment = experiments[ophys_experiment_id]
    this_experiment_neural_data = ophys_formatting.build_tidy_cell_df(this_experiment)

    # add some columns with metadata for the experiment
    metadata_keys = [
      'ophys_experiment_id',
      'ophys_session_id',
      'targeted_structure',
      'imaging_depth',
      'equipment_name',
      'cre_line',
      'mouse_id',
      'sex',
    ]
    for metadata_key in metadata_keys:
        this_experiment_neural_data[metadata_key] = this_experiment.metadata[metadata_key]

    # append the data for this experiment to a list
    neural_data.append(this_experiment_neural_data)

# concatenate the list of dataframes into a single dataframe
neural_data = pd.concat(neural_data)

100%|██████████| 6/6 [00:00<00:00,  6.68it/s]

We can then look at some attributes of the neural_data dataframe we have created.

It is ~2.5 million rows long:

In [ ]:

len(neural_data)

Out[ ]:

It is so long because it has one row for each timestamp for each cell.

Below are the first 5 entries. Again, note that the tidy format means that each row has only one observation, which represents a single GCaMP6f fluorescence value for a single neuron.

In [ ]:

neural_data.head()

Out[ ]:

	timestamps	dff	cell_roi_id	cell_specimen_id	ophys_experiment_id	ophys_session_id	targeted_structure	imaging_depth	equipment_name	cre_line	mouse_id	sex
0	10.52216	0.400583	1080852071	1086550481	854759890	854060305	VISp	275	MESO.1	Sst-IRES-Cre	440631	M
1	10.61538	0.126125	1080852071	1086550481	854759890	854060305	VISp	275	MESO.1	Sst-IRES-Cre	440631	M
2	10.70860	-0.083087	1080852071	1086550481	854759890	854060305	VISp	275	MESO.1	Sst-IRES-Cre	440631	M
3	10.80182	0.158960	1080852071	1086550481	854759890	854060305	VISp	275	MESO.1	Sst-IRES-Cre	440631	M
4	10.89504	0.301507	1080852071	1086550481	854759890	854060305	VISp	275	MESO.1	Sst-IRES-Cre	440631	M

The cell_roi_id column contains unique roi ids for all cells in a given experiment, which do not repeat across ophys sessions.
The cell_specimen_id column contains unique ids for cells that were matched across ophys sessions. Thus, a cell that was imaged in more than one session has multiple roi ids but one cell specimen id.
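The difference between the two ID types can be illustrated on a toy table. The sketch below uses invented IDs and assumes a table that spans two sessions, which the single-session neural_data dataframe in this notebook does not.

```python
import pandas as pd

# Made-up table spanning two sessions: ROI IDs never repeat,
# but a matched cell keeps the same cell_specimen_id.
toy_cells = pd.DataFrame(
    {
        "ophys_session_id": [1, 1, 2, 2],
        "cell_roi_id": [9001, 9002, 9003, 9004],
        "cell_specimen_id": [501, 502, 501, 503],
    }
)

# Count the distinct sessions in which each cell specimen appears.
sessions_per_cell = toy_cells.groupby("cell_specimen_id")["ophys_session_id"].nunique()

# Cells seen in more than one session were matched across sessions.
matched_cells = sessions_per_cell[sessions_per_cell > 1].index.tolist()

print(matched_cells)
```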

Examine Cell IDs¶

We can get the unique Cell IDs in our dataset as follows:

In [ ]:

cell_ids = neural_data['cell_specimen_id'].unique()
print('there are {} unique cells'.format(len(cell_ids)))
print('cell ids are: {}'.format(cell_ids))

there are 53 unique cells
cell ids are: [1086550481 1086551114 1086551301 1086557083 1086557639 1086559064
 1086558114 1086558224 1086558510 1086559206 1086557304 1086557208
 1086560061 1086559681 1086559885 1086559968 1086557470 1086547796
 1086547993 1086548118 1086554566 1086556653 1086558574 1086552296
 1086558071 1086556532 1086555222 1086558701 1086557434 1086556317
 1086555835 1086549726 1086553836 1086551540 1086551151 1086550544
 1086552709 1086553271 1086553602 1086555553 1086548072 1086553899
 1086547630 1086549303 1086549491 1086549813 1086549949 1086548658
 1086548969 1086551457 1086551645 1086550990 1086551209]

If we wanted to get the timeseries for one cell, we could query the neural_data dataframe. For example, to get the full timeseries for the cell with cell_specimen_id = 1086557208:

In [ ]:

single_cell_timeseries = neural_data.query('cell_specimen_id == 1086557208')
single_cell_timeseries.head()

Out[ ]:

	timestamps	dff	cell_roi_id	cell_specimen_id	ophys_experiment_id	ophys_session_id	targeted_structure	imaging_depth	equipment_name	cre_line	mouse_id	sex
0	10.52216	0.218961	1080855724	1086557208	854759894	854060305	VISp	179	MESO.1	Sst-IRES-Cre	440631	M
1	10.61538	0.232865	1080855724	1086557208	854759894	854060305	VISp	179	MESO.1	Sst-IRES-Cre	440631	M
2	10.70860	-0.050186	1080855724	1086557208	854759894	854060305	VISp	179	MESO.1	Sst-IRES-Cre	440631	M
3	10.80182	0.239468	1080855724	1086557208	854759894	854060305	VISp	179	MESO.1	Sst-IRES-Cre	440631	M
4	10.89504	0.226356	1080855724	1086557208	854759894	854060305	VISp	179	MESO.1	Sst-IRES-Cre	440631	M

Each cell has three types of traces:

The dff column is the calcium fluorescence signal, normalized to baseline fluorescence.
The events column contains events deconvolved from the dff trace, which approximate the neural firing rate and remove the slow decay of the calcium signal (for more details, see the EVENT DETECTION section of the Visual Behavior whitepaper).
The filtered_events column is the events trace smoothed with a half-Gaussian kernel.
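As an illustration of what smoothing with a half-Gaussian kernel means, here is a minimal NumPy sketch. It is not the AllenSDK implementation: the function names, the kernel width (sigma_samples) and the truncation at four standard deviations are all assumptions chosen for this example.

```python
import numpy as np


def half_gaussian_kernel(sigma_samples, n_sigmas=4):
    """Causal half-Gaussian kernel (hypothetical helper), normalized to sum to 1."""
    t = np.arange(0, int(np.ceil(n_sigmas * sigma_samples)) + 1)
    kernel = np.exp(-0.5 * (t / sigma_samples) ** 2)
    return kernel / kernel.sum()


def smooth_causal(events, sigma_samples=2.0):
    """Smooth an event trace so each event only spreads forward in time."""
    kernel = half_gaussian_kernel(sigma_samples)
    # Truncating the 'full' convolution to the input length keeps the filter
    # causal: output[n] depends only on events[n], events[n - 1], ...
    return np.convolve(events, kernel, mode="full")[: len(events)]


# A single made-up event at sample 5.
toy_events = np.zeros(20)
toy_events[5] = 1.0
toy_filtered = smooth_causal(toy_events)

print(np.round(toy_filtered, 3))
```

Because only the falling half of the Gaussian is used, the smoothed trace is zero before the event and peaks at the event itself.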

We can then plot DeltaF/F for this cell for the full experiment as follows:

In [ ]:

fig, ax = plt.subplots(figsize=(15,5))
single_cell_timeseries.plot(
    x = 'timestamps',
    y = 'dff',
    ax = ax
)
fig.show()

Load stimulus data into memory¶

The stimulus table is shared across all experiments (imaging planes) in a session. We can therefore use the stimulus table for just one experiment.

We can then view the first 10 rows of the stimulus table.

In [ ]:

stimulus_table = experiments[ophys_experiment_ids[0]].stimulus_presentations
stimulus_table.head(10)

Out[ ]:

	stimulus_block	stimulus_block_name	image_index	image_name	movie_frame_index	duration	start_time	end_time	start_frame	end_frame	is_change	is_image_novel	omitted	movie_repeat	flashes_since_change	trials_id	stimulus_name	is_sham_change	active
stimulus_presentations_id
0	0	initial_gray_screen_5min	-99	NaN	-99	310.569786	0.000000	310.569786	0	17985	False	<NA>	<NA>	-99	0	-99	spontaneous	False	False
1	1	change_detection_behavior	0	im000	-99	0.250210	310.569786	310.819996	17985	18000	False	False	False	-99	1	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
2	1	change_detection_behavior	0	im000	-99	0.250200	311.320396	311.570596	18030	18045	False	False	False	-99	2	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
3	1	change_detection_behavior	0	im000	-99	0.250170	312.071016	312.321186	18075	18090	False	False	False	-99	3	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
4	1	change_detection_behavior	0	im000	-99	0.250190	312.821616	313.071806	18120	18135	False	False	False	-99	4	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
5	1	change_detection_behavior	0	im000	-99	0.250210	313.572196	313.822406	18165	18180	False	False	False	-99	5	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
6	1	change_detection_behavior	0	im000	-99	0.250230	314.322816	314.573046	18210	18225	False	False	False	-99	6	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
7	1	change_detection_behavior	0	im000	-99	0.250210	315.073456	315.323666	18255	18270	False	False	False	-99	7	2	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
8	1	change_detection_behavior	0	im000	-99	0.250150	315.824126	316.074276	18300	18315	False	False	False	-99	8	2	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
9	1	change_detection_behavior	0	im000	-99	0.250220	316.574676	316.824896	18345	18360	False	False	False	-99	9	2	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True

This table provides helpful information such as the image name, the start time, duration and end time of each image presentation, and whether the image was omitted. The stimulus_block and stimulus_block_name columns indicate the type of stimulus that mice were presented with at a given point in a session. To select the active change detection behavior, we first need to filter the table for the change_detection_behavior block (block 1 in this session). Note that sessions may have different numbers of stimulus blocks, so change_detection_behavior may be associated with either 0 or 1 in the stimulus_block column. Filtering by block name is therefore safer than filtering by block number.

In [ ]:

stimulus_table.stimulus_block_name.unique()

Out[ ]:

array(['initial_gray_screen_5min', 'change_detection_behavior',
       'post_behavior_gray_screen_5min', 'natural_movie_one'],
      dtype=object)

In [ ]:

stimulus_table = stimulus_table[stimulus_table.stimulus_block_name=='change_detection_behavior']
stimulus_table.reset_index(drop=True, inplace=True) # resetting index starts df at stimulus 0
# give index a name
stimulus_table.index.name = 'stimulus_presentations_id'
stimulus_table.head(5)

Out[ ]:

	stimulus_block	stimulus_block_name	image_index	image_name	movie_frame_index	duration	start_time	end_time	start_frame	end_frame	is_change	is_image_novel	omitted	movie_repeat	flashes_since_change	trials_id	stimulus_name	is_sham_change	active
stimulus_presentations_id
0	1	change_detection_behavior	0	im000	-99	0.25021	310.569786	310.819996	17985	18000	False	False	False	-99	1	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
1	1	change_detection_behavior	0	im000	-99	0.25020	311.320396	311.570596	18030	18045	False	False	False	-99	2	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
2	1	change_detection_behavior	0	im000	-99	0.25017	312.071016	312.321186	18075	18090	False	False	False	-99	3	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
3	1	change_detection_behavior	0	im000	-99	0.25019	312.821616	313.071806	18120	18135	False	False	False	-99	4	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
4	1	change_detection_behavior	0	im000	-99	0.25021	313.572196	313.822406	18165	18180	False	False	False	-99	5	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
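The AllenSDK warning shown earlier suggests selecting blocks whose name contains 'change_detection'. The sketch below applies that substring filter to a toy table. The block names are the ones listed above, while the block numbers and start times are made up.

```python
import pandas as pd

# Toy stimulus table; only the block names come from the real output above.
toy_stimulus_table = pd.DataFrame(
    {
        "stimulus_block": [0, 1, 1, 2, 3],
        "stimulus_block_name": [
            "initial_gray_screen_5min",
            "change_detection_behavior",
            "change_detection_behavior",
            "post_behavior_gray_screen_5min",
            "natural_movie_one",
        ],
        "start_time": [0.0, 310.5, 311.3, 3900.0, 4200.0],
    }
)

# Select by name rather than number, since block numbering can differ between sessions.
is_change_detection = toy_stimulus_table["stimulus_block_name"].str.contains("change_detection")
change_detection = toy_stimulus_table[is_change_detection].reset_index(drop=True)
change_detection.index.name = "stimulus_presentations_id"

# Recover which block number(s) the name corresponds to in this table.
block_numbers = change_detection["stimulus_block"].unique().tolist()

print(change_detection)
```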

View the `stimulus_templates` attribute¶

Note that the unwarped column contains the image before the application of a spherical warp. All of the pixels labeled 'NaN' will be off-screen (not visible to the mouse) after the warp is applied.

All experiments in a given session will share the same stimulus_templates

In [ ]:

experiment = experiments[ophys_experiment_ids[0]]
experiment.stimulus_templates

Out[ ]:

	unwarped	warped
image_name
im000	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[122, 122, 123, 125, 126, 127, 128, 129, 130,...
im106	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[108, 109, 106, 103, 102, 104, 107, 112, 117,...
im075	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[120, 121, 121, 121, 122, 123, 123, 122, 121,...
im073	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[120, 120, 118, 116, 116, 119, 121, 120, 117,...
im045	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[10, 13, 6, 0, 0, 8, 15, 13, 6, 2, 4, 9, 12, ...
im054	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[124, 125, 127, 130, 133, 134, 136, 138, 140,...
im031	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[233, 234, 244, 253, 253, 244, 237, 239, 246,...
im035	[[nan, nan, nan, nan, nan, nan, nan, nan, nan,...	[[178, 181, 189, 198, 200, 198, 196, 199, 205,...
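Since off-screen pixels are stored as NaN in the unwarped images, NaN-aware NumPy functions are needed when summarizing them. The following is a small sketch on a made-up array, not a real stimulus template.

```python
import numpy as np

# Toy 'unwarped' image: NaN marks pixels that end up off-screen.
toy_unwarped = np.full((4, 6), np.nan)
toy_unwarped[1:3, 1:5] = [[10, 20, 30, 40], [50, 60, 70, 80]]

# Boolean map and fraction of off-screen pixels.
off_screen = np.isnan(toy_unwarped)
fraction_off_screen = off_screen.mean()

# np.mean would return NaN here; np.nanmean ignores the off-screen pixels.
mean_visible = np.nanmean(toy_unwarped)

print(fraction_off_screen, mean_visible)
```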

View the unwarped images¶

In [ ]:

fig, ax = plt.subplots(2, 4, figsize=(20, 8), sharex=True, sharey=True)
for ii, image_name in enumerate(experiment.stimulus_templates.index):
    ax.flatten()[ii].imshow(experiment.stimulus_templates.loc[image_name]['unwarped'], cmap='gray')
    ax.flatten()[ii].set_title(image_name)
fig.tight_layout()
fig.show()

View the warped images¶

This represents what was actually on the screen during the session

In [ ]:

fig, ax = plt.subplots(2, 4, figsize=(20, 8), sharex=True, sharey=True)
for ii, image_name in enumerate(experiment.stimulus_templates.index):
    ax.flatten()[ii].imshow(experiment.stimulus_templates.loc[image_name]['warped'], cmap='gray')
    ax.flatten()[ii].set_title(image_name)
fig.tight_layout()
fig.show()

Describe stimulus omissions¶

An important feature of the task is that stimuli are shown at a very regular cadence (250 ms on, 500 ms off), but stimuli are randomly omitted with a probability of ~5%. These unexpected and random stimulus omissions could be perceived as an expectation violation by the mouse.

Omitted stimuli are denoted in the stimulus_table by the omitted column. True means that the stimulus that would have been shown at that time was actually omitted (and was replaced by an extended gray screen between stimuli).

We can look at the first 10 examples of omitted stimuli as follows. Note that each 'omitted' stimulus still has a 'start_time' and an 'end_time' associated with it. These represent the time at which a stimulus would have been shown, had it not been omitted.

Stimulus omissions are also indicated in the image_name column by the string omitted.

In [ ]:

stimulus_table.query('omitted', engine='python').head(10)

Out[ ]:

	stimulus_block	stimulus_block_name	image_index	image_name	movie_frame_index	duration	start_time	end_time	start_frame	end_frame	is_change	is_image_novel	omitted	movie_repeat	flashes_since_change	trials_id	stimulus_name	is_sham_change	active
stimulus_presentations_id
61	1	change_detection_behavior	8	omitted	-99	0.25	356.373816	356.623816	20731	20746	False	<NA>	True	-99	2	7	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
105	1	change_detection_behavior	8	omitted	-99	0.25	389.400806	389.650806	22711	22726	False	<NA>	True	-99	0	12	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
113	1	change_detection_behavior	8	omitted	-99	0.25	395.405686	395.655686	23071	23086	False	<NA>	True	-99	7	13	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
128	1	change_detection_behavior	8	omitted	-99	0.25	406.664926	406.914926	23746	23761	False	<NA>	True	-99	21	15	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
143	1	change_detection_behavior	8	omitted	-99	0.25	417.940786	418.190786	24422	24437	False	<NA>	True	-99	35	18	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
172	1	change_detection_behavior	8	omitted	-99	0.25	439.708536	439.958536	25727	25742	False	<NA>	True	-99	8	21	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
174	1	change_detection_behavior	8	omitted	-99	0.25	441.209796	441.459796	25817	25832	False	<NA>	True	-99	9	21	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
243	1	change_detection_behavior	8	omitted	-99	0.25	493.018766	493.268766	28923	28938	False	<NA>	True	-99	0	28	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
254	1	change_detection_behavior	8	omitted	-99	0.25	501.275466	501.525466	29418	29433	False	<NA>	True	-99	0	29	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
267	1	change_detection_behavior	8	omitted	-99	0.25	511.033476	511.283476	30003	30018	False	<NA>	True	-99	12	32	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
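The cadence and omission rate described above can be checked with a few pandas operations. The sketch below runs on simulated data. The 0.75 s onset interval follows from 250 ms on plus 500 ms off, but the random draws, the seed and the number of presentations are assumptions, so the numbers are not results from the dataset.

```python
import numpy as np
import pandas as pd

# Simulated stimulus table: regular 0.75 s onsets, ~5% random omissions.
rng = np.random.default_rng(0)
n_presentations = 2000
toy_stimuli = pd.DataFrame(
    {
        "start_time": 310.0 + 0.75 * np.arange(n_presentations),
        "omitted": rng.random(n_presentations) < 0.05,
    }
)
toy_stimuli["end_time"] = toy_stimuli["start_time"] + 0.25

# Fraction of presentations that were omitted.
omission_fraction = toy_stimuli["omitted"].mean()

# Interval between consecutive stimulus onsets.
onset_intervals = toy_stimuli["start_time"].diff().dropna()

# Times at which an omitted stimulus would have been shown.
omission_times = toy_stimuli.loc[toy_stimuli["omitted"], "start_time"]

print(round(omission_fraction, 3), onset_intervals.round(3).unique())
```

The resulting omission_times series plays the same role as the event_times argument used in the next section.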

Create an event triggered response dataframe relative to omissions¶

If we want to see how a given cell responds when regularly flashed stimuli are omitted, we can calculate the response around each of the stimulus omissions. The brain_observatory_utilities package has a convenience function to do this, in the module we imported as utilities. We give the function:

a dataframe of interest (containing activity from one cell)
the t and y values of interest
the event times
how much time before and after each event we are interested in
the desired sampling rate of the output - this is the rate onto which the response will be interpolated

The function will return a new dataframe with the response for the given cell, aligned to each of the events.

In [ ]:

cell_id = cell_ids[11]
etr = utilities.event_triggered_response(
    data=neural_data.query('cell_specimen_id == @cell_id'),
    t='timestamps',
    y='dff',
    event_times=stimulus_table.query('omitted', engine='python')['start_time'],
    t_before=3,
    t_after=3,
    output_sampling_rate=50,
)
etr

Out[ ]:

	time	dff	event_number	stimulus_presentations_id	event_time
0	-3.00	0.043293	0	61	356.373816
1	-2.98	0.036577	0	61	356.373816
2	-2.96	0.021164	0	61	356.373816
3	-2.94	0.005752	0	61	356.373816
4	-2.92	-0.009661	0	61	356.373816
...	...	...	...	...	...
55680	2.92	0.001923	184	4796	3911.326506
55681	2.94	0.001321	184	4796	3911.326506
55682	2.96	0.001321	184	4796	3911.326506
55683	2.98	0.001321	184	4796	3911.326506
55684	3.00	0.001321	184	4796	3911.326506

55685 rows × 5 columns

We can see that the output has columns for

time - this is our new timebase relative to the events. In this case, it ranges from -3 to 3
dff - this is the deltaF/F value surrounding each event, interpolated onto the new timebase. If, when calling the event_triggered_response function we had passed y = 'events', this column would be events instead of dff.
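event_number - this is an integer representing the count of each event. In this example, there were 185 omissions, so they are numbered from 0 to 184
stimulus_presentations_id - this is the index of the stimulus table row for each event. It is preserved in the output because we passed event_times as a Pandas Series
event_time - this is the time of each event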
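
To build intuition for what event_triggered_response does, here is a minimal sketch of the same idea using only NumPy and Pandas. It is not the library's implementation: simple_event_triggered_response is a hypothetical helper, and the signal and event times are synthetic. For each event, it builds a relative timebase with np.linspace and linearly interpolates the signal onto it with np.interp.

```python
import numpy as np
import pandas as pd


def simple_event_triggered_response(t, y, event_times, t_before, t_after, sampling_rate):
    """Hypothetical helper: interpolate y onto a common timebase around each event."""
    # common timebase relative to each event, endpoint included
    n_samples = int(round((t_before + t_after) * sampling_rate)) + 1
    rel_time = np.linspace(-t_before, t_after, n_samples)
    rows = []
    for event_number, event_time in enumerate(event_times):
        # linear interpolation of the signal at the absolute times of interest
        y_interp = np.interp(event_time + rel_time, t, y)
        rows.append(pd.DataFrame({
            'time': rel_time,
            'y': y_interp,
            'event_number': event_number,
            'event_time': event_time,
        }))
    return pd.concat(rows, ignore_index=True)


# synthetic signal sampled at roughly 11 Hz, with a period of 10 s
t = np.arange(0, 100, 0.09)
y = np.sin(2 * np.pi * t / 10)
toy_etr = simple_event_triggered_response(
    t, y, event_times=[20, 30, 40], t_before=3, t_after=3, sampling_rate=50
)
```

With t_before=3, t_after=3 and a 50 Hz output rate, each event contributes 301 samples. This is consistent with the 55685 rows (185 events x 301 samples) in the output above.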

The docstring for the event_triggered_response function can be viewed as follows:

In [ ]:

help(utilities.event_triggered_response)

Help on function event_triggered_response in module brain_observatory_utilities.utilities.general_utilities:

event_triggered_response(data, t, y, event_times, t_start=None, t_end=None, t_before=None, t_after=None, output_sampling_rate=None, include_endpoint=True, output_format='tidy', interpolate=True)
    Slices a timeseries relative to a given set of event times
    to build an event-triggered response.
    
    For example, If we have data such as a measurement of neural activity
    over time and specific events in time that we want to align
    the neural activity to, this function will extract segments of the neural
    timeseries in a specified time window around each event.
    
    The times of the events need not align with the measured
    times of the neural data.
    Relative times will be calculated by linear interpolation.
    
    Parameters:
    -----------
    data: Pandas.DataFrame
        Input dataframe in tidy format
        Each row should be one observation
        Must contains columns representing `t` and `y` (see below)
    t : string
        Name of column in data to use as time data
    y : string
        Name of column to use as y data
    event_times: Panda.Series, numpy array or list of floats
        Times of events of interest. If pd.Series, the original index and index name will be preserved in the output
        Values in column specified by `y` will be sliced and interpolated
            relative to these times
    t_start : float
        start time relative to each event for desired time window
        e.g.:   t_start = -1 would start the window 1 second before each
                t_start = 1 would start the window 1 second after each event
        Note: cannot pass both t_start and t_before
    t_before : float
        time before each of event of interest to include in each slice
        e.g.:   t_before = 1 would start the window 1 second before each event
                t_before = -1 would start the window 1 second after each event
        Note: cannot pass both t_start and t_before
    t_end : float
        end time relative to each event for desired time window
        e.g.:   t_end = 1 would end the window 1 second after each event
                t_end = -1 would end the window 1 second before each event
        Note: cannot pass both t_end and t_after
    t_after : float
        time after each event of interest to include in each slice
        e.g.:   t_after = 1 would start the window 1 second after each event
                t_after = -1 would start the window 1 second before each event
        Note: cannot pass both t_end and t_after
    output_sampling_rate : float
        Desired sampling of output.
        Input data will be interpolated to this sampling rate if interpolate = True (default). # NOQA E501
        If passing interpolate = False, the sampling rate of the input timeseries will # NOQA E501
        be used and output_sampling_rate should not be specified.
    include_endpoint : Boolean
        Passed to np.linspace to calculate relative time
        If True, stop is the last sample. Otherwise, it is not included.
            Default is True
    output_format : string
        'wide' or 'tidy' (default = 'tidy')
        if 'tidy'
            One column representing time
            One column representing event_number
            One column representing event_time
            One row per observation (# rows = len(time) x len(event_times))
        if 'wide', output format will be:
            time as indices
            One row per interpolated timepoint
            One column per event,
                with column names titled event_{EVENT NUMBER}_t={EVENT TIME}
    interpolate : Boolean
        if True (default), interpolates each response onto a common timebase
        if False, shifts each response to align indices to a common timebase
    
    Returns:
    --------
    Pandas.DataFrame
        See description in `output_format` section above
    
    Examples:
    ---------
    An example use case, recover a sinousoid from noise:
    
    First, define a time vector
    >>> t = np.arange(-10,110,0.001)
    
    Now build a dataframe with one column for time,
    and another column that is a noise-corrupted sinuosoid with period of 1
    >>> data = pd.DataFrame({
            'time': t,
            'noisy_sinusoid': np.sin(2*np.pi*t) + np.random.randn(len(t))*3
        })
    
    Now use the event_triggered_response function to get a tidy
    dataframe of the signal around every event
    
    Events will simply be generated as every 1 second interval
    starting at 0, since our period here is 1
    >>> etr = event_triggered_response(
            data,
            x = 'time',
            y = 'noisy_sinusoid',
            event_times = np.arange(100),
            t_start = -1,
            t_end = 1,
            output_sampling_rate = 100
        )
    Then use seaborn to view the result
    We're able to recover the sinusoid through averaging
    >>> import matplotlib.pyplot as plt
    >>> import seaborn as sns
    >>> fig, ax = plt.subplots()
    >>> sns.lineplot(
            data = etr,
            x='time',
            y='noisy_sinusoid',
            ax=ax
        )

Plot an event triggered response¶

The output format of the event_triggered_response function is designed to plug directly into Seaborn's lineplot plotting function. We can then view the mean response to omitted stimuli with 95% confidence intervals very easily:

In [ ]:

sns.lineplot(
    data=etr,
    x='time',
    y='dff',
    n_boot=500
)

Out[ ]:

<Axes: xlabel='time', ylabel='dff'>

Note that the regular, image-driven responses with a 750 ms inter-stimulus interval are visible everywhere except at t=0, which is when the unexpectedly omitted stimulus occurred.
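
sns.lineplot aggregates all rows that share the same x value, then draws their mean and a bootstrapped confidence interval. As a rough sketch of that aggregation made explicit, the cell below computes the mean and standard error at each timepoint with a Pandas groupby. The toy_etr dataframe is synthetic and only mimics the layout of etr. The standard error is a simpler stand-in for seaborn's bootstrapped interval, not the same quantity.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
time = np.round(np.linspace(-3, 3, 301), 2)

# synthetic tidy dataframe standing in for `etr`: 20 events, 301 samples each
toy_etr = pd.concat(
    [
        pd.DataFrame({
            'time': time,
            'dff': np.cos(2 * np.pi * time / 0.75) + rng.normal(0, 1, time.size),
            'event_number': i,
        })
        for i in range(20)
    ],
    ignore_index=True,
)

# one row per timepoint: mean across events, standard error, and number of events
summary = toy_etr.groupby('time')['dff'].agg(['mean', 'sem', 'count'])
```

The same groupby applied to etr would give the trace that the plot above draws as its central line.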

Make a function to plot an event triggered average in one line¶

If we make a wrapper function that combines calculating and plotting the event triggered response, we can do both in a single line below. Because the function takes an event_query argument, we can use it to plot responses to any event of interest (omissions, changes, hits/misses, specific images, etc.).

In [ ]:

def make_event_triggered_plot(df, x, y, event_query, ax, t_before=3, t_after=3):
    etr = utilities.event_triggered_response(
      data=df,
      t='timestamps',
      y=y,
      event_times=stimulus_table.query(event_query, engine='python')['start_time'],
      t_before=t_before,
      t_after=t_after,
      output_sampling_rate=50,
      )
    sns.lineplot(
      data=etr,
      x=x,
      y=y,
      n_boot=500,
      ax=ax
      )

Now plot the omission triggered response for the same cell using filtered events instead of dff. Filtered events are extracted from the deltaF/F timeseries using an event extraction algorithm, then smoothed with a half-gaussian kernel.

In [ ]:

cell_id = cell_ids[11]
fig, ax = plt.subplots()
make_event_triggered_plot(
    df=neural_data.query('cell_specimen_id == @cell_id'),
    x='time',
    y='filtered_events',
    event_query='omitted',
    ax=ax
)
fig.show()
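
The half-gaussian smoothing mentioned above can be illustrated with a short sketch. This is not the code used to produce the filtered_events column: half_gaussian_filter is a hypothetical helper, and the kernel width (sigma_samples) and its normalization are assumptions chosen for illustration. The point is that a one-sided kernel only spreads each event forward in time, so the smoothed trace never rises before the event that caused it.

```python
import numpy as np


def half_gaussian_filter(x, sigma_samples=2.0):
    """Hypothetical helper: causal smoothing with a one-sided Gaussian kernel."""
    # kernel covers lags 0 .. 4*sigma, so only present and past samples contribute
    lags = np.arange(0, int(np.ceil(4 * sigma_samples)) + 1)
    kernel = np.exp(-0.5 * (lags / sigma_samples) ** 2)
    kernel /= kernel.sum()
    # full convolution truncated to the input length keeps the filter causal
    return np.convolve(x, kernel, mode='full')[:len(x)]


# a single detected event at sample 10
events = np.zeros(50)
events[10] = 1.0
smoothed = half_gaussian_filter(events)
```

Here smoothed is zero before sample 10, peaks at sample 10, and decays over the following samples.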

Plot the responses for 10 sample cells¶

We can then iterate over 10 randomly chosen cells and plot their activity during omissions.

In [ ]:

np.random.seed(0)
fig, ax = plt.subplots()
for cell_id in tqdm(np.random.choice(cell_ids, size=10, replace=False)):
    make_event_triggered_plot(
      df=neural_data.query('cell_specimen_id == @cell_id'),
      x='time',
      y='dff',
      event_query='omitted',
      ax=ax
      )
fig.show()

100%|██████████| 10/10 [00:29<00:00,  2.96s/it]

Interestingly, not all Sst cells in this session do the same thing!

Calculate the mean response for each of the individual imaging planes in this experiment¶

By iterating over experiment IDs, we can also calculate the mean response for each of the 6 imaging planes. Do Sst cells in different visual areas respond to omissions in a distinct way?

We will first use Pandas groupby and mean operations to get the mean timeseries across all cells in each imaging plane:

In [ ]:

mean_dff_by_experiment = (
    neural_data
    .groupby(['ophys_experiment_id','timestamps'])['dff']
    .mean()
    .reset_index()
    )

In [ ]:

mean_dff_by_experiment.head()

Out[ ]:

	ophys_experiment_id	timestamps	dff
0	854759890	10.52216	0.387612
1	854759890	10.61538	0.203569
2	854759890	10.70860	0.035257
3	854759890	10.80182	0.357586
4	854759890	10.89504	0.146397

We can then iterate over our 6 experiment IDs and use our make_event_triggered_plot wrapper function to calculate and plot the omission triggered response for that imaging plane:

In [ ]:

# set up a new figure and axis
fig, ax = plt.subplots()

# make an empty list that we will fill with strings for the legend
legend_text = []

# iterate over every `ophys_experiment_id`
for ophys_experiment_id in tqdm(ophys_experiment_ids):
    make_event_triggered_plot(
      df=mean_dff_by_experiment.query('ophys_experiment_id == @ophys_experiment_id'),
      x='time',
      y='dff',
      event_query='omitted',
      ax=ax
      )

    # get some metadata to add to the legend
    this_exp = neural_data.query('ophys_experiment_id == @ophys_experiment_id')
    structure = this_exp['targeted_structure'].iloc[0]
    depth = this_exp['imaging_depth'].iloc[0]
    # append a string to our list of legend text
    legend_text.append('structure = {}\ndepth = {} um'.format(structure, depth))

# Put the legend out of the figure
plt.legend(legend_text, bbox_to_anchor=(1.05, 1))
fig.show()

100%|██████████| 6/6 [00:17<00:00,  2.99s/it]

There are clearly some large differences in the way that Sst cells respond to these unexpected stimulus omissions by area.

This example could be extended to include cells from the other two Cre lines in the dataset: the Vip-IRES-Cre line, which labels VIP+ inhibitory interneurons, and the Slc17a7-IRES2-Cre line, which is a pan-excitatory line.

In [ ]:

session_table['cre_line'].unique()

Out[ ]:

array(['Sst-IRES-Cre', 'Vip-IRES-Cre', 'Slc17a7-IRES2-Cre'], dtype=object)

In addition, responses to different stimuli could be explored, along with responses relative to other behavioral measures, such as licking.

For a full description of the dataset and all available data streams, see the Visual Behavior Project Description at: https://portal.brain-map.org/explore/circuits/visual-behavior-2p

Set up data for scikit learn¶

What if we wanted to use scikit-learn for a decoding or clustering analysis? We'd need to get the data into a standard format for scikit-learn, which is often a feature matrix (X) and a vector of labels (y).

Instead of just omissions, let's now look at the responses to each of the stimuli in this session, which consists of 8 unique images, plus the omitted stimuli (which we characterize as a unique stimulus type). First, we will calculate an event triggered response to each stimulus start time in the stimulus table.

In [ ]:

full_etr_l = []
# iterate over each unique cell
for cell_specimen_id in tqdm(neural_data['cell_specimen_id'].unique()):
  # calculate the event triggered response for this cell to every stimulus
  full_etr_this_cell = utilities.event_triggered_response(
      neural_data.query('cell_specimen_id == @cell_specimen_id'),
      t='timestamps',
      y='dff',
      event_times=stimulus_table['start_time'],
      t_before=0,
      t_after=0.75,
      output_sampling_rate=30
  )
  # add a column identifying the cell_specimen_id
  full_etr_this_cell['cell_specimen_id'] = cell_specimen_id
  # append to our list
  full_etr_l.append(full_etr_this_cell)

# concatenate our list of dataframes into a single dataframe
full_etr = pd.concat(full_etr_l)

# cast these numeric columns to int and float, respectively
full_etr['event_number'] = full_etr['event_number'].astype(int)
full_etr['event_time'] = full_etr['event_time'].astype(float)

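# optional: rename 'event_number' as 'stimulus_presentations_id'
# (not needed here, because that column is already carried over from the index of `event_times`)
# full_etr.rename(columns={'event_number': 'stimulus_presentations_id'}, inplace=True)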

100%|██████████| 53/53 [00:55<00:00,  1.05s/it]

One way to construct a feature matrix might be to build it such that dimensions are trials x cells. Thus:

Each row would be one trial, where a trial is defined as a unique image presentation
Each column would represent the average response of a given cell on that image presentation.

To do so, let's construct another intermediate dataframe called average_responses that contains the average response of each cell (in the 750 ms window we've selected above) to each image presentation. We'll do this using a Pandas groupby to group by cell_specimen_id and stimulus_presentations_id (aka trial).

We're also going to merge in our stimulus metadata.

In [ ]:

full_etr['event_number'] = full_etr['event_number'].astype(int)
full_etr['event_time'] = full_etr['event_time'].astype(float)

In [ ]:

average_responses = full_etr.groupby(['cell_specimen_id', 'stimulus_presentations_id'])[['dff']].mean().reset_index().merge(
    stimulus_table,
    on='stimulus_presentations_id',
    how='left'
)
average_responses

Out[ ]:

	cell_specimen_id	stimulus_presentations_id	dff	stimulus_block	stimulus_block_name	image_index	image_name	movie_frame_index	duration	start_time	end_time	start_frame	end_frame	is_change	is_image_novel	omitted	movie_repeat	flashes_since_change	trials_id	stimulus_name	is_sham_change	active
0	1086547630	0	-0.261392	1	change_detection_behavior	0	im000	-99	0.25021	310.569786	310.819996	17985	18000	False	False	False	-99	1	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
1	1086547630	1	0.354413	1	change_detection_behavior	0	im000	-99	0.25020	311.320396	311.570596	18030	18045	False	False	False	-99	2	0	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
2	1086547630	2	0.374132	1	change_detection_behavior	0	im000	-99	0.25017	312.071016	312.321186	18075	18090	False	False	False	-99	3	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
3	1086547630	3	0.171823	1	change_detection_behavior	0	im000	-99	0.25019	312.821616	313.071806	18120	18135	False	False	False	-99	4	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
4	1086547630	4	-0.065775	1	change_detection_behavior	0	im000	-99	0.25021	313.572196	313.822406	18165	18180	False	False	False	-99	5	1	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
254607	1086560061	4799	-0.007224	1	change_detection_behavior	4	im045	-99	0.25022	3913.578346	3913.828566	233989	234004	False	False	False	-99	1	430	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
254608	1086560061	4800	-0.011908	1	change_detection_behavior	4	im045	-99	0.25018	3914.328936	3914.579116	234034	234049	False	False	False	-99	2	430	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
254609	1086560061	4801	0.013366	1	change_detection_behavior	4	im045	-99	0.25019	3915.079546	3915.329736	234079	234094	False	False	False	-99	3	430	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
254610	1086560061	4802	-0.022811	1	change_detection_behavior	4	im045	-99	0.25023	3915.830156	3916.080386	234124	234139	False	False	False	-99	4	430	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True
254611	1086560061	4803	0.029136	1	change_detection_behavior	4	im045	-99	0.25025	3916.580756	3916.831006	234169	234184	False	False	False	-99	5	430	Natural_Images_Lum_Matched_set_ophys_6_2017	False	True

254612 rows × 22 columns

Now we can construct a dataframe called features_and_labels that will contain one row per trial, one column per cell, plus columns with the image_index and image_name.

In [ ]:

features_and_labels = average_responses.pivot(
    index='stimulus_presentations_id',
    columns='cell_specimen_id',
    values='dff'
).merge(
    stimulus_table[['image_index','image_name']],
    on='stimulus_presentations_id',
    how='left'
)
features_and_labels.sample(10)

Out[ ]:

	1086547630	1086547796	1086547993	1086548072	1086548118	1086548658	1086548969	1086549303	1086549491	1086549726	1086549813	1086549949	1086550481	1086550544	1086550990	1086551114	1086551151	1086551209	1086551301	1086551457	1086551540	1086551645	1086552296	1086552709	1086553271	1086553602	1086553836	1086553899	1086554566	1086555222	1086555553	1086555835	1086556317	1086556532	1086556653	1086557083	1086557208	1086557304	1086557434	1086557470	1086557639	1086558071	1086558114	1086558224	1086558510	1086558574	1086558701	1086559064	1086559206	1086559681	1086559885	1086559968	1086560061	image_index	image_name
stimulus_presentations_id
2802	0.188744	-0.008572	-0.011187	0.068998	0.004372	0.040266	0.000542	0.107748	0.027901	0.017652	-0.008583	-0.001325	0.221544	-0.003638	-0.011980	-0.032854	0.025846	0.001994	0.114461	0.055439	6.771504	0.000572	0.028877	1.897729	0.098143	0.578993	-0.010733	0.018567	-0.029723	0.035030	0.067603	-0.011919	-0.004589	-0.044386	0.134312	0.001550	0.035612	0.003460	0.033602	0.155450	0.029361	0.007124	-0.021153	0.013226	0.096590	0.010745	0.081815	0.342996	-0.034027	0.040853	-0.025462	-0.015748	0.045814	4	im045
873	0.423262	0.195502	0.131998	-0.019266	0.038702	0.034431	0.002327	-0.057758	0.037060	0.006215	0.091679	0.025424	0.053717	0.020017	0.047703	0.334743	0.044308	0.033760	0.012921	0.029380	0.918336	-0.038116	-0.028756	-0.025687	0.064217	-0.004619	0.690495	-0.042575	-0.008669	-0.019768	-0.012421	0.012025	0.069876	0.026988	-0.100992	-0.002765	-0.029208	0.025067	-0.074094	-0.036319	-0.001228	0.008422	0.059426	-0.050069	0.047464	0.022979	0.056264	-0.016134	0.027653	0.008519	-0.076631	0.019184	-0.046165	2	im075
1428	-0.031489	-0.021622	-0.084964	0.012929	-0.020635	0.020563	-0.093702	-0.077242	0.363789	-0.040719	-0.111365	-0.009501	0.002781	-0.047153	-0.061137	-0.047434	-0.045629	-0.031159	-0.196107	0.014383	-1.046354	-0.061302	-0.015688	-0.139933	-0.075276	-0.023855	0.049805	0.020142	0.018077	0.044969	-0.046662	-0.028479	-0.005956	-0.002147	-0.286239	-0.103877	-0.082403	-0.030807	-0.009166	-0.029450	-0.039558	-0.045139	-0.049476	-0.028388	-0.364373	-0.038763	-0.004181	-0.181343	-0.021466	-0.117293	-0.006498	-0.168274	0.006674	8	omitted
502	0.352991	0.002643	-0.072307	0.006717	0.034363	-0.031389	0.041112	0.008856	-0.001766	-0.004686	0.041457	0.012696	0.012373	-0.012404	0.020623	1.669691	-0.025821	0.007340	0.009833	0.019995	-1.255134	-0.025441	0.005716	0.061934	0.010198	-0.839118	-0.419708	-0.010616	-0.001557	-0.016340	0.021572	0.008620	0.020470	0.023473	-0.197800	-0.037609	0.014506	-0.024180	-0.081081	0.015720	0.018417	0.018045	0.053760	-0.003124	-0.163397	-0.008594	0.014436	-0.024876	0.062666	-0.019565	0.001800	-0.030068	-0.001519	1	im106
831	-0.112112	0.027243	0.119985	-0.082399	0.026680	0.031627	-0.010097	-0.059420	-0.013806	0.011196	0.076926	0.006937	0.233008	-0.005911	0.006878	-0.024285	0.017898	0.009467	0.010462	0.024603	1.790705	-0.041758	0.014132	-0.024213	-0.004381	1.972306	-0.030419	0.028825	0.004494	-0.033373	0.063855	-0.022910	0.035961	0.016078	-0.786663	-0.027895	-0.023252	-0.041180	0.625850	0.005769	-0.042773	0.011269	0.014166	0.012780	0.162466	0.062172	0.115084	0.023663	0.019582	-0.012661	-0.043740	0.031887	-0.042232	6	im031
142	-0.213124	-0.011856	-0.066075	0.040839	0.020031	0.030649	-0.007731	-0.003524	0.034375	0.007953	-0.005359	0.023818	0.001401	-0.008858	0.015069	0.004903	0.004793	-0.007019	0.009715	-0.018836	-0.919745	0.017516	0.032806	0.038849	0.013616	-1.090188	-0.153180	-0.029665	-0.079135	0.017541	-0.037542	-0.024838	0.079564	-0.013713	-0.594666	0.114349	0.939516	0.043889	0.026208	0.020728	0.051941	-0.002208	0.127695	0.015951	1.721376	0.005044	-0.001263	-0.073136	-0.000326	0.096125	0.064020	-0.069338	0.202951	0	im000
872	-0.365919	0.128934	0.126414	-0.016161	-0.000917	-0.016184	0.180447	0.061008	-0.041748	0.003082	-0.080349	0.000304	-0.065836	-0.069230	-0.021769	2.336083	0.029246	-0.055579	0.035007	-0.013510	-0.394122	0.018768	-0.016827	-0.027958	-0.016966	-0.219451	-0.524029	0.079444	0.087610	-0.014453	0.032903	-0.005947	-0.041918	0.038773	0.551123	0.020741	0.022598	0.001982	0.009395	0.040839	0.004418	0.007271	0.021044	0.010126	0.098769	0.012831	0.030229	0.020644	0.052160	-0.005155	0.027164	-0.020613	0.022814	1	im106
3561	-0.265613	-0.023161	-0.047186	0.007225	-0.004948	-0.013870	-0.023019	-0.175991	0.316958	-0.002102	-0.143713	-0.030722	-0.016735	-0.026546	0.006684	0.017497	0.026821	-0.029001	-0.027553	-0.045317	-1.104999	0.170058	-0.000773	0.021314	-0.013961	-1.749390	0.059442	0.018309	0.035344	-0.032749	-0.068363	-0.023550	-0.028478	0.007259	0.101109	0.014096	-0.050271	0.006265	0.097412	-0.115290	-0.031813	-0.029665	-0.019010	-0.001997	-0.061160	0.027799	-0.057226	-0.021368	-0.002826	0.006638	0.023953	0.065672	-0.011448	7	im035
1894	0.044039	-0.008075	-0.048363	0.012422	0.041968	0.036341	-0.012176	0.021193	-0.019818	-0.000130	0.004117	-0.029039	-0.025925	0.014631	-0.030741	-0.043443	-0.015275	-0.018793	0.007822	-0.007793	0.034497	0.017451	-0.008299	0.022712	0.026609	-0.152381	-0.005640	0.001168	-0.029618	0.048389	0.031650	0.003746	-0.050091	-0.000465	0.097340	0.051821	0.305809	-0.002264	0.052070	0.028950	0.044793	0.024830	0.078653	-0.029171	0.138318	-0.028110	-0.034970	-0.027078	0.003079	0.122274	0.011370	0.035782	-0.008515	0	im000
4616	0.565318	-0.023920	-0.008095	0.089502	0.007752	0.021257	0.016870	0.033922	-0.050130	0.012862	-0.028429	0.018301	0.476721	0.014091	0.074191	-0.020188	-0.037041	0.126370	0.347974	0.014032	3.654675	-0.057374	-0.002822	1.853332	0.246524	4.660115	0.118221	0.019667	-0.016767	0.053885	0.154902	0.040081	-0.031475	-0.034351	-0.385169	0.000795	0.004548	-0.032679	0.266178	-0.108577	0.055834	0.028215	0.025779	0.017060	0.033552	0.256883	0.019094	0.011179	0.021042	0.020139	-0.037456	-0.012039	0.025785	4	im045
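
The pivot-then-merge step is easier to follow on a table small enough to read in full. The sketch below repeats the same operations on a made-up table with 2 cells and 3 stimulus presentations. All names prefixed with toy_ and all values are invented for illustration.

```python
import pandas as pd

# toy tidy table: 2 cells x 3 stimulus presentations (values are made up)
toy_average_responses = pd.DataFrame({
    'cell_specimen_id': [101, 101, 101, 102, 102, 102],
    'stimulus_presentations_id': [0, 1, 2, 0, 1, 2],
    'dff': [0.1, 0.4, -0.2, 0.0, 0.3, 0.5],
})

# toy stimulus table, indexed by stimulus_presentations_id like the real one
toy_stimulus_table = pd.DataFrame(
    {'image_name': ['im000', 'omitted', 'im045']},
    index=pd.Index([0, 1, 2], name='stimulus_presentations_id'),
)

# tidy -> wide: one row per presentation, one column per cell, then add labels
toy_features_and_labels = toy_average_responses.pivot(
    index='stimulus_presentations_id',
    columns='cell_specimen_id',
    values='dff',
).merge(toy_stimulus_table, on='stimulus_presentations_id', how='left')

toy_X = toy_features_and_labels[[101, 102]]   # feature matrix: trials x cells
toy_y = toy_features_and_labels['image_name']  # label vector: one entry per trial
```

Each row of toy_X is the population response on one presentation, and the matching entry of toy_y is the image shown on that presentation.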

The X matrix can be extracted by getting the columns associated with the cell_specimen_ids

In [ ]:

X = features_and_labels[cell_ids]
X.sample(10)

Out[ ]:

	1086550481	1086551114	1086551301	1086557083	1086557639	1086559064	1086558114	1086558224	1086558510	1086559206	1086557304	1086557208	1086560061	1086559681	1086559885	1086559968	1086557470	1086547796	1086547993	1086548118	1086554566	1086556653	1086558574	1086552296	1086558071	1086556532	1086555222	1086558701	1086557434	1086556317	1086555835	1086549726	1086553836	1086551540	1086551151	1086550544	1086552709	1086553271	1086553602	1086555553	1086548072	1086553899	1086547630	1086549303	1086549491	1086549813	1086549949	1086548658	1086548969	1086551457	1086551645	1086550990	1086551209
stimulus_presentations_id
4189	0.089903	0.024906	-0.046085	-0.034460	0.002261	-0.020595	0.071774	-0.002502	-0.020789	0.001684	0.008464	-0.019575	0.049000	-0.035157	-0.003666	-0.015179	-0.107282	-0.024429	0.006634	-0.001652	0.056245	1.300125	-0.009489	0.008088	0.015932	0.010329	0.056795	0.006799	-0.013974	-0.007446	0.030644	-0.004067	0.062614	5.642910	0.004174	0.001658	0.045159	0.015444	4.794847	0.064946	-0.039260	0.055422	1.271359	-0.001104	0.002660	0.062576	0.073505	0.001974	0.000101	0.143271	0.059929	-0.012726	-0.028438
250	0.050824	0.041704	-0.001928	0.026821	0.068230	0.004078	-0.005798	0.012790	0.056719	-0.004859	-0.030977	-0.032491	0.055306	-0.002700	0.011684	-0.014231	-0.013565	0.157667	-0.002660	0.012759	0.037215	1.163865	-0.042082	0.000762	-0.004521	0.000295	0.025304	0.025074	0.110253	-0.003577	-0.021062	-0.002031	-0.042702	0.210648	0.029303	0.024987	0.029177	-0.007469	-0.085974	0.003285	0.008513	0.019579	-0.028265	-0.001836	0.024202	0.041959	0.022740	0.008251	0.023698	0.029016	-0.022381	0.035931	-0.014534
4348	0.002091	0.021157	-0.052133	0.067661	0.021547	0.105207	0.063571	-0.012539	0.688057	0.227734	-0.016318	0.085618	0.022404	-0.013892	-0.001535	0.040991	0.048995	0.009405	-0.065042	0.011419	0.065410	8.538500	-0.046427	0.209997	-0.037722	0.039192	0.032267	0.047392	0.020943	0.140314	0.056047	-0.014708	0.073953	3.044143	0.071692	0.024293	0.017528	0.029569	6.945174	-0.046595	-0.050118	0.181107	0.006423	0.076210	-0.002631	0.101224	-0.053285	0.037069	0.032636	0.002911	-0.055538	-0.023025	0.022613
2791	0.040069	-0.051707	0.522409	0.005746	-0.036767	0.013612	0.007747	0.021827	0.220459	0.013104	-0.059429	0.006422	0.017736	-0.022348	0.024529	0.017214	0.129793	0.000732	0.026058	-0.002066	0.028572	-0.207911	0.011993	0.118774	-0.012681	0.005632	-0.027632	0.022893	0.030939	0.048363	-0.000151	-0.004408	0.044600	-0.108098	-0.047453	-0.020662	0.081641	-0.001765	-0.634543	-0.003440	0.134448	0.051929	-0.045280	-0.013557	-0.024167	-0.054341	-0.029107	-0.012474	0.043636	0.007564	0.074074	0.055662	0.115290
2275	-0.041459	0.011008	-0.055237	0.047922	0.033297	0.022879	0.000088	0.003876	-0.152558	-0.018586	-0.029864	-0.005584	0.080266	0.015952	0.052953	0.304258	0.000763	0.036425	0.027420	-0.002664	0.001505	0.206638	-0.005536	0.014174	-0.048500	0.003640	0.054400	0.084964	-0.079203	-0.002987	-0.021273	0.011225	0.081400	9.160343	0.002530	-0.004434	0.084411	-0.034244	3.382523	-0.076132	0.019194	0.007613	0.793798	0.016355	-0.021784	-0.118896	-0.036156	-0.029250	0.008419	-0.041697	0.029119	0.057440	-0.000388
3364	0.001590	0.029114	0.042443	0.026074	0.015629	0.073447	0.041900	-0.013303	0.676601	-0.043063	-0.067287	0.030950	0.020750	0.131443	-0.005691	0.051818	0.044800	-0.022882	0.052791	0.004855	0.006176	0.276550	-0.011573	0.117979	0.087262	-0.063567	-0.035908	0.063256	0.187598	-0.026538	-0.023886	0.026383	-0.093770	3.343860	0.135942	-0.005243	0.105701	0.042795	3.330865	0.122585	0.279359	-0.038560	0.094401	0.058020	0.921130	0.017249	0.074355	-0.028635	0.155254	0.064325	0.096806	0.078105	-0.023347
757	0.112625	0.010699	0.073294	0.049768	-0.023770	0.041888	-0.012735	0.001733	-0.090540	0.015469	-0.038199	0.041567	-0.045520	0.043685	0.033780	0.012523	0.109042	0.044897	0.005211	-0.001340	0.027616	-0.468245	0.015574	-0.019447	0.003821	0.023700	0.070535	-0.001122	-0.166997	0.303648	-0.061507	-0.000077	0.286029	0.148413	0.046874	-0.018950	0.042764	0.091724	-0.228675	-0.008245	0.064719	0.005560	0.844696	0.060548	0.061952	0.296216	0.006011	0.028035	-0.032624	-0.001236	0.109000	0.060241	0.108687
1231	0.123487	0.012015	0.074686	-0.057958	-0.062489	-0.005719	-0.018863	0.001873	-0.098702	-0.080393	0.000174	-0.011449	0.003779	0.140177	-0.085526	-0.002940	-0.034404	0.018371	0.066341	0.029739	-0.001750	-1.079020	0.018562	-0.029877	0.012423	-0.036075	0.014829	-0.081603	0.049143	-0.048742	0.000211	0.011385	0.118326	18.285403	0.161440	0.022849	0.023001	-0.016845	3.984924	0.105866	-0.100968	0.022498	-0.065443	0.112538	-0.007311	-0.010751	-0.044869	-0.029348	0.209742	-0.017102	0.059553	0.094675	0.002495
2861	0.006513	0.032145	0.041042	-0.004228	-0.028449	-0.035138	-0.062131	0.024852	-0.018202	0.017064	0.060212	0.023772	0.017543	-0.024719	-0.037359	-0.009053	-0.038685	-0.019768	-0.009433	0.008406	-0.001899	-0.498624	-0.002557	0.045878	-0.025913	-0.011032	0.016700	0.026807	0.002828	-0.028792	0.005283	-0.015849	10.992483	0.171739	-0.003859	-0.023945	-0.010824	-0.009832	-0.313828	-0.006598	0.032851	0.043762	-0.370227	-0.058354	0.001721	0.001313	0.038282	0.030872	0.020568	0.033622	-0.072347	0.012223	-0.043466
819	-0.074397	4.207817	-0.056807	-0.002954	0.009688	-0.023455	0.033242	0.042787	0.125576	-0.009704	0.022180	0.032723	-0.005124	-0.047685	-0.042669	-0.060779	0.012423	0.011754	0.133128	0.024596	0.114497	4.696098	-0.022374	0.007295	0.039370	-0.018408	-0.010090	0.072236	1.006231	-0.007267	0.018647	-0.036074	0.042228	0.000105	0.058423	-0.030346	0.020053	0.026313	0.569307	-0.038953	0.005391	0.014134	0.141457	0.355020	0.025469	0.069253	0.020161	0.001198	0.328697	0.024548	-0.010794	0.002652	-0.013257

And y is just the image_name column (it could also be the image_index column if you want a numeric value instead of a string to represent the image identity)

In [ ]:

y = features_and_labels['image_name']
y.sample(10)

Out[ ]:

stimulus_presentations_id
3537    im073
2109    im000
34      im031
489     im106
4206    im106
3903    im054
1618    im035
1385    im106
4786    im106
2795    im054
Name: image_name, dtype: object

Dimensionality reduction¶

Now we can use t-SNE, which will project our 53-dimensional feature space (53 neurons in the session) into two dimensions.

In [ ]:

X_embedded = TSNE(n_components=2).fit_transform(X.values)

And visualize the results, with colors representing each unique stimulus.

In [ ]:

features_and_labels['tsne-2d-one'] = X_embedded[:, 0]
features_and_labels['tsne-2d-two'] = X_embedded[:, 1]
plt.figure(figsize=(16, 10))
ax = sns.scatterplot(
    data=features_and_labels,
    x="tsne-2d-one",
    y="tsne-2d-two",
    hue="image_name",
    hue_order=np.sort(features_and_labels['image_name'].unique()),
    palette=sns.color_palette()[:9],
    legend="full",
    alpha=0.3
)

This demonstrates that the time-averaged population responses to at least some of the stimuli seem to fall into distinct clusters in our 53-dimensional space, while others appear more overlapped. This implies that a decoding analysis might be more successful at decoding some stimuli than others.
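
Note that t-SNE is stochastic, so the embedding can look different each time the cell above is run. A minimal sketch of one way to make it repeatable is to pass random_state to TSNE. The data below are synthetic, and the perplexity value is an arbitrary choice for this small example rather than a recommendation for the real data.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# synthetic stand-in for X: 60 "trials" x 5 "cells", drawn from two separated groups
toy_X = np.vstack([
    rng.normal(0, 1, (30, 5)),
    rng.normal(6, 1, (30, 5)),
])

# perplexity must be smaller than the number of samples
toy_embedding = TSNE(
    n_components=2, perplexity=10, random_state=0
).fit_transform(toy_X)
```

The distances between clusters in a t-SNE plot are not directly interpretable, so the embedding is best treated as a qualitative view of the data.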

Train a simple decoder¶

We can use an SVM decoder from scikit-learn to ask how well we can decode image identity from the feature matrix we have constructed.

Split our data into train and test sets, instantiate the model, then fit.

In [ ]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
model = svm.SVC(probability=True)
model.fit(X_train, y_train)

Out[ ]:

SVC(probability=True)


Use the model to make predictions on the held-out test set

In [ ]:

y_pred = model.predict(X_test)

Evaluate the accuracy

In [ ]:

accuracy_score(y_test, y_pred)

Out[ ]:

0.6216897856242118
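
An accuracy value is easier to interpret next to a chance-level baseline. With nine stimulus classes, uniform guessing would be right roughly one time in nine, but the classes are not equally frequent, so a data-driven baseline is more informative. The sketch below shows one way to get one, using scikit-learn's DummyClassifier on synthetic data. The toy_ variables are invented, and the stratify argument is an optional addition that is not used in the split above.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# synthetic stand-in for X and y: 300 trials, 4 features, 3 unbalanced classes
toy_X = rng.normal(size=(300, 4))
toy_y = np.repeat(['im000', 'im045', 'omitted'], [140, 140, 20])

# stratify keeps the class proportions the same in the train and test sets
X_tr, X_te, y_tr, y_te = train_test_split(
    toy_X, toy_y, test_size=0.33, random_state=42, stratify=toy_y
)

# baseline that ignores the features and always predicts the most common class
baseline = DummyClassifier(strategy='most_frequent').fit(X_tr, y_tr)
baseline_accuracy = accuracy_score(y_te, baseline.predict(X_te))
```

Fitting the same baseline on X_train and y_train would give a reference value to compare against the SVM accuracy.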

Evaluate the confusion matrix

In [ ]:

pd.DataFrame(
    confusion_matrix(y_test, y_pred),
    columns=['predicted_{}'.format(im) for im in model.classes_],
    index=['actual_{}'.format(im) for im in model.classes_]
)

Out[ ]:

	predicted_im000	predicted_im031	predicted_im035	predicted_im045	predicted_im054	predicted_im073	predicted_im075	predicted_im106	predicted_omitted
actual_im000	92	4	55	1	19	26	0	0	0
actual_im031	17	86	36	0	7	34	0	0	2
actual_im035	5	19	124	0	5	23	0	2	0
actual_im045	0	18	7	161	0	14	0	0	1
actual_im054	35	7	26	0	94	24	0	0	0
actual_im073	9	14	48	0	4	100	0	1	0
actual_im075	2	9	21	0	2	35	124	1	0
actual_im106	0	5	8	0	1	10	1	194	0
actual_omitted	9	11	8	1	6	6	0	1	11

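This tells us that the model can decode some stimuli well (im045 and im106, for example, with roughly 80% and 89% of their test trials classified correctly), while it struggles more with others (im000 at about 47% and omissions at about 21%, for example). Note also that im035 is the most common incorrect prediction for several other images. Do the stimuli that the decoder succeeds in classifying align with those that cluster cleanly in t-SNE space?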
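To make the comparison between stimuli more concrete, we can turn the confusion matrix into per-class recall (fraction of actual trials of a class that were classified correctly) and precision (fraction of predictions of a class that were correct). The sketch below re-enters the counts from the table above by hand so that it runs on its own; in the notebook you could use `confusion_matrix(y_test, y_pred)` and `model.classes_` directly.

```python
import numpy as np
import pandas as pd

labels = ["im000", "im031", "im035", "im045", "im054",
          "im073", "im075", "im106", "omitted"]

# Rows are actual labels, columns are predicted labels (copied from the table above)
cm = np.array([
    [92,  4,  55,   1, 19,  26,   0,   0,  0],
    [17, 86,  36,   0,  7,  34,   0,   0,  2],
    [ 5, 19, 124,   0,  5,  23,   0,   2,  0],
    [ 0, 18,   7, 161,  0,  14,   0,   0,  1],
    [35,  7,  26,   0, 94,  24,   0,   0,  0],
    [ 9, 14,  48,   0,  4, 100,   0,   1,  0],
    [ 2,  9,  21,   0,  2,  35, 124,   1,  0],
    [ 0,  5,   8,   0,  1,  10,   1, 194,  0],
    [ 9, 11,   8,   1,  6,   6,   0,   1, 11],
])

per_class = pd.DataFrame(
    {
        "recall": np.diag(cm) / cm.sum(axis=1),     # correct / actual count
        "precision": np.diag(cm) / cm.sum(axis=0),  # correct / predicted count
    },
    index=labels,
)
print(per_class.round(2))

# The diagonal sum over the total reproduces the accuracy score computed earlier
overall_accuracy = np.trace(cm) / cm.sum()
print(overall_accuracy)
```

Looking at recall and precision together helps separate classes that are truly well decoded from classes that the model simply predicts often.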

Follow up exercise¶

Can you create event triggered averages and perform decoding using other events of interest, such as licks or rewards?

In [ ]:

# Lick and reward data are available for each experiment
licks = experiments[ophys_experiment_id].licks
licks.head()

Out[ ]:

	timestamps	frame
0	68.90307	3499
1	77.14313	3993
2	84.09879	4410
3	85.31647	4483
4	94.64071	5042

In [ ]:

rewards = experiments[ophys_experiment_id].rewards
rewards.head()

Out[ ]:

	volume	timestamps	auto_rewarded
0	0.005	318.95740	True
1	0.005	328.69873	True
2	0.005	337.73943	True
3	0.005	354.25289	True
4	0.005	364.74479	True
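As a starting point for the exercise, here is a minimal, generic sketch of an event-triggered average written with NumPy only. It is not the helper used earlier in this notebook: the function `event_triggered_average`, the synthetic trace, and the event times are all hypothetical, and the window is defined in samples rather than seconds to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: a noisy trace sampled at 10 Hz and four event times in seconds
sample_rate = 10.0
trace_timestamps = np.arange(0, 100, 1 / sample_rate)
trace = rng.normal(scale=0.1, size=trace_timestamps.size)
event_times = np.array([20.0, 40.0, 60.0, 80.0])

# Add a brief response (5 samples) starting at each event
for t in event_times:
    start = np.searchsorted(trace_timestamps, t)
    trace[start:start + 5] += 1.0


def event_triggered_average(timestamps, values, events, n_before, n_after):
    """Average `values` over a window of samples around each event time."""
    idx = np.searchsorted(timestamps, events)
    # Drop events whose window would run off either end of the trace
    keep = (idx - n_before >= 0) & (idx + n_after <= len(values))
    windows = np.stack([values[i - n_before:i + n_after] for i in idx[keep]])
    return windows.mean(axis=0)


eta = event_triggered_average(
    trace_timestamps, trace, event_times, n_before=10, n_after=20
)
print(eta.shape)  # (30,): 10 samples before plus 20 samples after each event
```

For licks or rewards, `event_times` could be taken from the `timestamps` column of the tables shown above. Because the rewards listed here are all marked `auto_rewarded`, it may be worth deciding whether to include or exclude those trials.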

To see the full list of attributes available for each experiment via the AllenSDK, uncomment the line in the cell below and run it.

In [ ]:

# help(experiments[ophys_experiment_id])
