This notebook shows how to access and interact with 2-photon calcium imaging data collected as part of the Allen Institute's Visual Behavior 2P project.
You can learn more about this dataset and the behavioral task, and find other useful tools, on the Overview page and at the Allen Brain Atlas.
Specifically, this notebook will show how to load neural data for all imaging planes in one 2-photon imaging session into a single 'tidy' dataframe, make simple event-triggered plots, and do some basic analysis using scikit-learn.
This is designed to demonstrate a simple method for interacting with the Visual Behavior 2P data. Many aspects of the dataset are not explored here.
We have built a package called brain_observatory_utilities, which contains some useful convenience functions. The AllenSDK is a dependency of this package and will be installed automatically when you install brain_observatory_utilities per the instructions below.
We will first install brain_observatory_utilities into our Colab environment by running the commands below. When this cell is complete, click on the RESTART RUNTIME button that appears at the end of the output. Note that running this cell will produce a long list of outputs and some error messages. Clicking RESTART RUNTIME at the end will resolve these issues.
You can minimize the cell after you are done to hide the output.
# @title Install packages
!pip install pip --upgrade --quiet
!pip install brain_observatory_utilities --upgrade --quiet
!pip install pandas --quiet
!pip install seaborn --quiet
import os
import numpy as np
import pandas as pd
from tqdm import tqdm
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.manifold import TSNE
import brain_observatory_utilities.datasets.optical_physiology.data_formatting as ophys_formatting
import brain_observatory_utilities.utilities.general_utilities as utilities
from allensdk.brain_observatory.behavior.behavior_project_cache import VisualBehaviorOphysProjectCache
pd.set_option('display.max_columns', 500)
# this line may be needed if you run into Error in pandas query function
# Otherwise set the engine to python in queries made throughout the book
# pd.DataFrame.query = lambda self, expr, **kwargs: self.query(expr, engine='python', **kwargs)
The AllenSDK provides functionality for downloading tables that describe all sessions and experiments (individual imaging planes) in the Visual Behavior 2P dataset. We first download the data cache:
data_storage_directory = "./temp" # Note: this path must exist on your local drive
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)
ophys_session_table.csv: 100%|██████████| 247k/247k [00:00<00:00, 2.32MMB/s] behavior_session_table.csv: 100%|██████████| 1.59M/1.59M [00:00<00:00, 3.54MMB/s] ophys_experiment_table.csv: 100%|██████████| 657k/657k [00:00<00:00, 3.22MMB/s] ophys_cells_table.csv: 100%|██████████| 4.28M/4.28M [00:00<00:00, 6.32MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/brain_observatory/behavior/behavior_project_cache/behavior_project_cache.py:135: UpdatedStimulusPresentationTableWarning: As of AllenSDK version 2.16.0, the latest Visual Behavior Ophys data has been significantly updated from previous releases. Specifically the user will need to update all processing of the stimulus_presentations tables. These tables now include multiple stimulus types delineated by the columns `stimulus_block` and `stimulus_block_name`. The data that was available in previous releases are stored in the block name containing 'change_detection' and can be accessed in the pandas table by using: `stimulus_presentations[stimulus_presentations.stimulus_block_name.str.contains('change_detection')]` warnings.warn(
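The comment in the cell above notes that the cache directory must already exist. A minimal sketch for creating it beforehand, using only the standard library (the `./temp` path is the one used above; everything else is illustrative):

```python
from pathlib import Path

# Same relative path as in the cell above; change it to suit your setup.
data_storage_directory = Path("./temp")

# Create the directory (and any missing parents); do nothing if it already exists.
data_storage_directory.mkdir(parents=True, exist_ok=True)

print(data_storage_directory.resolve())
```

Passing `str(data_storage_directory)` as `cache_dir` then keeps the rest of the notebook unchanged.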
Ophys_session_table contains metadata describing imaging sessions. If more than one plane was imaged during a session, one ophys session id will be associated with multiple ophys experiment ids. Each ophys session id will also have a unique behavior session id.
Behavior_session_table contains metadata describing behavioral sessions, which may or may not have taken place during imaging. Behavior session ids that do not have ophys session ids were training sessions.
Ophys_experiment_table contains metadata describing imaging experiments (aka imaging planes). When the mesoscope is used, one ophys session may contain up to 8 unique experiments (two visual areas by four imaging depths). Some imaging planes may not be released due to quality control issues, so each ophys session id is associated with anywhere from one to eight unique experiment ids. Ophys experiment ids are unique and do not repeat across sessions. To find the same imaging plane that was matched across multiple sessions, use the ophys_container_id column that can be found in both ophys_session_table and ophys_experiment_table.
Then we can access the session and experiment tables directly.
Note that a 'session' is a single behavioral session. Sessions that are performed on the mesoscope will have multiple (up to 8) 'experiments' associated with them, where an experiment is a distinct imaging plane.
session_table = cache.get_ophys_session_table()
experiment_table = cache.get_ophys_experiment_table()
We can then view the contents of the session table. Note that this contains a lot of useful metadata about each session. One of the columns, ophys_experiment_id, provides a list of the experiments (aka imaging planes) that are associated with each session.
session_table.head()
ophys_session_id | behavior_session_id | ophys_container_id | mouse_id | indicator | full_genotype | driver_line | cre_line | reporter_line | sex | age_in_days | imaging_plane_group_count | project_code | session_type | session_number | image_set | behavior_type | experience_level | prior_exposures_to_session_type | prior_exposures_to_image_set | prior_exposures_to_omissions | date_of_acquisition | equipment_name | num_depths_per_area | ophys_experiment_id | num_targeted_structures |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
951410079 | 951520319 | [1018028339, 1018028342, 1018028345, 101802835... | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 4 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | images_A | active_behavior | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 4 | [951980471, 951980473, 951980475, 951980479, 9... | 2 |
952430817 | 952554548 | [1018028339, 1018028345, 1018028354, 1018028357] | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 209 | 3 | VisualBehaviorMultiscope | OPHYS_2_images_A_passive | 2 | images_A | passive_viewing | Familiar | 0 | 66 | 1 | 2019-09-23 08:45:38.490000+00:00 | MESO.1 | 4 | [953659743, 953659745, 953659749, 953659752] | 2 |
954954402 | 953982960 | [1018028339, 1018028342, 1018028345, 101802835... | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 210 | 4 | VisualBehaviorMultiscope | OPHYS_3_images_A | 3 | images_A | active_behavior | Familiar | 0 | 67 | 2 | 2019-09-24 09:01:31.582000+00:00 | MESO.1 | 4 | [958527464, 958527471, 958527474, 958527479, 9... | 2 |
955775716 | 956010809 | [1018028339, 1018028342, 1018028345] | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 212 | 2 | VisualBehaviorMultiscope | OPHYS_3_images_A | 3 | images_A | active_behavior | Familiar | 1 | 68 | 3 | 2019-09-26 09:22:21.772000+00:00 | MESO.1 | 4 | [956941841, 956941844, 956941846] | 2 |
957020350 | 957032492 | [1018028339, 1018028342, 1018028345, 101802835... | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 213 | 4 | VisualBehaviorMultiscope | OPHYS_4_images_B | 4 | images_B | active_behavior | Novel 1 | 0 | 0 | 4 | 2019-09-27 08:58:37.005000+00:00 | MESO.1 | 4 | [957759562, 957759564, 957759566, 957759570, 9... | 2 |
The experiment table has one row per experiment. Note that the ophys_session_id column links each experiment to its associated session in the session_table.
experiment_table.head()
ophys_experiment_id | behavior_session_id | ophys_session_id | ophys_container_id | mouse_id | indicator | full_genotype | driver_line | cre_line | reporter_line | sex | age_in_days | imaging_depth | targeted_structure | targeted_imaging_depth | imaging_plane_group | project_code | session_type | session_number | image_set | behavior_type | passive | experience_level | prior_exposures_to_session_type | prior_exposures_to_image_set | prior_exposures_to_omissions | date_of_acquisition | equipment_name | published_at | isi_experiment_id | file_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
951980471 | 951520319 | 951410079 | 1018028342 | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 150 | VISp | 150 | 0 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | A | active_behavior | False | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 2021-03-25 | 848974280 | 0 |
951980473 | 951520319 | 951410079 | 1018028345 | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 225 | VISp | 225 | 0 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | A | active_behavior | False | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 2021-03-25 | 848974280 | 1 |
951980475 | 951520319 | 951410079 | 1018028339 | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 75 | VISp | 75 | 1 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | A | active_behavior | False | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 2021-03-25 | 848974280 | 2 |
951980479 | 951520319 | 951410079 | 1018028354 | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 150 | VISl | 150 | 2 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | A | active_behavior | False | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 2021-03-25 | 848974280 | 3 |
951980481 | 951520319 | 951410079 | 1018028357 | 457841 | GCaMP6f | Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt | [Sst-IRES-Cre] | Sst-IRES-Cre | Ai148(TIT2L-GC6f-ICL-tTA2) | F | 206 | 225 | VISl | 225 | 2 | VisualBehaviorMultiscope | OPHYS_1_images_A | 1 | A | active_behavior | False | Familiar | 0 | 65 | 0 | 2019-09-20 09:59:38.837000+00:00 | MESO.1 | 2021-03-25 | 848974280 | 4 |
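The relationship between sessions, experiments, and containers described above can be illustrated on a toy table. This is only a sketch: the ids below are made up, and the toy table keeps just a few of the real columns.

```python
import pandas as pd

# Toy experiment table; all ids below are made up for illustration.
toy_experiment_table = pd.DataFrame(
    {
        "ophys_experiment_id": [11, 12, 21, 22, 31],
        "ophys_session_id": [1, 1, 2, 2, 3],
        "ophys_container_id": [100, 200, 100, 200, 100],
        "targeted_structure": ["VISp", "VISl", "VISp", "VISl", "VISp"],
    }
).set_index("ophys_experiment_id")

# Experiments (imaging planes) that belong to one session.
planes_in_session = toy_experiment_table[toy_experiment_table["ophys_session_id"] == 1]
print(planes_in_session.index.tolist())

# The same imaging plane followed across sessions shares one container id.
matched_across_sessions = (
    toy_experiment_table.reset_index()
    .groupby("ophys_container_id")["ophys_experiment_id"]
    .apply(list)
)
print(matched_across_sessions)
```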
We are going to select one session from this table, session 854060305. This session is from an Sst-IRES-Cre mouse, which expressed GCaMP6f in Sst+ inhibitory interneurons. There were 6 simultaneously acquired imaging planes for this session. We can view metadata for this session as follows:
ophys_session_id = 854060305
session_table.loc[ophys_session_id]
behavior_session_id 854283407
ophys_container_id [1018028135, 1018028138, 1018028141, 101802814...
mouse_id 440631
indicator GCaMP6f
full_genotype Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt
driver_line [Sst-IRES-Cre]
cre_line Sst-IRES-Cre
reporter_line Ai148(TIT2L-GC6f-ICL-tTA2)
sex M
age_in_days 129
imaging_plane_group_count 3
project_code VisualBehaviorMultiscope
session_type OPHYS_6_images_B
session_number 6
image_set images_B
behavior_type active_behavior
experience_level Novel >1
prior_exposures_to_session_type 0
prior_exposures_to_image_set 2
prior_exposures_to_omissions 6
date_of_acquisition 2019-04-19 09:21:45.638000+00:00
equipment_name MESO.1
num_depths_per_area 4
ophys_experiment_id [854759890, 854759894, 854759896, 854759898, 8...
num_targeted_structures 2
Name: 854060305, dtype: object
Each session consists of one or more 'experiments', where each experiment is a single imaging plane.
Each mesoscope session has up to 8 experiments associated with it. We will load all experiments for this session into a dictionary with the experiment IDs as the keys.
The first time that this cell is run, the associated NWB files will be downloaded to your local data_storage_directory. Subsequent runs of this cell will be faster since the data will already be cached locally.
experiments = {}
ophys_experiment_ids = session_table.loc[ophys_session_id]['ophys_experiment_id']
for ophys_experiment_id in ophys_experiment_ids:
    experiments[ophys_experiment_id] = cache.get_behavior_ophys_experiment(ophys_experiment_id)
behavior_ophys_experiment_854759890.nwb: 100%|██████████| 232M/232M [00:08<00:00, 25.8MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs) behavior_ophys_experiment_854759894.nwb: 100%|██████████| 252M/252M [00:09<00:00, 25.7MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs) behavior_ophys_experiment_854759896.nwb: 100%|██████████| 232M/232M [00:07<00:00, 30.1MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs) behavior_ophys_experiment_854759898.nwb: 100%|██████████| 246M/246M [00:08<00:00, 28.7MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs) behavior_ophys_experiment_854759900.nwb: 100%|██████████| 243M/243M [00:08<00:00, 28.1MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs) behavior_ophys_experiment_854759903.nwb: 100%|██████████| 250M/250M [00:08<00:00, 30.0MMB/s] /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/hdmf/utils.py:668: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.7.0 is already loaded. return func(args[0], **pargs)
We can view the cell_specimen_table for one experiment, which contains information about each identified cell in that experiment.
experiment = experiments[ophys_experiment_ids[1]]
experiment.cell_specimen_table.head()
cell_specimen_id | cell_roi_id | height | mask_image_plane | max_correction_down | max_correction_left | max_correction_right | max_correction_up | valid_roi | width | x | y | roi_mask |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1086557083 | 1080855636 | 17 | 0 | 5.0 | 7.0 | 4.0 | 7.0 | True | 19 | 173 | 384 | [[False, False, False, False, False, False, Fa... |
1086557639 | 1080855643 | 16 | 0 | 5.0 | 7.0 | 4.0 | 7.0 | True | 14 | 363 | 447 | [[False, False, False, False, False, False, Fa... |
1086559064 | 1080855660 | 19 | 1 | 5.0 | 7.0 | 4.0 | 7.0 | True | 17 | 24 | 221 | [[False, False, False, False, False, False, Fa... |
1086558114 | 1080855673 | 18 | 0 | 5.0 | 7.0 | 4.0 | 7.0 | True | 13 | 74 | 305 | [[False, False, False, False, False, False, Fa... |
1086558224 | 1080855678 | 19 | 0 | 5.0 | 7.0 | 4.0 | 7.0 | True | 18 | 478 | 284 | [[False, False, False, False, False, False, Fa... |
We can then visualize the max projection and one of the identified ROIs
fig, ax = plt.subplots(1, 2, figsize=(15, 8), sharex=True, sharey=True)
ax[0].imshow(experiment.max_projection, cmap='gray')
ax[0].set_title('max projection')
cell_specimen_id = experiment.cell_specimen_table.index[2]
ax[1].imshow(experiment.cell_specimen_table.loc[cell_specimen_id]['roi_mask'])
ax[1].set_title('ROI mask for cell_specimen_id = {}'.format(cell_specimen_id))
fig.show()
The cell below will load the neural data into memory in the pandas 'tidy' format by iterating over each of the 6 experiments and using some helpful tools from the brain_observatory_utilities package, imported above as ophys_formatting.
It will also include a subset of metadata from ophys_experiment_table to facilitate splitting by depth, structure (aka cortical area), cre line (aka cell class), etc.
Note that 'tidy' data means that each row represents only one observation. Observations are stacked vertically. Thus, the timestamps column will repeat for every cell in the dataset.
neural_data = []
for ophys_experiment_id in tqdm(experiments.keys()):  # tqdm shows a progress bar for the items being iterated over
    this_experiment = experiments[ophys_experiment_id]
    this_experiment_neural_data = ophys_formatting.build_tidy_cell_df(this_experiment)
    # add some columns with metadata for the experiment
    metadata_keys = [
        'ophys_experiment_id',
        'ophys_session_id',
        'targeted_structure',
        'imaging_depth',
        'equipment_name',
        'cre_line',
        'mouse_id',
        'sex',
    ]
    for metadata_key in metadata_keys:
        this_experiment_neural_data[metadata_key] = this_experiment.metadata[metadata_key]
    # append the data for this experiment to a list
    neural_data.append(this_experiment_neural_data)
# concatenate the list of dataframes into a single dataframe
neural_data = pd.concat(neural_data)
100%|██████████| 6/6 [00:00<00:00, 6.68it/s]
We can then look at some attributes of the neural_data dataframe we have created.
It is ~2.5 million rows long:
len(neural_data)
2561543
It is so long because it has one row for each timestamp for each cell.
Below are the first 5 entries. Again, note that the tidy format means that each row has only one observation, which represents a single GCaMP6 fluorescence value for a single neuron.
neural_data.head()
timestamps | dff | events | filtered_events | cell_roi_id | cell_specimen_id | ophys_experiment_id | ophys_session_id | targeted_structure | imaging_depth | equipment_name | cre_line | mouse_id | sex | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 10.52216 | 0.400583 | 0.0 | 0.0 | 1080852071 | 1086550481 | 854759890 | 854060305 | VISp | 275 | MESO.1 | Sst-IRES-Cre | 440631 | M |
1 | 10.61538 | 0.126125 | 0.0 | 0.0 | 1080852071 | 1086550481 | 854759890 | 854060305 | VISp | 275 | MESO.1 | Sst-IRES-Cre | 440631 | M |
2 | 10.70860 | -0.083087 | 0.0 | 0.0 | 1080852071 | 1086550481 | 854759890 | 854060305 | VISp | 275 | MESO.1 | Sst-IRES-Cre | 440631 | M |
3 | 10.80182 | 0.158960 | 0.0 | 0.0 | 1080852071 | 1086550481 | 854759890 | 854060305 | VISp | 275 | MESO.1 | Sst-IRES-Cre | 440631 | M |
4 | 10.89504 | 0.301507 | 0.0 | 0.0 | 1080852071 | 1086550481 | 854759890 | 854060305 | VISp | 275 | MESO.1 | Sst-IRES-Cre | 440631 | M |
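To make the tidy layout concrete, here is a small sketch that converts a hypothetical 'wide' table (one column per cell) into the one-observation-per-row form used above. The cell names and values are made up; only the column names timestamps, cell_specimen_id, and dff mirror the real dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical wide data: 3 timestamps x 2 cells (values are made up).
timestamps = np.array([0.0, 0.1, 0.2])
wide = pd.DataFrame(
    {"cell_a": [0.1, 0.2, 0.3], "cell_b": [1.0, 1.1, 1.2]},
    index=pd.Index(timestamps, name="timestamps"),
)

# melt stacks the per-cell columns vertically: one row per (timestamp, cell) pair,
# so the timestamps repeat once for every cell.
tidy = wide.reset_index().melt(
    id_vars="timestamps", var_name="cell_specimen_id", value_name="dff"
)
print(tidy)
```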
cell_roi_id column contains unique roi ids for all cells in a given experiment, which do not repeat across ophys sessions.
cell_specimen_id column contains unique ids for cells that were matched across ophys sessions. Thus, a cell that was imaged in more than one session has multiple roi ids but one cell specimen id.
We can get the unique cell IDs in our dataset as follows:
cell_ids = neural_data['cell_specimen_id'].unique()
print('there are {} unique cells'.format(len(cell_ids)))
print('cell ids are: {}'.format(cell_ids))
there are 53 unique cells cell ids are: [1086550481 1086551114 1086551301 1086557083 1086557639 1086559064 1086558114 1086558224 1086558510 1086559206 1086557304 1086557208 1086560061 1086559681 1086559885 1086559968 1086557470 1086547796 1086547993 1086548118 1086554566 1086556653 1086558574 1086552296 1086558071 1086556532 1086555222 1086558701 1086557434 1086556317 1086555835 1086549726 1086553836 1086551540 1086551151 1086550544 1086552709 1086553271 1086553602 1086555553 1086548072 1086553899 1086547630 1086549303 1086549491 1086549813 1086549949 1086548658 1086548969 1086551457 1086551645 1086550990 1086551209]
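The row count reported earlier follows from the number of cells: 2561543 rows divided by 53 cells is exactly 48331, which is consistent with every cell contributing the same number of timestamps. The sketch below shows the check on a made-up toy dataframe; on the real data the equivalent call would be `neural_data.groupby('cell_specimen_id').size()`.

```python
import pandas as pd

# Toy tidy dataframe: 2 cells x 3 timestamps (values are made up).
toy_neural_data = pd.DataFrame(
    {
        "cell_specimen_id": [1, 1, 1, 2, 2, 2],
        "timestamps": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
        "dff": [0.1, 0.2, 0.3, 1.0, 1.1, 1.2],
    }
)

# Number of rows (timestamps) contributed by each cell.
rows_per_cell = toy_neural_data.groupby("cell_specimen_id").size()
n_cells = toy_neural_data["cell_specimen_id"].nunique()
print(rows_per_cell)

# The same arithmetic with the numbers reported above: total rows and unique cells.
timestamps_per_cell, remainder = divmod(2561543, 53)
print(timestamps_per_cell, remainder)
```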
If we wanted to get the timeseries for one cell, we could query the neural_data dataframe. For example, to get the full timeseries for the cell with cell_specimen_id = 1086557208:
single_cell_timeseries = neural_data.query('cell_specimen_id == 1086557208')
single_cell_timeseries.head()
timestamps | dff | events | filtered_events | cell_roi_id | cell_specimen_id | ophys_experiment_id | ophys_session_id | targeted_structure | imaging_depth | equipment_name | cre_line | mouse_id | sex | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 10.52216 | 0.218961 | 0.0 | 0.0 | 1080855724 | 1086557208 | 854759894 | 854060305 | VISp | 179 | MESO.1 | Sst-IRES-Cre | 440631 | M |
1 | 10.61538 | 0.232865 | 0.0 | 0.0 | 1080855724 | 1086557208 | 854759894 | 854060305 | VISp | 179 | MESO.1 | Sst-IRES-Cre | 440631 | M |
2 | 10.70860 | -0.050186 | 0.0 | 0.0 | 1080855724 | 1086557208 | 854759894 | 854060305 | VISp | 179 | MESO.1 | Sst-IRES-Cre | 440631 | M |
3 | 10.80182 | 0.239468 | 0.0 | 0.0 | 1080855724 | 1086557208 | 854759894 | 854060305 | VISp | 179 | MESO.1 | Sst-IRES-Cre | 440631 | M |
4 | 10.89504 | 0.226356 | 0.0 | 0.0 | 1080855724 | 1086557208 | 854759894 | 854060305 | VISp | 179 | MESO.1 | Sst-IRES-Cre | 440631 | M |
Each cell has three types of traces:
dff column is the calcium fluorescence signal, normalized to baseline fluorescence.
events column is deconvolved events from the dff trace, which approximate the neural firing rate and remove the slow decay of the calcium signal (for more details, see the EVENT DETECTION section of the Visual Behavior whitepaper).
filtered_events column is events smoothed with a half-gaussian kernel.
We can then plot DeltaF/F for this cell for the full experiment as follows:
fig, ax = plt.subplots(figsize=(15,5))
single_cell_timeseries.plot(
    x='timestamps',
    y='dff',
    ax=ax
)
fig.show()
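The filtered_events trace is described above as events smoothed with a half-gaussian kernel. As a rough illustration of what that means, the sketch below applies a causal (one-sided) Gaussian to a synthetic event train. The kernel width `sigma` and its units (samples) are assumptions made for this example, not the parameters used to produce the dataset.

```python
import numpy as np


def half_gaussian_kernel(sigma, n_sigmas=4):
    """One-sided Gaussian kernel (t >= 0 only), normalized to sum to 1."""
    t = np.arange(0, int(np.ceil(n_sigmas * sigma)) + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    return kernel / kernel.sum()


def causal_smooth(events, sigma=2.0):
    """Smooth an event train so each output sample depends only on past events."""
    kernel = half_gaussian_kernel(sigma)
    # Full convolution, truncated so output[i] uses events[:i + 1] only.
    return np.convolve(events, kernel)[: len(events)]


# Synthetic event train with a single event at sample 10.
events = np.zeros(30)
events[10] = 1.0
smoothed = causal_smooth(events, sigma=2.0)
print(np.round(smoothed[8:16], 3))
```

Because the kernel is one-sided, the smoothed trace is zero before the event and decays after it, so activity is never smeared backwards in time.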
The stimulus table is shared across all experiments (imaging planes) in a session. We can therefore use the stimulus table for just one experiment.
We can view the first 10 rows of the stimulus table.
stimulus_table = experiments[ophys_experiment_ids[0]].stimulus_presentations
stimulus_table.head(10)
stimulus_presentations_id | stimulus_block | stimulus_block_name | image_index | image_name | movie_frame_index | duration | start_time | end_time | start_frame | end_frame | is_change | is_image_novel | omitted | movie_repeat | flashes_since_change | trials_id | stimulus_name | is_sham_change | active |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | initial_gray_screen_5min | -99 | NaN | -99 | 310.569786 | 0.000000 | 310.569786 | 0 | 17985 | False | <NA> | <NA> | -99 | 0 | -99 | spontaneous | False | False |
1 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250210 | 310.569786 | 310.819996 | 17985 | 18000 | False | False | False | -99 | 1 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
2 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250200 | 311.320396 | 311.570596 | 18030 | 18045 | False | False | False | -99 | 2 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
3 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250170 | 312.071016 | 312.321186 | 18075 | 18090 | False | False | False | -99 | 3 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
4 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250190 | 312.821616 | 313.071806 | 18120 | 18135 | False | False | False | -99 | 4 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
5 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250210 | 313.572196 | 313.822406 | 18165 | 18180 | False | False | False | -99 | 5 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
6 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250230 | 314.322816 | 314.573046 | 18210 | 18225 | False | False | False | -99 | 6 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
7 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250210 | 315.073456 | 315.323666 | 18255 | 18270 | False | False | False | -99 | 7 | 2 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
8 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250150 | 315.824126 | 316.074276 | 18300 | 18315 | False | False | False | -99 | 8 | 2 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
9 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.250220 | 316.574676 | 316.824896 | 18345 | 18360 | False | False | False | -99 | 9 | 2 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
This table provides helpful information such as the image name, the start time, duration, and end time of each image presentation, and whether the image was omitted. stimulus_block and stimulus_block_name indicate the type of stimulus the mouse was presented with at a given point in a session. To select active change detection behavior, we first need to filter the table for the change_detection_behavior block (stimulus_block 1 in this session). Note that sessions may have different numbers of stimulus blocks, so change_detection_behavior may be associated with either 0 or 1 in the stimulus_block column.
stimulus_table.stimulus_block_name.unique()
array(['initial_gray_screen_5min', 'change_detection_behavior', 'post_behavior_gray_screen_5min', 'natural_movie_one'], dtype=object)
stimulus_table = stimulus_table[stimulus_table.stimulus_block_name=='change_detection_behavior']
stimulus_table.reset_index(drop=True, inplace=True) # resetting index starts df at stimulus 0
# give index a name
stimulus_table.index.name = 'stimulus_presentations_id'
stimulus_table.head(5)
stimulus_presentations_id | stimulus_block | stimulus_block_name | image_index | image_name | movie_frame_index | duration | start_time | end_time | start_frame | end_frame | is_change | is_image_novel | omitted | movie_repeat | flashes_since_change | trials_id | stimulus_name | is_sham_change | active |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25021 | 310.569786 | 310.819996 | 17985 | 18000 | False | False | False | -99 | 1 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
1 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25020 | 311.320396 | 311.570596 | 18030 | 18045 | False | False | False | -99 | 2 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
2 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25017 | 312.071016 | 312.321186 | 18075 | 18090 | False | False | False | -99 | 3 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
3 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25019 | 312.821616 | 313.071806 | 18120 | 18135 | False | False | False | -99 | 4 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
4 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25021 | 313.572196 | 313.822406 | 18165 | 18180 | False | False | False | -99 | 5 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
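Because the block number of the change detection block can differ between sessions, filtering on the block name is the more robust choice. The AllenSDK warning quoted earlier suggests matching with `str.contains('change_detection')`; the sketch below shows that pattern on a made-up toy table (the rows and image names are illustrative only).

```python
import pandas as pd

# Toy stimulus table; block names follow those printed above, rows are made up.
toy_stimulus_table = pd.DataFrame(
    {
        "stimulus_block": [0, 1, 1, 2, 3],
        "stimulus_block_name": [
            "initial_gray_screen_5min",
            "change_detection_behavior",
            "change_detection_behavior",
            "post_behavior_gray_screen_5min",
            "natural_movie_one",
        ],
        "image_name": [None, "im000", "im106", None, None],
    }
)

# Select by name rather than by block number, which may be 0 or 1 depending on the session.
is_change_detection = toy_stimulus_table["stimulus_block_name"].str.contains("change_detection")
change_detection = toy_stimulus_table[is_change_detection].reset_index(drop=True)
change_detection.index.name = "stimulus_presentations_id"
print(change_detection)
```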
stimulus_templates attribute
Note that the unwarped column contains the image before the application of a spherical warp. All of the pixels labeled 'NaN' will be off-screen (not visible to the mouse) after the warp is applied.
All experiments in a given session will share the same stimulus_templates.
experiment = experiments[ophys_experiment_ids[0]]
experiment.stimulus_templates
image_name | unwarped | warped |
---|---|---|
im000 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[122, 122, 123, 125, 126, 127, 128, 129, 130,... |
im106 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[108, 109, 106, 103, 102, 104, 107, 112, 117,... |
im075 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[120, 121, 121, 121, 122, 123, 123, 122, 121,... |
im073 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[120, 120, 118, 116, 116, 119, 121, 120, 117,... |
im045 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[10, 13, 6, 0, 0, 8, 15, 13, 6, 2, 4, 9, 12, ... |
im054 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[124, 125, 127, 130, 133, 134, 136, 138, 140,... |
im031 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[233, 234, 244, 253, 253, 244, 237, 239, 246,... |
im035 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | [[178, 181, 189, 198, 200, 198, 196, 199, 205,... |
fig, ax = plt.subplots(2, 4, figsize=(20, 8), sharex=True, sharey=True)
for ii, image_name in enumerate(experiment.stimulus_templates.index):
    ax.flatten()[ii].imshow(experiment.stimulus_templates.loc[image_name]['unwarped'], cmap='gray')
    ax.flatten()[ii].set_title(image_name)
fig.tight_layout()
fig.show()
The warped images represent what was actually on the screen during the session:
fig, ax = plt.subplots(2, 4, figsize=(20, 8), sharex=True, sharey=True)
for ii, image_name in enumerate(experiment.stimulus_templates.index):
    ax.flatten()[ii].imshow(experiment.stimulus_templates.loc[image_name]['warped'], cmap='gray')
    ax.flatten()[ii].set_title(image_name)
fig.tight_layout()
fig.show()
An important feature of the task is that stimuli are shown at a very regular cadence (250 ms on, 500 ms off), but stimuli are randomly omitted with a probability of ~5%. These unexpected and random stimulus omissions could be perceived as an expectation violation by the mouse.
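The cadence can be checked against the stimulus table. The sketch below uses the start_time and duration values from the first five change detection rows shown earlier and computes the onset-to-onset interval and the gray-screen gap between images; both should come out close to 0.75 s and 0.5 s respectively.

```python
# start_time and duration values copied from the first five rows of the
# change detection stimulus table shown earlier.
start_times = [310.569786, 311.320396, 312.071016, 312.821616, 313.572196]
durations = [0.25021, 0.25020, 0.25017, 0.25019, 0.25021]

# Onset-to-onset interval between consecutive stimuli.
intervals = [later - earlier for earlier, later in zip(start_times, start_times[1:])]
mean_interval = sum(intervals) / len(intervals)

# Gray-screen gap = interval minus the time the image was on screen.
gaps = [interval - duration for interval, duration in zip(intervals, durations)]
mean_gap = sum(gaps) / len(gaps)

print(f"mean onset-to-onset interval: {mean_interval:.4f} s")
print(f"mean gray-screen gap: {mean_gap:.4f} s")
```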
Omitted stimuli are denoted in the stimulus_table by the omitted column. True means that the stimulus that would have been shown at that time was actually omitted (and was replaced by an extended gray screen between stimuli).
We can look at the first 10 examples of omitted stimuli as follows. Note that each 'omitted' stimulus still has a 'start_time' and an 'end_time' associated with it. These represent the time that a stimulus would have been shown, had it not been omitted.
Stimulus omissions are also indicated in the image_name column by the string omitted.
stimulus_table.query('omitted', engine='python').head(10)
stimulus_presentations_id | stimulus_block | stimulus_block_name | image_index | image_name | movie_frame_index | duration | start_time | end_time | start_frame | end_frame | is_change | is_image_novel | omitted | movie_repeat | flashes_since_change | trials_id | stimulus_name | is_sham_change | active |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
61 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 356.373816 | 356.623816 | 20731 | 20746 | False | <NA> | True | -99 | 2 | 7 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
105 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 389.400806 | 389.650806 | 22711 | 22726 | False | <NA> | True | -99 | 0 | 12 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
113 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 395.405686 | 395.655686 | 23071 | 23086 | False | <NA> | True | -99 | 7 | 13 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
128 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 406.664926 | 406.914926 | 23746 | 23761 | False | <NA> | True | -99 | 21 | 15 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
143 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 417.940786 | 418.190786 | 24422 | 24437 | False | <NA> | True | -99 | 35 | 18 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
172 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 439.708536 | 439.958536 | 25727 | 25742 | False | <NA> | True | -99 | 8 | 21 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
174 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 441.209796 | 441.459796 | 25817 | 25832 | False | <NA> | True | -99 | 9 | 21 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
243 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 493.018766 | 493.268766 | 28923 | 28938 | False | <NA> | True | -99 | 0 | 28 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 501.275466 | 501.525466 | 29418 | 29433 | False | <NA> | True | -99 | 0 | 29 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
267 | 1 | change_detection_behavior | 8 | omitted | -99 | 0.25 | 511.033476 | 511.283476 | 30003 | 30018 | False | <NA> | True | -99 | 12 | 32 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
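As a quick sanity check, the two ways of flagging omissions (the boolean omitted column and the image_name string) can be compared directly. The sketch below uses a small hand-made table rather than the real stimulus_table, so toy_stimulus_table and all of its values are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical miniature stimulus table; values are invented for illustration.
toy_stimulus_table = pd.DataFrame(
    {
        "image_name": ["im000", "omitted", "im000", "im106", "omitted"],
        "omitted": [False, True, False, False, True],
        "start_time": [0.00, 0.75, 1.50, 2.25, 3.00],
    }
)
toy_stimulus_table.index.name = "stimulus_presentations_id"

# Select omissions via the boolean column (the approach used in the notebook) ...
by_flag = toy_stimulus_table.query("omitted", engine="python")
# ... and via the image_name string.
by_name = toy_stimulus_table[toy_stimulus_table["image_name"] == "omitted"]

# Both selections should pick out the same rows.
same_rows = by_flag.index.equals(by_name.index)
print(same_rows)
```

If the two selections ever disagreed on real data, that would be worth investigating before aligning neural activity to omission times.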
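If we want to see how a given cell responds when regularly flashed stimuli are omitted, we can calculate the response around each of the stimulus omissions. The brain_observatory_utilities package has a convenience function to do this, in the module we imported as utilities. We give the function:
data - a dataframe of neural data for a single cell
t and y - the names of the columns that hold the timestamps and the signal of interest
event_times - the times of the events to align to (here, the start times of the omissions)
t_before and t_after - how much time, in seconds, to include before and after each event
output_sampling_rate - the sampling rate, in Hz, onto which each response is interpolated
The function will return a new dataframe with the response for the given cell, aligned to each of the events.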
cell_id = cell_ids[11]
etr = utilities.event_triggered_response(
    data=neural_data.query('cell_specimen_id == @cell_id'),
    t='timestamps',
    y='dff',
    event_times=stimulus_table.query('omitted', engine='python')['start_time'],
    t_before=3,
    t_after=3,
    output_sampling_rate=50,
)
etr
time | dff | event_number | stimulus_presentations_id | event_time | |
---|---|---|---|---|---|
0 | -3.00 | 0.043293 | 0 | 61 | 356.373816 |
1 | -2.98 | 0.036577 | 0 | 61 | 356.373816 |
2 | -2.96 | 0.021164 | 0 | 61 | 356.373816 |
3 | -2.94 | 0.005752 | 0 | 61 | 356.373816 |
4 | -2.92 | -0.009661 | 0 | 61 | 356.373816 |
... | ... | ... | ... | ... | ... |
55680 | 2.92 | 0.001923 | 184 | 4796 | 3911.326506 |
55681 | 2.94 | 0.001321 | 184 | 4796 | 3911.326506 |
55682 | 2.96 | 0.001321 | 184 | 4796 | 3911.326506 |
55683 | 2.98 | 0.001321 | 184 | 4796 | 3911.326506 |
55684 | 3.00 | 0.001321 | 184 | 4796 | 3911.326506 |
55685 rows × 5 columns
We can see that the output has columns for:
time - this is our new timebase relative to the events. In this case, it ranges from -3 to 3 seconds.
dff - this is the deltaF/F value surrounding each event, interpolated onto the new timebase. If, when calling the event_triggered_response function, we had passed y = 'events', this column would be events instead of dff.
event_number - this is an integer representing the count of each event. In this example, there were 185 omissions, so they are numbered from 0 to 184.
stimulus_presentations_id - this is the index of the Series we passed as event_times, which the function preserves in the output.
event_time - this is the time of each event.
The docstring for the event_triggered_response function can be viewed as follows:
help(utilities.event_triggered_response)
Help on function event_triggered_response in module brain_observatory_utilities.utilities.general_utilities:

event_triggered_response(data, t, y, event_times, t_start=None, t_end=None, t_before=None, t_after=None, output_sampling_rate=None, include_endpoint=True, output_format='tidy', interpolate=True)
    Slices a timeseries relative to a given set of event times
    to build an event-triggered response.

    For example, If we have data such as a measurement of neural activity
    over time and specific events in time that we want to align
    the neural activity to, this function will extract segments of the neural
    timeseries in a specified time window around each event.

    The times of the events need not align with the measured
    times of the neural data.
    Relative times will be calculated by linear interpolation.

    Parameters:
    -----------
    data: Pandas.DataFrame
        Input dataframe in tidy format
        Each row should be one observation
        Must contains columns representing `t` and `y` (see below)
    t : string
        Name of column in data to use as time data
    y : string
        Name of column to use as y data
    event_times: Panda.Series, numpy array or list of floats
        Times of events of interest. If pd.Series, the original index and index name will be preserved in the output
        Values in column specified by `y` will be sliced and interpolated
        relative to these times
    t_start : float
        start time relative to each event for desired time window
        e.g.: t_start = -1 would start the window 1 second before each
              t_start = 1 would start the window 1 second after each event
        Note: cannot pass both t_start and t_before
    t_before : float
        time before each of event of interest to include in each slice
        e.g.: t_before = 1 would start the window 1 second before each event
              t_before = -1 would start the window 1 second after each event
        Note: cannot pass both t_start and t_before
    t_end : float
        end time relative to each event for desired time window
        e.g.: t_end = 1 would end the window 1 second after each event
              t_end = -1 would end the window 1 second before each event
        Note: cannot pass both t_end and t_after
    t_after : float
        time after each event of interest to include in each slice
        e.g.: t_after = 1 would start the window 1 second after each event
              t_after = -1 would start the window 1 second before each event
        Note: cannot pass both t_end and t_after
    output_sampling_rate : float
        Desired sampling of output.
        Input data will be interpolated to this sampling rate if interpolate = True (default). # NOQA E501
        If passing interpolate = False, the sampling rate of the input timeseries will # NOQA E501
        be used and output_sampling_rate should not be specified.
    include_endpoint : Boolean
        Passed to np.linspace to calculate relative time
        If True, stop is the last sample. Otherwise, it is not included.
        Default is True
    output_format : string
        'wide' or 'tidy' (default = 'tidy')
        if 'tidy'
            One column representing time
            One column representing event_number
            One column representing event_time
            One row per observation (# rows = len(time) x len(event_times))
        if 'wide', output format will be:
            time as indices
            One row per interpolated timepoint
            One column per event, with column names titled event_{EVENT NUMBER}_t={EVENT TIME}
    interpolate : Boolean
        if True (default), interpolates each response onto a common timebase
        if False, shifts each response to align indices to a common timebase

    Returns:
    --------
    Pandas.DataFrame
        See description in `output_format` section above

    Examples:
    ---------
    An example use case, recover a sinousoid from noise:

    First, define a time vector
    >>> t = np.arange(-10,110,0.001)

    Now build a dataframe with one column for time,
    and another column that is a noise-corrupted sinuosoid with period of 1
    >>> data = pd.DataFrame({
            'time': t,
            'noisy_sinusoid': np.sin(2*np.pi*t) + np.random.randn(len(t))*3
        })

    Now use the event_triggered_response function to get a tidy
    dataframe of the signal around every event

    Events will simply be generated as every 1 second interval
    starting at 0, since our period here is 1
    >>> etr = event_triggered_response(
            data,
            x = 'time',
            y = 'noisy_sinusoid',
            event_times = np.arange(100),
            t_start = -1,
            t_end = 1,
            output_sampling_rate = 100
        )

    Then use seaborn to view the result
    We're able to recover the sinusoid through averaging
    >>> import matplotlib.pyplot as plt
    >>> import seaborn as sns
    >>> fig, ax = plt.subplots()
    >>> sns.lineplot(
            data = etr,
            x='time',
            y='noisy_sinusoid',
            ax=ax
        )
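To make the slicing and interpolation step concrete, here is a minimal NumPy/pandas sketch of the idea behind an event-triggered response. It is not the brain_observatory_utilities implementation: the function name simple_event_triggered_response, the toy signal, and the event times are all assumptions for illustration. It does reproduce the sample count seen above: a window from -3 to 3 s at 50 Hz with the endpoint included has 301 samples, and 185 omissions x 301 samples gives the 55685 rows in the output.

```python
import numpy as np
import pandas as pd


def simple_event_triggered_response(t, y, event_times, t_before, t_after, rate):
    """Interpolate y(t) onto a common relative timebase around each event."""
    # Relative timebase with both endpoints included
    # (e.g. -3 to 3 s at 50 Hz gives 301 samples)
    n_samples = int(round((t_before + t_after) * rate)) + 1
    rel_time = np.linspace(-t_before, t_after, n_samples)
    rows = []
    for event_number, event_time in enumerate(event_times):
        # Linearly interpolate the signal at event_time + rel_time
        values = np.interp(event_time + rel_time, t, y)
        rows.append(
            pd.DataFrame(
                {
                    "time": rel_time,
                    "y": values,
                    "event_number": event_number,
                    "event_time": event_time,
                }
            )
        )
    return pd.concat(rows, ignore_index=True)


# Toy signal sampled every 0.09 s (made up), with a bump 0.5 s after each event
t = np.arange(0, 60, 0.09)
events = np.array([10.0, 20.0, 30.0, 40.0])
y = sum(np.exp(-((t - (e + 0.5)) ** 2) / 0.1) for e in events)

toy_etr = simple_event_triggered_response(
    t, y, events, t_before=3, t_after=3, rate=50
)
# Averaging across events recovers the bump near 0.5 s after the event
peak_time = toy_etr.groupby("time")["y"].mean().idxmax()
print(toy_etr.shape, peak_time)
```

Because every event is resampled onto the same relative timebase, averaging across events is a simple groupby on the time column, even though the original samples do not line up with the event times.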
The output format of the event_triggered_response function is designed to plug directly into Seaborn's lineplot plotting function. We can then view the mean response to omitted stimuli with 95% confidence intervals very easily:
sns.lineplot(
    data=etr,
    x='time',
    y='dff',
    n_boot=500
)
<Axes: xlabel='time', ylabel='dff'>
Note that the regular, image-driven responses with a 750 ms inter-stimulus interval are visible everywhere except at t=0, which is when the unexpectedly omitted stimulus occurred.
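The shaded band in such a plot is a bootstrapped confidence interval of the mean across events, and n_boot sets the number of resamples. The sketch below shows the general percentile-bootstrap idea for a single time point using only NumPy. The data are randomly generated stand-ins, and this is an illustration of the technique rather than Seaborn's exact internal code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up responses of one cell at a single time point across 185 events
responses = rng.normal(loc=0.1, scale=0.5, size=185)

n_boot = 500
boot_means = np.empty(n_boot)
for i in range(n_boot):
    # Resample events with replacement and record the mean of each resample
    resample = rng.choice(responses, size=responses.size, replace=True)
    boot_means[i] = resample.mean()

# 95% interval from the 2.5th and 97.5th percentiles of the bootstrap means
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(responses.mean(), ci_low, ci_high)
```

Repeating this at every time point is what makes the plot slow for long windows or many events, which is one reason to lower n_boot while exploring.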
If we make a wrapper function that combines the process of calculating and plotting the event-triggered response, the whole analysis can be run with a single call. By taking an event_query input variable, the function can plot responses to any event of interest (omissions, changes, hits/misses, specific images, etc.).
def make_event_triggered_plot(df, x, y, event_query, ax, t_before=3, t_after=3):
    # slice and interpolate the signal around each event selected by `event_query`
    etr = utilities.event_triggered_response(
        data=df,
        t='timestamps',
        y=y,
        event_times=stimulus_table.query(event_query, engine='python')['start_time'],
        t_before=t_before,
        t_after=t_after,
        output_sampling_rate=50,
    )
    # plot the mean response with a bootstrapped confidence interval
    sns.lineplot(
        data=etr,
        x=x,
        y=y,
        n_boot=500,
        ax=ax
    )
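To illustrate what an event_query string might look like for events other than omissions, the sketch below runs a few queries against a hand-made table. The table, its values, and the specific query strings are assumptions for illustration; the column names mirror those shown in the stimulus table above, but the real columns may use nullable dtypes that behave differently.

```python
import pandas as pd

# Hypothetical miniature stimulus table; values are invented for illustration.
toy_stimulus_table = pd.DataFrame(
    {
        "image_name": ["im000", "im000", "im106", "omitted", "im106"],
        "is_change": [False, False, True, False, False],
        "omitted": [False, False, False, True, False],
        "start_time": [0.00, 0.75, 1.50, 2.25, 3.00],
    }
)

# Candidate event_query strings
queries = {
    "omissions": "omitted",
    "changes": "is_change",
    "one image": "image_name == 'im106'",
    "non-change, non-omitted": "not is_change and not omitted",
}

# Event times that each query would hand to the event-triggered analysis
selected = {
    label: toy_stimulus_table.query(q, engine="python")["start_time"].tolist()
    for label, q in queries.items()
}
for label, times in selected.items():
    print(label, times)
```

Checking how many events a query returns before plotting is a cheap way to catch a query that matches nothing or nearly everything.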
Now plot the omission-triggered response for the same cell using filtered events (these events are extracted from the deltaF/F timeseries using an event extraction algorithm, then smoothed with a half-Gaussian kernel) instead of dff.
cell_id = cell_ids[11]
fig, ax = plt.subplots()
make_event_triggered_plot(
    df=neural_data.query('cell_specimen_id == @cell_id'),
    x='time',
    y='filtered_events',
    event_query='omitted',
    ax=ax
)
fig.show()
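For intuition about what smoothing with a half-Gaussian kernel does to a sparse event trace, here is a small NumPy sketch. The kernel width and the toy event trace are arbitrary assumptions; the parameters actually used to produce filtered_events in the dataset are not given here, and the helper names are hypothetical.

```python
import numpy as np


def half_gaussian_kernel(sigma_samples, n_sigma=4):
    """Causal (one-sided) Gaussian kernel, normalized to sum to 1."""
    lags = np.arange(0, int(np.ceil(n_sigma * sigma_samples)) + 1)
    kernel = np.exp(-0.5 * (lags / sigma_samples) ** 2)
    return kernel / kernel.sum()


def causal_smooth(events, kernel):
    """Convolve so each output sample depends only on current and past input."""
    return np.convolve(events, kernel, mode="full")[: len(events)]


# Toy event trace: mostly zeros with two discrete events (made-up values)
toy_events = np.zeros(50)
toy_events[10] = 1.0
toy_events[30] = 0.5

kernel = half_gaussian_kernel(sigma_samples=2.0)  # width chosen arbitrarily
toy_filtered = causal_smooth(toy_events, kernel)
print(toy_filtered[8:16].round(3))
```

Because the kernel is one-sided, each event is smeared forward in time only, so the smoothed trace never rises before the event that caused it.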
We can then iterate over 10 randomly chosen cells and plot their activity during omissions.
np.random.seed(0)
fig, ax = plt.subplots()
for cell_id in tqdm(np.random.choice(cell_ids, size=10, replace=False)):
    make_event_triggered_plot(
        df=neural_data.query('cell_specimen_id == @cell_id'),
        x='time',
        y='dff',
        event_query='omitted',
        ax=ax
    )
fig.show()
100%|██████████| 10/10 [00:29<00:00, 2.96s/it]
Interestingly, not all Sst cells in this session do the same thing!
By iterating over experiment IDs, we can also calculate the mean response for each of the 6 imaging planes. Do Sst cells in different visual areas respond to omissions in a distinct way?
We will first use Pandas groupby and mean operations to get the mean timeseries across all cells in each imaging plane:
mean_dff_by_experiment = (
    neural_data
    .groupby(['ophys_experiment_id', 'timestamps'])['dff']
    .mean()
    .reset_index()
)
mean_dff_by_experiment.head()
ophys_experiment_id | timestamps | dff | |
---|---|---|---|
0 | 854759890 | 10.52216 | 0.387612 |
1 | 854759890 | 10.61538 | 0.203569 |
2 | 854759890 | 10.70860 | 0.035257 |
3 | 854759890 | 10.80182 | 0.357586 |
4 | 854759890 | 10.89504 | 0.146397 |
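The effect of this groupby is easiest to see on a tiny example. The sketch below uses a made-up tidy dataframe with two imaging planes and two cells per plane. It assumes, as the grouping above implicitly does, that cells within a plane share identical timestamps.

```python
import pandas as pd

# Hypothetical tidy data: 2 imaging planes, 2 cells each, 2 timestamps (made-up values)
toy_neural_data = pd.DataFrame(
    {
        "ophys_experiment_id": [1, 1, 1, 1, 2, 2, 2, 2],
        "cell_specimen_id": [10, 10, 11, 11, 20, 20, 21, 21],
        "timestamps": [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1],
        "dff": [0.2, 0.4, 0.6, 0.8, 1.0, 1.0, 3.0, 5.0],
    }
)

toy_mean = (
    toy_neural_data
    .groupby(["ophys_experiment_id", "timestamps"])["dff"]
    .mean()  # averages over the cells that share a plane and a timestamp
    .reset_index()
)
print(toy_mean)
```

The result has one row per plane and timestamp, so the cell dimension has been averaged away and each plane is left with a single population trace.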
We can then iterate over our 6 experiment IDs and use our make_event_triggered_plot wrapper function to calculate and plot the omission-triggered response for each imaging plane:
# set up a new figure and axis
fig, ax = plt.subplots()
# make an empty list that we will fill with strings for the legend
legend_text = []
# iterate over every `ophys_experiment_id`
for ophys_experiment_id in tqdm(ophys_experiment_ids):
    make_event_triggered_plot(
        df=mean_dff_by_experiment.query('ophys_experiment_id == @ophys_experiment_id'),
        x='time',
        y='dff',
        event_query='omitted',
        ax=ax
    )
    # get some metadata to add to the legend
    this_exp = neural_data.query('ophys_experiment_id == @ophys_experiment_id')
    structure = this_exp['targeted_structure'].iloc[0]
    depth = this_exp['imaging_depth'].iloc[0]
    # append a string to our list of legend text
    legend_text.append('structure = {}\ndepth = {} um'.format(structure, depth))
# Put the legend out of the figure
plt.legend(legend_text, bbox_to_anchor=(1.05, 1))
fig.show()
100%|██████████| 6/6 [00:17<00:00, 2.99s/it]
There are clearly some large differences in the way that Sst cells respond to these unexpected stimulus omissions by area.
This example could be extended to include cells from the other two Cre lines in the dataset: the Vip-IRES-Cre line, which labels VIP+ inhibitory interneurons, and the Slc17a7-IRES2-Cre line, which is a pan-excitatory line.
session_table['cre_line'].unique()
array(['Sst-IRES-Cre', 'Vip-IRES-Cre', 'Slc17a7-IRES2-Cre'], dtype=object)
In addition, responses to different stimuli could be explored, along with responses relative to other behavioral measures, such as licking.
For a full description of the dataset and all available data streams, see the Visual Behavior Project Description at: https://portal.brain-map.org/explore/circuits/visual-behavior-2p
What if we wanted to use scikit-learn for a decoding or clustering analysis? We'd need to get the data into a standard format for scikit-learn, which is often a feature matrix (X) and a vector of labels (y).
Instead of just omissions, let's now look at the responses to each of the stimuli in this session, which consists of 8 unique images, plus the omitted stimuli (which we characterize as a unique stimulus type). First, we will calculate an event triggered response to each stimulus start time in the stimulus table.
full_etr_l = []
# iterate over each unique cell
for cell_specimen_id in tqdm(neural_data['cell_specimen_id'].unique()):
    # calculate the event triggered response for this cell to every stimulus
    full_etr_this_cell = utilities.event_triggered_response(
        neural_data.query('cell_specimen_id == @cell_specimen_id'),
        t='timestamps',
        y='dff',
        event_times=stimulus_table['start_time'],
        t_before=0,
        t_after=0.75,
        output_sampling_rate=30
    )
    # add a column identifying the cell_specimen_id
    full_etr_this_cell['cell_specimen_id'] = cell_specimen_id
    # append to our list
    full_etr_l.append(full_etr_this_cell)
# concatenate our list of dataframes into a single dataframe
full_etr = pd.concat(full_etr_l)
# cast these numeric columns to int and float, respectively
full_etr['event_number'] = full_etr['event_number'].astype(int)
full_etr['event_time'] = full_etr['event_time'].astype(float)
# no renaming is needed: because `event_times` is a Series, its index name
# (`stimulus_presentations_id`) is preserved as a column in the output
100%|██████████| 53/53 [00:55<00:00, 1.05s/it]
One way to construct a feature matrix might be to build it such that dimensions are trials x cells. Thus, each row is one stimulus presentation (trial) and each column is one cell.
To do so, let's construct another intermediate dataframe called average_responses that contains the average response of each cell (in the 750 ms window we've selected above) to each image presentation. We'll do this using a Pandas groupby to group by cell_specimen_id and stimulus_presentations_id (aka trial).
We're also going to merge in our stimulus metadata.
full_etr['event_number'] = full_etr['event_number'].astype(int)
full_etr['event_time'] = full_etr['event_time'].astype(float)
average_responses = (
    full_etr
    .groupby(['cell_specimen_id', 'stimulus_presentations_id'])[['dff']]
    .mean()
    .reset_index()
    .merge(
        stimulus_table,
        on='stimulus_presentations_id',
        how='left'
    )
)
average_responses
cell_specimen_id | stimulus_presentations_id | dff | stimulus_block | stimulus_block_name | image_index | image_name | movie_frame_index | duration | start_time | end_time | start_frame | end_frame | is_change | is_image_novel | omitted | movie_repeat | flashes_since_change | trials_id | stimulus_name | is_sham_change | active | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1086547630 | 0 | -0.261392 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25021 | 310.569786 | 310.819996 | 17985 | 18000 | False | False | False | -99 | 1 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
1 | 1086547630 | 1 | 0.354413 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25020 | 311.320396 | 311.570596 | 18030 | 18045 | False | False | False | -99 | 2 | 0 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
2 | 1086547630 | 2 | 0.374132 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25017 | 312.071016 | 312.321186 | 18075 | 18090 | False | False | False | -99 | 3 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
3 | 1086547630 | 3 | 0.171823 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25019 | 312.821616 | 313.071806 | 18120 | 18135 | False | False | False | -99 | 4 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
4 | 1086547630 | 4 | -0.065775 | 1 | change_detection_behavior | 0 | im000 | -99 | 0.25021 | 313.572196 | 313.822406 | 18165 | 18180 | False | False | False | -99 | 5 | 1 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
254607 | 1086560061 | 4799 | -0.007224 | 1 | change_detection_behavior | 4 | im045 | -99 | 0.25022 | 3913.578346 | 3913.828566 | 233989 | 234004 | False | False | False | -99 | 1 | 430 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254608 | 1086560061 | 4800 | -0.011908 | 1 | change_detection_behavior | 4 | im045 | -99 | 0.25018 | 3914.328936 | 3914.579116 | 234034 | 234049 | False | False | False | -99 | 2 | 430 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254609 | 1086560061 | 4801 | 0.013366 | 1 | change_detection_behavior | 4 | im045 | -99 | 0.25019 | 3915.079546 | 3915.329736 | 234079 | 234094 | False | False | False | -99 | 3 | 430 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254610 | 1086560061 | 4802 | -0.022811 | 1 | change_detection_behavior | 4 | im045 | -99 | 0.25023 | 3915.830156 | 3916.080386 | 234124 | 234139 | False | False | False | -99 | 4 | 430 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254611 | 1086560061 | 4803 | 0.029136 | 1 | change_detection_behavior | 4 | im045 | -99 | 0.25025 | 3916.580756 | 3916.831006 | 234169 | 234184 | False | False | False | -99 | 5 | 430 | Natural_Images_Lum_Matched_set_ophys_6_2017 | False | True |
254612 rows × 22 columns
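The row count above is consistent with 53 cells x 4804 stimulus presentations = 254612 rows, that is, one averaged value per cell per presentation. The sketch below reproduces the same groupby-then-merge pattern on a made-up miniature dataset; all names prefixed with toy_ and all values are assumptions for illustration, and the metadata index is reset so the merge is a plain column-on-column merge.

```python
import pandas as pd

# Hypothetical event-triggered responses: 2 cells x 2 presentations x 3 time points
toy_full_etr = pd.DataFrame(
    {
        "cell_specimen_id": [1] * 6 + [2] * 6,
        "stimulus_presentations_id": [0, 0, 0, 1, 1, 1] * 2,
        "time": [0.0, 0.375, 0.75] * 4,
        "dff": [0.0, 0.3, 0.6, 1.0, 1.0, 1.0, 0.1, 0.1, 0.1, 0.0, 0.5, 1.0],
    }
)
# Hypothetical stimulus metadata, indexed by stimulus_presentations_id
toy_stimulus_table = pd.DataFrame(
    {"image_name": ["im000", "omitted"]},
    index=pd.Index([0, 1], name="stimulus_presentations_id"),
)

toy_average_responses = (
    toy_full_etr
    .groupby(["cell_specimen_id", "stimulus_presentations_id"])[["dff"]]
    .mean()  # collapse the time axis: one value per cell per presentation
    .reset_index()
    .merge(
        toy_stimulus_table.reset_index(),
        on="stimulus_presentations_id",
        how="left",
    )
)
print(toy_average_responses)
```

Each presentation's metadata is repeated once per cell after the merge, which is what makes the later pivot and label lookup straightforward.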
Now we can construct a dataframe called features_and_labels that will contain one row per trial, one column per cell, plus columns with the image_index and image_name:
features_and_labels = average_responses.pivot(
    index='stimulus_presentations_id',
    columns='cell_specimen_id',
    values='dff'
).merge(
    stimulus_table[['image_index', 'image_name']],
    on='stimulus_presentations_id',
    how='left'
)
features_and_labels.sample(10)
1086547630 | 1086547796 | 1086547993 | 1086548072 | 1086548118 | 1086548658 | 1086548969 | 1086549303 | 1086549491 | 1086549726 | 1086549813 | 1086549949 | 1086550481 | 1086550544 | 1086550990 | 1086551114 | 1086551151 | 1086551209 | 1086551301 | 1086551457 | 1086551540 | 1086551645 | 1086552296 | 1086552709 | 1086553271 | 1086553602 | 1086553836 | 1086553899 | 1086554566 | 1086555222 | 1086555553 | 1086555835 | 1086556317 | 1086556532 | 1086556653 | 1086557083 | 1086557208 | 1086557304 | 1086557434 | 1086557470 | 1086557639 | 1086558071 | 1086558114 | 1086558224 | 1086558510 | 1086558574 | 1086558701 | 1086559064 | 1086559206 | 1086559681 | 1086559885 | 1086559968 | 1086560061 | image_index | image_name | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
stimulus_presentations_id | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
2802 | 0.188744 | -0.008572 | -0.011187 | 0.068998 | 0.004372 | 0.040266 | 0.000542 | 0.107748 | 0.027901 | 0.017652 | -0.008583 | -0.001325 | 0.221544 | -0.003638 | -0.011980 | -0.032854 | 0.025846 | 0.001994 | 0.114461 | 0.055439 | 6.771504 | 0.000572 | 0.028877 | 1.897729 | 0.098143 | 0.578993 | -0.010733 | 0.018567 | -0.029723 | 0.035030 | 0.067603 | -0.011919 | -0.004589 | -0.044386 | 0.134312 | 0.001550 | 0.035612 | 0.003460 | 0.033602 | 0.155450 | 0.029361 | 0.007124 | -0.021153 | 0.013226 | 0.096590 | 0.010745 | 0.081815 | 0.342996 | -0.034027 | 0.040853 | -0.025462 | -0.015748 | 0.045814 | 4 | im045 |
873 | 0.423262 | 0.195502 | 0.131998 | -0.019266 | 0.038702 | 0.034431 | 0.002327 | -0.057758 | 0.037060 | 0.006215 | 0.091679 | 0.025424 | 0.053717 | 0.020017 | 0.047703 | 0.334743 | 0.044308 | 0.033760 | 0.012921 | 0.029380 | 0.918336 | -0.038116 | -0.028756 | -0.025687 | 0.064217 | -0.004619 | 0.690495 | -0.042575 | -0.008669 | -0.019768 | -0.012421 | 0.012025 | 0.069876 | 0.026988 | -0.100992 | -0.002765 | -0.029208 | 0.025067 | -0.074094 | -0.036319 | -0.001228 | 0.008422 | 0.059426 | -0.050069 | 0.047464 | 0.022979 | 0.056264 | -0.016134 | 0.027653 | 0.008519 | -0.076631 | 0.019184 | -0.046165 | 2 | im075 |
1428 | -0.031489 | -0.021622 | -0.084964 | 0.012929 | -0.020635 | 0.020563 | -0.093702 | -0.077242 | 0.363789 | -0.040719 | -0.111365 | -0.009501 | 0.002781 | -0.047153 | -0.061137 | -0.047434 | -0.045629 | -0.031159 | -0.196107 | 0.014383 | -1.046354 | -0.061302 | -0.015688 | -0.139933 | -0.075276 | -0.023855 | 0.049805 | 0.020142 | 0.018077 | 0.044969 | -0.046662 | -0.028479 | -0.005956 | -0.002147 | -0.286239 | -0.103877 | -0.082403 | -0.030807 | -0.009166 | -0.029450 | -0.039558 | -0.045139 | -0.049476 | -0.028388 | -0.364373 | -0.038763 | -0.004181 | -0.181343 | -0.021466 | -0.117293 | -0.006498 | -0.168274 | 0.006674 | 8 | omitted |
502 | 0.352991 | 0.002643 | -0.072307 | 0.006717 | 0.034363 | -0.031389 | 0.041112 | 0.008856 | -0.001766 | -0.004686 | 0.041457 | 0.012696 | 0.012373 | -0.012404 | 0.020623 | 1.669691 | -0.025821 | 0.007340 | 0.009833 | 0.019995 | -1.255134 | -0.025441 | 0.005716 | 0.061934 | 0.010198 | -0.839118 | -0.419708 | -0.010616 | -0.001557 | -0.016340 | 0.021572 | 0.008620 | 0.020470 | 0.023473 | -0.197800 | -0.037609 | 0.014506 | -0.024180 | -0.081081 | 0.015720 | 0.018417 | 0.018045 | 0.053760 | -0.003124 | -0.163397 | -0.008594 | 0.014436 | -0.024876 | 0.062666 | -0.019565 | 0.001800 | -0.030068 | -0.001519 | 1 | im106 |
831 | -0.112112 | 0.027243 | 0.119985 | -0.082399 | 0.026680 | 0.031627 | -0.010097 | -0.059420 | -0.013806 | 0.011196 | 0.076926 | 0.006937 | 0.233008 | -0.005911 | 0.006878 | -0.024285 | 0.017898 | 0.009467 | 0.010462 | 0.024603 | 1.790705 | -0.041758 | 0.014132 | -0.024213 | -0.004381 | 1.972306 | -0.030419 | 0.028825 | 0.004494 | -0.033373 | 0.063855 | -0.022910 | 0.035961 | 0.016078 | -0.786663 | -0.027895 | -0.023252 | -0.041180 | 0.625850 | 0.005769 | -0.042773 | 0.011269 | 0.014166 | 0.012780 | 0.162466 | 0.062172 | 0.115084 | 0.023663 | 0.019582 | -0.012661 | -0.043740 | 0.031887 | -0.042232 | 6 | im031 |
142 | -0.213124 | -0.011856 | -0.066075 | 0.040839 | 0.020031 | 0.030649 | -0.007731 | -0.003524 | 0.034375 | 0.007953 | -0.005359 | 0.023818 | 0.001401 | -0.008858 | 0.015069 | 0.004903 | 0.004793 | -0.007019 | 0.009715 | -0.018836 | -0.919745 | 0.017516 | 0.032806 | 0.038849 | 0.013616 | -1.090188 | -0.153180 | -0.029665 | -0.079135 | 0.017541 | -0.037542 | -0.024838 | 0.079564 | -0.013713 | -0.594666 | 0.114349 | 0.939516 | 0.043889 | 0.026208 | 0.020728 | 0.051941 | -0.002208 | 0.127695 | 0.015951 | 1.721376 | 0.005044 | -0.001263 | -0.073136 | -0.000326 | 0.096125 | 0.064020 | -0.069338 | 0.202951 | 0 | im000 |
872 | -0.365919 | 0.128934 | 0.126414 | -0.016161 | -0.000917 | -0.016184 | 0.180447 | 0.061008 | -0.041748 | 0.003082 | -0.080349 | 0.000304 | -0.065836 | -0.069230 | -0.021769 | 2.336083 | 0.029246 | -0.055579 | 0.035007 | -0.013510 | -0.394122 | 0.018768 | -0.016827 | -0.027958 | -0.016966 | -0.219451 | -0.524029 | 0.079444 | 0.087610 | -0.014453 | 0.032903 | -0.005947 | -0.041918 | 0.038773 | 0.551123 | 0.020741 | 0.022598 | 0.001982 | 0.009395 | 0.040839 | 0.004418 | 0.007271 | 0.021044 | 0.010126 | 0.098769 | 0.012831 | 0.030229 | 0.020644 | 0.052160 | -0.005155 | 0.027164 | -0.020613 | 0.022814 | 1 | im106 |
3561 | -0.265613 | -0.023161 | -0.047186 | 0.007225 | -0.004948 | -0.013870 | -0.023019 | -0.175991 | 0.316958 | -0.002102 | -0.143713 | -0.030722 | -0.016735 | -0.026546 | 0.006684 | 0.017497 | 0.026821 | -0.029001 | -0.027553 | -0.045317 | -1.104999 | 0.170058 | -0.000773 | 0.021314 | -0.013961 | -1.749390 | 0.059442 | 0.018309 | 0.035344 | -0.032749 | -0.068363 | -0.023550 | -0.028478 | 0.007259 | 0.101109 | 0.014096 | -0.050271 | 0.006265 | 0.097412 | -0.115290 | -0.031813 | -0.029665 | -0.019010 | -0.001997 | -0.061160 | 0.027799 | -0.057226 | -0.021368 | -0.002826 | 0.006638 | 0.023953 | 0.065672 | -0.011448 | 7 | im035 |
1894 | 0.044039 | -0.008075 | -0.048363 | 0.012422 | 0.041968 | 0.036341 | -0.012176 | 0.021193 | -0.019818 | -0.000130 | 0.004117 | -0.029039 | -0.025925 | 0.014631 | -0.030741 | -0.043443 | -0.015275 | -0.018793 | 0.007822 | -0.007793 | 0.034497 | 0.017451 | -0.008299 | 0.022712 | 0.026609 | -0.152381 | -0.005640 | 0.001168 | -0.029618 | 0.048389 | 0.031650 | 0.003746 | -0.050091 | -0.000465 | 0.097340 | 0.051821 | 0.305809 | -0.002264 | 0.052070 | 0.028950 | 0.044793 | 0.024830 | 0.078653 | -0.029171 | 0.138318 | -0.028110 | -0.034970 | -0.027078 | 0.003079 | 0.122274 | 0.011370 | 0.035782 | -0.008515 | 0 | im000 |
4616 | 0.565318 | -0.023920 | -0.008095 | 0.089502 | 0.007752 | 0.021257 | 0.016870 | 0.033922 | -0.050130 | 0.012862 | -0.028429 | 0.018301 | 0.476721 | 0.014091 | 0.074191 | -0.020188 | -0.037041 | 0.126370 | 0.347974 | 0.014032 | 3.654675 | -0.057374 | -0.002822 | 1.853332 | 0.246524 | 4.660115 | 0.118221 | 0.019667 | -0.016767 | 0.053885 | 0.154902 | 0.040081 | -0.031475 | -0.034351 | -0.385169 | 0.000795 | 0.004548 | -0.032679 | 0.266178 | -0.108577 | 0.055834 | 0.028215 | 0.025779 | 0.017060 | 0.033552 | 0.256883 | 0.019094 | 0.011179 | 0.021042 | 0.020139 | -0.037456 | -0.012039 | 0.025785 | 4 | im045 |
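The long-to-wide reshaping step can be sketched on a miniature example. Everything below (the toy_ names, cell IDs, and values) is made up for illustration, and the labels are attached with DataFrame.join on the shared index rather than with the merge call used in the notebook, simply to keep the sketch short.

```python
import pandas as pd

# Hypothetical long-format averages: 3 presentations x 2 cells (made-up values)
toy_average_responses = pd.DataFrame(
    {
        "stimulus_presentations_id": [0, 0, 1, 1, 2, 2],
        "cell_specimen_id": [101, 102, 101, 102, 101, 102],
        "dff": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    }
)
toy_labels = pd.DataFrame(
    {"image_name": ["im000", "omitted", "im106"]},
    index=pd.Index([0, 1, 2], name="stimulus_presentations_id"),
)

# Long -> wide: one row per presentation, one column per cell
toy_wide = toy_average_responses.pivot(
    index="stimulus_presentations_id",
    columns="cell_specimen_id",
    values="dff",
)
# Attach the labels by aligning on the shared index
toy_features_and_labels = toy_wide.join(toy_labels)

# Feature matrix (presentations x cells) and label vector
toy_X = toy_features_and_labels[[101, 102]].to_numpy()
toy_y = toy_features_and_labels["image_name"].to_numpy()
print(toy_X.shape, toy_y.shape)
```

Keeping features and labels in one dataframe until the last moment guarantees that row i of the feature matrix and element i of the label vector refer to the same presentation.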
The X matrix can be extracted by getting the columns associated with the cell_specimen_ids
X = features_and_labels[cell_ids]
X.sample(10)
1086550481 | 1086551114 | 1086551301 | 1086557083 | 1086557639 | 1086559064 | 1086558114 | 1086558224 | 1086558510 | 1086559206 | 1086557304 | 1086557208 | 1086560061 | 1086559681 | 1086559885 | 1086559968 | 1086557470 | 1086547796 | 1086547993 | 1086548118 | 1086554566 | 1086556653 | 1086558574 | 1086552296 | 1086558071 | 1086556532 | 1086555222 | 1086558701 | 1086557434 | 1086556317 | 1086555835 | 1086549726 | 1086553836 | 1086551540 | 1086551151 | 1086550544 | 1086552709 | 1086553271 | 1086553602 | 1086555553 | 1086548072 | 1086553899 | 1086547630 | 1086549303 | 1086549491 | 1086549813 | 1086549949 | 1086548658 | 1086548969 | 1086551457 | 1086551645 | 1086550990 | 1086551209 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
stimulus_presentations_id | |||||||||||||||||||||||||||||||||||||||||||||||||||||
4189 | 0.089903 | 0.024906 | -0.046085 | -0.034460 | 0.002261 | -0.020595 | 0.071774 | -0.002502 | -0.020789 | 0.001684 | 0.008464 | -0.019575 | 0.049000 | -0.035157 | -0.003666 | -0.015179 | -0.107282 | -0.024429 | 0.006634 | -0.001652 | 0.056245 | 1.300125 | -0.009489 | 0.008088 | 0.015932 | 0.010329 | 0.056795 | 0.006799 | -0.013974 | -0.007446 | 0.030644 | -0.004067 | 0.062614 | 5.642910 | 0.004174 | 0.001658 | 0.045159 | 0.015444 | 4.794847 | 0.064946 | -0.039260 | 0.055422 | 1.271359 | -0.001104 | 0.002660 | 0.062576 | 0.073505 | 0.001974 | 0.000101 | 0.143271 | 0.059929 | -0.012726 | -0.028438 |
250 | 0.050824 | 0.041704 | -0.001928 | 0.026821 | 0.068230 | 0.004078 | -0.005798 | 0.012790 | 0.056719 | -0.004859 | -0.030977 | -0.032491 | 0.055306 | -0.002700 | 0.011684 | -0.014231 | -0.013565 | 0.157667 | -0.002660 | 0.012759 | 0.037215 | 1.163865 | -0.042082 | 0.000762 | -0.004521 | 0.000295 | 0.025304 | 0.025074 | 0.110253 | -0.003577 | -0.021062 | -0.002031 | -0.042702 | 0.210648 | 0.029303 | 0.024987 | 0.029177 | -0.007469 | -0.085974 | 0.003285 | 0.008513 | 0.019579 | -0.028265 | -0.001836 | 0.024202 | 0.041959 | 0.022740 | 0.008251 | 0.023698 | 0.029016 | -0.022381 | 0.035931 | -0.014534 |
4348 | 0.002091 | 0.021157 | -0.052133 | 0.067661 | 0.021547 | 0.105207 | 0.063571 | -0.012539 | 0.688057 | 0.227734 | -0.016318 | 0.085618 | 0.022404 | -0.013892 | -0.001535 | 0.040991 | 0.048995 | 0.009405 | -0.065042 | 0.011419 | 0.065410 | 8.538500 | -0.046427 | 0.209997 | -0.037722 | 0.039192 | 0.032267 | 0.047392 | 0.020943 | 0.140314 | 0.056047 | -0.014708 | 0.073953 | 3.044143 | 0.071692 | 0.024293 | 0.017528 | 0.029569 | 6.945174 | -0.046595 | -0.050118 | 0.181107 | 0.006423 | 0.076210 | -0.002631 | 0.101224 | -0.053285 | 0.037069 | 0.032636 | 0.002911 | -0.055538 | -0.023025 | 0.022613 |
2791 | 0.040069 | -0.051707 | 0.522409 | 0.005746 | -0.036767 | 0.013612 | 0.007747 | 0.021827 | 0.220459 | 0.013104 | -0.059429 | 0.006422 | 0.017736 | -0.022348 | 0.024529 | 0.017214 | 0.129793 | 0.000732 | 0.026058 | -0.002066 | 0.028572 | -0.207911 | 0.011993 | 0.118774 | -0.012681 | 0.005632 | -0.027632 | 0.022893 | 0.030939 | 0.048363 | -0.000151 | -0.004408 | 0.044600 | -0.108098 | -0.047453 | -0.020662 | 0.081641 | -0.001765 | -0.634543 | -0.003440 | 0.134448 | 0.051929 | -0.045280 | -0.013557 | -0.024167 | -0.054341 | -0.029107 | -0.012474 | 0.043636 | 0.007564 | 0.074074 | 0.055662 | 0.115290 |
2275 | -0.041459 | 0.011008 | -0.055237 | 0.047922 | 0.033297 | 0.022879 | 0.000088 | 0.003876 | -0.152558 | -0.018586 | -0.029864 | -0.005584 | 0.080266 | 0.015952 | 0.052953 | 0.304258 | 0.000763 | 0.036425 | 0.027420 | -0.002664 | 0.001505 | 0.206638 | -0.005536 | 0.014174 | -0.048500 | 0.003640 | 0.054400 | 0.084964 | -0.079203 | -0.002987 | -0.021273 | 0.011225 | 0.081400 | 9.160343 | 0.002530 | -0.004434 | 0.084411 | -0.034244 | 3.382523 | -0.076132 | 0.019194 | 0.007613 | 0.793798 | 0.016355 | -0.021784 | -0.118896 | -0.036156 | -0.029250 | 0.008419 | -0.041697 | 0.029119 | 0.057440 | -0.000388 |
3364 | 0.001590 | 0.029114 | 0.042443 | 0.026074 | 0.015629 | 0.073447 | 0.041900 | -0.013303 | 0.676601 | -0.043063 | -0.067287 | 0.030950 | 0.020750 | 0.131443 | -0.005691 | 0.051818 | 0.044800 | -0.022882 | 0.052791 | 0.004855 | 0.006176 | 0.276550 | -0.011573 | 0.117979 | 0.087262 | -0.063567 | -0.035908 | 0.063256 | 0.187598 | -0.026538 | -0.023886 | 0.026383 | -0.093770 | 3.343860 | 0.135942 | -0.005243 | 0.105701 | 0.042795 | 3.330865 | 0.122585 | 0.279359 | -0.038560 | 0.094401 | 0.058020 | 0.921130 | 0.017249 | 0.074355 | -0.028635 | 0.155254 | 0.064325 | 0.096806 | 0.078105 | -0.023347 |
757 | 0.112625 | 0.010699 | 0.073294 | 0.049768 | -0.023770 | 0.041888 | -0.012735 | 0.001733 | -0.090540 | 0.015469 | -0.038199 | 0.041567 | -0.045520 | 0.043685 | 0.033780 | 0.012523 | 0.109042 | 0.044897 | 0.005211 | -0.001340 | 0.027616 | -0.468245 | 0.015574 | -0.019447 | 0.003821 | 0.023700 | 0.070535 | -0.001122 | -0.166997 | 0.303648 | -0.061507 | -0.000077 | 0.286029 | 0.148413 | 0.046874 | -0.018950 | 0.042764 | 0.091724 | -0.228675 | -0.008245 | 0.064719 | 0.005560 | 0.844696 | 0.060548 | 0.061952 | 0.296216 | 0.006011 | 0.028035 | -0.032624 | -0.001236 | 0.109000 | 0.060241 | 0.108687 |
1231 | 0.123487 | 0.012015 | 0.074686 | -0.057958 | -0.062489 | -0.005719 | -0.018863 | 0.001873 | -0.098702 | -0.080393 | 0.000174 | -0.011449 | 0.003779 | 0.140177 | -0.085526 | -0.002940 | -0.034404 | 0.018371 | 0.066341 | 0.029739 | -0.001750 | -1.079020 | 0.018562 | -0.029877 | 0.012423 | -0.036075 | 0.014829 | -0.081603 | 0.049143 | -0.048742 | 0.000211 | 0.011385 | 0.118326 | 18.285403 | 0.161440 | 0.022849 | 0.023001 | -0.016845 | 3.984924 | 0.105866 | -0.100968 | 0.022498 | -0.065443 | 0.112538 | -0.007311 | -0.010751 | -0.044869 | -0.029348 | 0.209742 | -0.017102 | 0.059553 | 0.094675 | 0.002495 |
2861 | 0.006513 | 0.032145 | 0.041042 | -0.004228 | -0.028449 | -0.035138 | -0.062131 | 0.024852 | -0.018202 | 0.017064 | 0.060212 | 0.023772 | 0.017543 | -0.024719 | -0.037359 | -0.009053 | -0.038685 | -0.019768 | -0.009433 | 0.008406 | -0.001899 | -0.498624 | -0.002557 | 0.045878 | -0.025913 | -0.011032 | 0.016700 | 0.026807 | 0.002828 | -0.028792 | 0.005283 | -0.015849 | 10.992483 | 0.171739 | -0.003859 | -0.023945 | -0.010824 | -0.009832 | -0.313828 | -0.006598 | 0.032851 | 0.043762 | -0.370227 | -0.058354 | 0.001721 | 0.001313 | 0.038282 | 0.030872 | 0.020568 | 0.033622 | -0.072347 | 0.012223 | -0.043466 |
819 | -0.074397 | 4.207817 | -0.056807 | -0.002954 | 0.009688 | -0.023455 | 0.033242 | 0.042787 | 0.125576 | -0.009704 | 0.022180 | 0.032723 | -0.005124 | -0.047685 | -0.042669 | -0.060779 | 0.012423 | 0.011754 | 0.133128 | 0.024596 | 0.114497 | 4.696098 | -0.022374 | 0.007295 | 0.039370 | -0.018408 | -0.010090 | 0.072236 | 1.006231 | -0.007267 | 0.018647 | -0.036074 | 0.042228 | 0.000105 | 0.058423 | -0.030346 | 0.020053 | 0.026313 | 0.569307 | -0.038953 | 0.005391 | 0.014134 | 0.141457 | 0.355020 | 0.025469 | 0.069253 | 0.020161 | 0.001198 | 0.328697 | 0.024548 | -0.010794 | 0.002652 | -0.013257 |
And y is just the image_name column (it could also be the image_index column if you want a numeric value instead of a string to represent the image identity).
y = features_and_labels['image_name']
y.sample(10)
stimulus_presentations_id
3537    im073
2109    im000
34      im031
489     im106
4206    im106
3903    im054
1618    im035
1385    im106
4786    im106
2795    im054
Name: image_name, dtype: object
Now we can use t-SNE, which will project our 53-dimensional feature space (53 neurons in the session) into two dimensions.
X_embedded = TSNE(n_components=2).fit_transform(X.values)
And visualize the results, with colors representing each unique stimulus.
features_and_labels['tsne-2d-one'] = X_embedded[:, 0]
features_and_labels['tsne-2d-two'] = X_embedded[:, 1]
plt.figure(figsize=(16, 10))
ax = sns.scatterplot(
data=features_and_labels,
x="tsne-2d-one",
y="tsne-2d-two",
hue="image_name",
hue_order=np.sort(features_and_labels['image_name'].unique()),
palette=sns.color_palette()[:9],
legend="full",
alpha=0.3
)
This suggests that the time-averaged population responses to at least some of the stimuli fall into distinct clusters in our 53-dimensional space, while the responses to others overlap more. A decoding analysis might therefore be more successful for some stimuli than for others.
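The t-SNE layout depends on its initialization and on the perplexity setting, so the picture above can change from run to run. A minimal sketch, using synthetic data in place of X (the array X_demo and its three-cluster structure are made up for illustration), shows how passing random_state makes the embedding repeatable:

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for X: 90 "stimulus presentations" x 53 "neurons",
# drawn from three shifted Gaussian clusters (illustrative only).
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(loc=shift, scale=1.0, size=(30, 53))
    for shift in (0.0, 3.0, 6.0)
])

# perplexity must be smaller than the number of samples; 30 is the scikit-learn default.
# Fixing random_state makes repeated calls return the same embedding.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_demo_embedded = tsne.fit_transform(X_demo)
print(X_demo_embedded.shape)
```

Trying a few perplexity values is a reasonable check that any clusters you see are not an artifact of one particular setting.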
We can use an SVM decoder from scikit-learn to ask how well we can decode image identity from the feature matrix we have constructed.
Split our data into train and test sets, instantiate the model, then fit.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
model = svm.SVC(probability=True)
model.fit(X_train, y_train)
SVC(probability=True)
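The feature values in X span quite different ranges across neurons, and an SVM with the default RBF kernel can be sensitive to feature scale. One option, which is not part of the analysis above, is to standardize each feature inside a pipeline and compare cross-validated accuracy with and without scaling. The sketch below runs on synthetic data; X_demo, y_demo and the exaggerated scale of the first feature are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data: 4 classes, 50 samples each, 10 features.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 4, 50, 10
y_demo = np.repeat(np.arange(n_classes), n_per_class)
class_means = rng.normal(size=(n_classes, n_features))
X_demo = class_means[y_demo] + rng.normal(size=(len(y_demo), n_features))
X_demo[:, 0] *= 50  # exaggerate the scale of one feature

# The scaler is fit on the training folds only, because it lives inside the pipeline.
scaled_decoder = make_pipeline(StandardScaler(), SVC())
scaled_scores = cross_val_score(scaled_decoder, X_demo, y_demo, cv=5)
unscaled_scores = cross_val_score(SVC(), X_demo, y_demo, cv=5)

print(f"with scaling:    {scaled_scores.mean():.2f}")
print(f"without scaling: {unscaled_scores.mean():.2f}")
```

Whether scaling helps on the real feature matrix is an empirical question; large responses may themselves carry information about image identity.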
Use the model to make predictions on the held-out test set.
y_pred = model.predict(X_test)
Evaluate the accuracy on the test set.
accuracy_score(y_test, y_pred)
0.6216897856242118
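To put an accuracy value in context, it helps to compare it against a chance-level baseline. The sketch below is not part of the original analysis: it uses scikit-learn's DummyClassifier on synthetic data (X_demo and y_demo are made-up stand-ins for X and y) to measure the accuracy of always guessing the most frequent training class.

```python
import numpy as np
from sklearn import svm
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data: 9 classes (like 8 images plus omissions), 53 features.
rng = np.random.default_rng(42)
n_classes, n_per_class, n_features = 9, 40, 53
y_demo = np.repeat(np.arange(n_classes), n_per_class)
# Each class gets its own mean response pattern, plus trial-to-trial noise.
class_means = rng.normal(0, 1, size=(n_classes, n_features))
X_demo = class_means[y_demo] + rng.normal(0, 2, size=(len(y_demo), n_features))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

# Baseline: ignore the features and always predict the most frequent training label.
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
decoder = svm.SVC().fit(X_tr, y_tr)

baseline_accuracy = accuracy_score(y_te, baseline.predict(X_te))
decoder_accuracy = accuracy_score(y_te, decoder.predict(X_te))
print(f"baseline: {baseline_accuracy:.2f}, SVM: {decoder_accuracy:.2f}")
```

The same comparison could be run on X_train, X_test, y_train and y_test from the cells above.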
Evaluate the confusion matrix (rows are the actual labels, columns are the predicted labels).
pd.DataFrame(
confusion_matrix(y_test, y_pred),
columns=['predicted_{}'.format(im) for im in model.classes_],
index=['actual_{}'.format(im) for im in model.classes_]
)
| | predicted_im000 | predicted_im031 | predicted_im035 | predicted_im045 | predicted_im054 | predicted_im073 | predicted_im075 | predicted_im106 | predicted_omitted |
|---|---|---|---|---|---|---|---|---|---|
| actual_im000 | 92 | 4 | 55 | 1 | 19 | 26 | 0 | 0 | 0 |
| actual_im031 | 17 | 86 | 36 | 0 | 7 | 34 | 0 | 0 | 2 |
| actual_im035 | 5 | 19 | 124 | 0 | 5 | 23 | 0 | 2 | 0 |
| actual_im045 | 0 | 18 | 7 | 161 | 0 | 14 | 0 | 0 | 1 |
| actual_im054 | 35 | 7 | 26 | 0 | 94 | 24 | 0 | 0 | 0 |
| actual_im073 | 9 | 14 | 48 | 0 | 4 | 100 | 0 | 1 | 0 |
| actual_im075 | 2 | 9 | 21 | 0 | 2 | 35 | 124 | 1 | 0 |
| actual_im106 | 0 | 5 | 8 | 0 | 1 | 10 | 1 | 194 | 0 |
| actual_omitted | 9 | 11 | 8 | 1 | 6 | 6 | 0 | 1 | 11 |
This tells us that the model can decode some stimuli well (im106 and im045, for example, are classified correctly on roughly 89% and 80% of their test presentations), while it struggles more with others (im000 at roughly 47% and omissions at roughly 21%, for example). Do the stimuli that the decoder succeeds in classifying align with those that cluster cleanly in t-SNE space?
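One way to make "decodes well" concrete is to normalize the confusion matrix: dividing the diagonal by the row totals gives per-class recall (the fraction of presentations of each image that were classified correctly), and dividing by the column totals gives per-class precision. A minimal sketch, with the counts copied from the table above into a NumPy array:

```python
import numpy as np
import pandas as pd

labels = ['im000', 'im031', 'im035', 'im045', 'im054',
          'im073', 'im075', 'im106', 'omitted']

# Rows are actual labels, columns are predicted labels (counts from the table above).
cm = np.array([
    [92,  4,  55,   1, 19,  26,   0,   0,  0],
    [17, 86,  36,   0,  7,  34,   0,   0,  2],
    [ 5, 19, 124,   0,  5,  23,   0,   2,  0],
    [ 0, 18,   7, 161,  0,  14,   0,   0,  1],
    [35,  7,  26,   0, 94,  24,   0,   0,  0],
    [ 9, 14,  48,   0,  4, 100,   0,   1,  0],
    [ 2,  9,  21,   0,  2,  35, 124,   1,  0],
    [ 0,  5,   8,   0,  1,  10,   1, 194,  0],
    [ 9, 11,   8,   1,  6,   6,   0,   1, 11],
])

recall = pd.Series(np.diag(cm) / cm.sum(axis=1), index=labels, name='recall')
precision = pd.Series(np.diag(cm) / cm.sum(axis=0), index=labels, name='precision')
overall_accuracy = np.diag(cm).sum() / cm.sum()

print(pd.concat([recall, precision], axis=1).round(2))
print(f"overall accuracy: {overall_accuracy:.4f}")
```

The overall accuracy computed this way should agree with the accuracy_score value reported earlier. With the fitted model, the same row normalization is available directly through confusion_matrix(y_test, y_pred, normalize='true').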
Can you create event-triggered averages and perform decoding using other events of interest, such as licks or rewards?
# Lick and reward data are available for each experiment
licks = experiments[ophys_experiment_id].licks
licks.head()
| | timestamps | frame |
|---|---|---|
| 0 | 68.90307 | 3499 |
| 1 | 77.14313 | 3993 |
| 2 | 84.09879 | 4410 |
| 3 | 85.31647 | 4483 |
| 4 | 94.64071 | 5042 |
rewards = experiments[ophys_experiment_id].rewards
rewards.head()
| | volume | timestamps | auto_rewarded |
|---|---|---|---|
| 0 | 0.005 | 318.95740 | True |
| 1 | 0.005 | 328.69873 | True |
| 2 | 0.005 | 337.73943 | True |
| 3 | 0.005 | 354.25289 | True |
| 4 | 0.005 | 364.74479 | True |
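As a starting point for this exercise, an event-triggered average can be built by cutting a window out of a trace around each event time and averaging across events. The sketch below runs on a synthetic trace with made-up event times, not on the dataset itself. The function event_triggered_average and its arguments are hypothetical and written for this illustration; it is not a function from brain_observatory_utilities or the AllenSDK.

```python
import numpy as np

def event_triggered_average(trace_t, trace_y, event_times,
                            t_before=1.0, t_after=2.0, step=0.1):
    """Average a trace in a window around each event (hypothetical helper).

    The trace is linearly interpolated onto a common time base relative
    to each event so that windows can be averaged sample by sample.
    Events whose window extends beyond the trace are skipped.
    """
    rel_t = np.arange(-t_before, t_after + step / 2, step)
    windows = [
        np.interp(ev + rel_t, trace_t, trace_y)
        for ev in event_times
        if ev - t_before >= trace_t[0] and ev + t_after <= trace_t[-1]
    ]
    return rel_t, np.mean(windows, axis=0)

# Synthetic trace: a Gaussian bump peaking 0.5 s after each made-up event.
trace_t = np.arange(0, 100, 0.1)
event_times = np.array([20.0, 45.0, 70.0])
trace_y = np.zeros_like(trace_t)
for ev in event_times:
    trace_y += np.exp(-((trace_t - (ev + 0.5)) ** 2) / (2 * 0.2 ** 2))

rel_t, eta = event_triggered_average(trace_t, trace_y, event_times)
peak_time = rel_t[np.argmax(eta)]
print(f"peak at {peak_time:.1f} s relative to the event")
```

With the real data, event_times could be licks['timestamps'].values or rewards['timestamps'].values, and the trace could be one cell's activity together with its timestamps.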
To see the full list of attributes available for each experiment via the AllenSDK, uncomment the line in the cell below and run it.
# help(experiments[ophys_experiment_id])