Now we have an idea of three important components to analyzing neuroimaging data:
In this notebook the goal is to integrate these 3 basic components and perform a full analysis of group data using Intranetwork Functional Connectivity (FC).
Intranetwork functional connectivity is essentially a result of performing correlational analysis on mean signals extracted from two ROIs. Using this method we can examine how well certain resting state networks, such as the Default Mode Network (DMN), are synchronized across spatially distinct regions.
ROI-based correlational analysis forms the basis of many more sophisticated kinds of functional imaging analysis.
Nilearn has a built in function for extracting timeseries from functional files and doing all the extra signal processing at the same time. Let's walk through how this is done
First we'll grab our imports as usual
from nilearn import image as nimg
from nilearn import plotting as nplot
import numpy as np
import pandas as pd
from bids import BIDSLayout
Let's grab the data that we want to perform our connectivity analysis on using PyBIDS:
#Use PyBIDS to parse BIDS data structure
fmriprep_dir = "../data/ds000030/derivatives/fmriprep/"
layout = BIDSLayout(fmriprep_dir,
config=['bids','derivatives'])
#Get resting state data (preprocessed, mask, and confounds file)
func_files = layout.get(datatype='func', task='rest',
desc='preproc',
space='MNI152NLin2009cAsym',
extension='nii.gz',
return_type='file')
mask_files = layout.get(datatype='func', task='rest',
desc='brain',
suffix="mask",
space='MNI152NLin2009cAsym',
extension='nii.gz',
return_type='file')
confound_files = layout.get(datatype='func',
task='rest',
desc='confounds',
extension='tsv',
return_type='file')
Now that we have a list of subjects to peform our analysis on, let's load up our parcellation template file
#Load separated parcellation
parcel_file = '../resources/rois/yeo_2011/Yeo_JNeurophysiol11_MNI152/relabeled_yeo_atlas.nii.gz'
yeo_7 = nimg.load_img(parcel_file)
Now we'll import a package from nilearn
, called input_data
which allows us to pull data using the parcellation file, and at the same time applying data cleaning!
We first create an object using the parcellation file yeo_7
and our cleaning settings which are the following:
Settings to use:
The object masker
is now able to be used on any functional image of the same size. The input_data.NiftiLabelsMasker
object is a wrapper that applies parcellation, cleaning and averaging to an functional image. For example let's apply this to our first subject:
# Pull the first subject's data
func_file = func_files[0]
mask_file = mask_files[0]
confound_file = confound_files[0]
Before we go ahead and start using the masker
that we've created, we have to do some preparatory steps. The following should be done prior to use the masker
object:
To help us with the first part, let's define a function to help extract our confound regressors from the .tsv file for us. Note that we've handled pulling the appropriate {confounds}_derivative1
columns for you! You just need to supply the base regressors!
#Refer to part_06 for code + explanation
def extract_confounds(confound_tsv,confounds,dt=True):
'''
Arguments:
confound_tsv Full path to confounds.tsv
confounds A list of confounder variables to extract
dt Compute temporal derivatives [default = True]
Outputs:
confound_mat
'''
if dt:
dt_names = ['{}_derivative1'.format(c) for c in confounds]
confounds = confounds + dt_names
#Load in data using Pandas then extract relevant columns
confound_df = pd.read_csv(confound_tsv,delimiter='\t')
confound_df = confound_df[confounds]
#Convert into a matrix of values (timepoints)x(variable)
confound_mat = confound_df.values
#Return confound matrix
return confound_mat
Finally we'll set up our image file for confound regression (as we did in Episode 6). To do this we'll drop 4 TRs from both our functional image and our confounds file. Note that our masker
object will not do this for us!
#Load functional image
#Remove the first 4 TRs
#Use the above function to pull out a confound matrix
#Drop the first 4 rows of the confounds matrix
Finally with everything set up, we can now use the masker to perform our:
All in one step!
#Apply cleaning, parcellation and extraction to functional data
Just to be clear, this data is automatically parcellated for you, and, in addition, is cleaned using the confounds you've specified!
The result of running masker.fit_transform
is a matrix that has:
But wait!
We originally had 50 ROIs, what happened to 7 of them? It turns out that masker
drops ROIs that are empty (i.e contain no brain voxels inside of them), this means that 7 of our atlas' parcels did not correspond to any region with signal! To see which ROIs are kept after computing a parcellation you can look at the labels_
property of masker
:
print(masker.labels_)
print("Number of labels", len(masker.labels_))
This means that our ROIs of interest (44 and 46) cannot be accessed using the 44th and 46th columns directly!
There are many strategies to deal with this weirdness. What we're going to do is to create a new array that fills in the regions that were removed with 0
values. It might seem a bit weird now, but it'll simplify things when we start working with multiple subjects!
First we'll identify all ROIs from the original atlas. We're going to use the numpy
package which will provide us with functions to work with our image arrays:
import numpy as np
# Get the label numbers from the atlas
# Get number of labels that we have
Now we're going to create an array that contains:
# Remember fMRI images are of size (x,y,z,t)
# where t is the number of timepoints
# Create an array of zeros that has the correct size
# Get regions that are kept
# Fill columns matching labels with signal values
print(final_signal.shape)
It's a bit of work, but now we have an array where:
masker.fit_transform
is filled with 0
values!To get the columns corresponding to the regions that we've kept, we can simply use the regions_kept
variable to select columns corresponding to the regions that weren't removed:
This is identical to the original output of masker.fit_transform
This might seem unnecessary for now, but as you'll see in a bit, it'll come in handy when we deal with multiple subjects!
In fMRI imaging, connectivity typically refers to the correlation of the timeseries of 2 ROIs. Therefore we can calculate a full connectivity matrix by computing the correlation between all pairs of ROIs in our parcellation scheme!
We'll use another nilearn tool called ConnectivityMeasure
from nilearn.connectome
. This tool will perform the full set of pairwise correlations for us
Like the masker, we need to make an object that will calculate connectivity for us.
Try using SHIFT-TAB
to see what options you can put into the kind
argument of ConnectivityMeasure
Then we use correlation_measure.fit_transform()
in order to calculate the full correlation matrix for our parcellated data!
Note that we're using a list [final_signal]
, this is because correlation_measure
works on a list of subjects. We'll take advantage of this later!
The result is a matrix which has:
You can read this correlation matrix as follows:
Suppose we wanted to know the correlation between ROI 30 and ROI 40
This is because the correlation of $A \to B = B \to A$
NOTE
Remember we were supposed to lose 7 regions from the masker.fit_transform
step. The correlations for these regions will be 0!
Let's try pulling the correlation for ROI 44 and 46!
Note that it'll be the same if we swap the rows and columns!
Apply the data extract process shown above to all subjects in our subject list and collect the results. Your job is to fill in the blanks!
# First we're going to create some empty lists to store all our data in!
ctrl_subjects = []
schz_subjects = []
# We're going to keep track of each of our subjects labels here
# pulled from masker.labels_
labels_list = []
# Get the number of unique labels in our parcellation
# We'll use this to figure out how many columns to make (as we did earlier)
atlas_labels = np.unique(yeo_7.get_fdata().astype(int))
NUM_LABELS = len(atlas_labels)
# Set the list of confound variables we'll be using
confound_variables = ['trans_x','trans_y','trans_z',
'rot_x','rot_y','rot_z',
'global_signal',
'white_matter','csf']
# Number of TRs we should drop
TR_DROP=4
# Lets get all the subjects we have
subjects = layout.get_subjects()
for sub in subjects:
#Get the functional file for the subject (MNI space)
func_file = layout.get(subject=??,
datatype='??', task='rest',
desc='??',
space='??'
extension="nii.gz",
return_type='file')[0]
#Get the confounds file for the subject (MNI space)
confound_file=layout.get(subject=??, datatype='??',
task='rest',
desc='??',
extension='tsv',
return_type='file')[0]
#Load the functional file in
func_img = nimg.load_img(??)
#Drop the first 4 TRs
func_img = func_img.slicer[??,??,??,??]
#Extract the confound variables using the function
confounds = extract_confounds(confound_file,
confound_variables)
#Drop the first 4 rows from the confound matrix
confounds = confounds[??]
# Make our array of zeros to fill out
# Number of rows should match number of timepoints
# Number of columns should match the total number of regions
fill_array = np.zeros((func_img.shape[??], ??))
#Apply the parcellation + cleaning to our data
#What function of masker is used to clean and average data?
time_series = masker.fit_transform(??,??)
# Get the regions that were kept for this scan
regions_kept = np.array(masker.labels_)
# Fill the array, this is what we'll use
# to make sure that all our array are of the same size
fill_array[:, ??] = time_series
#If the subject ID starts with a "1" then they are control
if sub.startswith('1'):
ctrl_subjects.append(fill_array)
#If the subject ID starts with a "5" then they are case (case of schizophrenia)
if sub.startswith('5'):
schz_subjects.append(fill_array)
labels_list.append(masker.labels_)
The result of all of this code is that:
ctrl_subjects
listschz_subjects
listWhat's actually being placed into the list? The cleaned, parcellated time series data for each subject (the output of masker.fit_transform
)!
A helpful trick is that we can re-use the correlation_measure
object we made earlier and apply it to a list of subject data!
At this point, we have correlation matrices for each subject across two populations. The final step is to examine the differences between these groups in their correlation between ROI 44 and ROI 46.
An important step in any analysis is visualizing the data that we have. We've cleaned data, averaged data and calculated correlations but we don't actually know what it looks like! Visualizing data is important to ensure that we don't throw pure nonsense into our final statistical analysis
To visualize data we'll be using a python package called seaborn
which will allow us to create statistical visualizations with not much effort.
import seaborn as sns
import matplotlib.pyplot as plt
We can view a single subject's correlation matrix by using seaborn
's heatmap
function:
Recall that cleaning and parcellating the data causes some ROIs to get dropped. We dealt with this by filling an array of zeros (fill_array
) only for columns where the regions are kept (regions_kept
). This means that we'll have some correlation values that are 0!
This is more apparent if we plot the data slightly differently. For demonstrative purposes we've:
sns.heatmap(np.abs(ctrl_correlation_matrices[0]), cmap='viridis')
The dark lines in the correlation matrix correspond to regions that were dropped and therefore have 0 correlation!
We can now pull our ROI 44 and 46 by indexing our list of correlation matrices as if it were a 3D array (kind of like an MR volume). Take a look at the shape:
This is of form:
ctrl_correlation_matrices[subject_index, row_index, column_index]
Now we're going to pull out just the correlation values between ROI 44 and 46 across all our subjects. This can be done using standard array indexing:
Next we're going to arrange this data into a table. We'll create two tables (called dataframes in the python package we're using, pandas
)
#Create control dataframe
# Create the schizophrenia dataframe
The result is:
ctrl_df
a table containing the correlation value for each control subject, with an additional column with the group label, which is 'control'
scz_df
a table containing the correlation value for each schizophrenia group subject, with an additional column with the group label, which is 'schizophrenia'
For visualization we're going to stack the two tables together...
#Stack the two dataframes together
# Show some random samples from dataframe
Finally we're going to visualize the results using the python package seaborn
!
#Visualize results
# Create a figure canvas of equal width and height
# Create a box plot, with the x-axis as group
#the y-axis as the correlation value
# Create a "swarmplot" as well, you'll see what this is..
# Set the title and labels of the figure
Although the results here aren't significant they seem to indicate that there might be three subclasses in our schizophrenia group - of course we'd need a lot more data to confirm this! The interpretation of these results should ideally be based on some a priori hypothesis!
Hopefully now you understand that:
nilearn
and nibabel
nilearn