The goal of this presentation is to familiarize you with some broad classes of neuroimaging data that can be interacted with in Python, using the NiBabel library.
This document is intended to be viewed as a RISE presentation. It works fine as a notebook, but blocks with images may look strange because they are formatted to work as slides.
NiBabel is a low-level Python library that gives access to a variety of imaging formats, with a particular focus on providing a common interface to the various volumetric formats produced by scanners and used in common neuroimaging toolkits:
NIfTI-1 | NIfTI-2 | MGH |
MINC 1.0 | MINC 2.0 | AFNI BRIK/HEAD |
ANALYZE | SPM99 ANALYZE | SPM2 ANALYZE |
DICOM | PAR/REC | ECAT |
It also supports surface file formats:
GIFTI | FreeSurfer (FS) geometry |
FS labels | FS annotations |
Tractography files:
TrackVis (TRK) | MRtrix (TCK) |
As well as the CIFTI-2 format for composite volume/surface data.
NiBabel is available on PyPI:
pip install nibabel
And conda-forge:
conda install -c conda-forge nibabel
Note: This notebook assumes NiBabel 3+, which requires a minimum Python version of 3.5.
import nibabel as nb
print(nb.__version__)
3.1.1
# Some additional, useful imports
from pathlib import Path # Combine path elements with /
from pprint import pprint # Pretty-printing
import numpy as np # Numeric Python
from matplotlib import pyplot as plt # Matlab-ish plotting commands
from nilearn import plotting as nlp # Nice neuroimage plotting
import transforms3d # Work with affine algebra
from scipy import ndimage as ndi # Operate on N-dimensional images
import nibabel.testing # For fetching test data
%pylab inline
Populating the interactive namespace from numpy and matplotlib
# Assume we're on the NeuroHackademy hub.
data_dir = Path('/home/jovyan/data')
Let's load a couple of images so we can explore the SpatialImage API and identify its components:

t1w = nb.load(data_dir / 'openneuro/ds000114/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz')
bold = nb.load(data_dir / 'openneuro/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz')
print(t1w)
<class 'nibabel.nifti1.Nifti1Image'>
data shape (256, 156, 256)
affine:
[[ 9.99131918e-01 -5.16291820e-02  1.25127016e-02 -1.25263863e+02]
 [ 4.07721959e-02  1.29202044e+00 -9.81179178e-02 -7.31330109e+01]
 [-8.54506902e-03  1.28044292e-01  9.95096147e-01 -1.77554291e+02]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]
metadata:
<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr : 348
data_type : b''
db_name : b''
extents : 0
session_error : 0
regular : b'r'
dim_info : 0
dim : [  3 256 156 256   1   1   1   1]
intent_p1 : 0.0
intent_p2 : 0.0
intent_p3 : 0.0
intent_code : none
datatype : float32
bitpix : 32
slice_start : 0
pixdim : [1. 1. 1.2993759 1. 0.009668 0. 0. 0. ]
vox_offset : 0.0
scl_slope : nan
scl_inter : nan
slice_end : 0
slice_code : unknown
xyzt_units : 10
cal_max : 0.0
cal_min : 0.0
slice_duration : 0.0
toffset : 0.0
glmax : 0
glmin : 0
descrip : b'FSL5.0'
aux_file : b''
qform_code : scanner
sform_code : scanner
quatern_b : 0.049235623
quatern_c : 0.005271982
quatern_d : 0.020155331
qoffset_x : -125.26386
qoffset_y : -73.13301
qoffset_z : -177.55429
srow_x : [ 9.9913192e-01 -5.1629182e-02  1.2512702e-02 -1.2526386e+02]
srow_y : [ 4.0772196e-02  1.2920204e+00 -9.8117918e-02 -7.3133011e+01]
srow_z : [-8.5450690e-03  1.2804429e-01  9.9509615e-01 -1.7755429e+02]
intent_name : b''
magic : b'n+1'
All NiBabel images have a .to_filename() method:
t1w.to_filename('/tmp/img.nii.gz')
nibabel.save will attempt to convert to a reasonable image type if the extension doesn't match:
nb.save(t1w, '/tmp/img.mgz')
Some image types separate header and data into two separate files. Saving to either filename will generate both.
nb.save(t1w, '/tmp/img.img')
print(nb.load('/tmp/img.hdr'))
<class 'nibabel.nifti1.Nifti1Pair'>
data shape (256, 156, 256)
affine:
[[ 9.99131918e-01 -5.16291820e-02  1.25127016e-02 -1.25263863e+02]
 [ 4.07721959e-02  1.29202044e+00 -9.81179178e-02 -7.31330109e+01]
 [-8.54506902e-03  1.28044292e-01  9.95096147e-01 -1.77554291e+02]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]
metadata:
<class 'nibabel.nifti1.Nifti1PairHeader'> object, endian='<'
sizeof_hdr : 348
data_type : b''
db_name : b''
extents : 0
session_error : 0
regular : b'r'
dim_info : 0
dim : [  3 256 156 256   1   1   1   1]
intent_p1 : 0.0
intent_p2 : 0.0
intent_p3 : 0.0
intent_code : none
datatype : float32
bitpix : 32
slice_start : 0
pixdim : [1. 1. 1.2993759 1. 1. 1. 1. 1. ]
vox_offset : 0.0
scl_slope : nan
scl_inter : nan
slice_end : 0
slice_code : unknown
xyzt_units : 10
cal_max : 0.0
cal_min : 0.0
slice_duration : 0.0
toffset : 0.0
glmax : 0
glmin : 0
descrip : b'FSL5.0'
aux_file : b''
qform_code : scanner
sform_code : scanner
quatern_b : 0.049235623
quatern_c : 0.005271982
quatern_d : 0.020155331
qoffset_x : -125.26386
qoffset_y : -73.13301
qoffset_z : -177.55429
srow_x : [ 9.9913192e-01 -5.1629182e-02  1.2512702e-02 -1.2526386e+02]
srow_y : [ 4.0772196e-02  1.2920204e+00 -9.8117918e-02 -7.3133011e+01]
srow_z : [-8.5450690e-03  1.2804429e-01  9.9509615e-01 -1.7755429e+02]
intent_name : b''
magic : b'ni1'
Some NiBabel images can be serialized to and deserialized from byte strings, for performing stream-based I/O.
bstr = t1w.to_bytes()
print(bstr[:100])
b'\\\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00r\x00\x03\x00\x00\x01\x9c\x00\x00\x01\x01\x00\x01\x00\x01\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00 \x00\x00\x00\x00\x00\x80?\x00\x00\x80?\xf3Q\xa6?\x00\x00\x80?\x88f\x1e<\x00\x00\x00\x00'
new_t1w = nb.Nifti1Image.from_bytes(bstr)
Images that save to single files can generally be serialized. NIfTI-1/2, GIFTI and MGH are currently supported.
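As a quick sanity check (a sketch reusing t1w and new_t1w from the cells above), the round trip through bytes preserves the data:

# The deserialized image should match the original voxel-for-voxel
print(np.array_equal(new_t1w.get_fdata(), t1w.get_fdata()))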
For MRI studies, neuroimaging data is typically acquired as one or more volumes, a regular grid of values. NiBabel represents these data as spatial images, a structure containing three things:

- The data array: the image values themselves, in an N-dimensional array
- The affine: a matrix relating the array indices to a spatial reference frame
- The header: metadata describing the image, such as acquisition parameters and the on-disk data type
Many file types encode this basic structure. NiBabel will read any of ANALYZE (plain, SPM99, SPM2 and later), NIfTI-1, NIfTI-2, MINC 1.0, MINC 2.0, AFNI BRIK/HEAD, MGH, ECAT, DICOM and Philips PAR/REC, and expose a simple interface:
data = t1w.get_fdata()
affine = t1w.affine
header = t1w.header
Spatial images have some properties that should be familiar from NumPy arrays:
print(t1w.ndim)
print(t1w.shape)
3
(256, 156, 256)
The data is a simple NumPy array. It can be accessed, sliced and generally manipulated as you would any array:
print(data.shape)
i, j, k = np.array(data.shape) // 2
fig, axes = plt.subplots(1, 3)
axes[0].imshow(data[i,:,:].T, cmap='Greys_r', origin='lower')
axes[1].imshow(data[:,j,:].T, cmap='Greys_r', origin='lower')
_ = axes[2].imshow(data[:,:,k].T, cmap='Greys_r', origin='lower')
(256, 156, 256)
Each location in the image data array is a voxel (pixel with a volume), and can be referred to with voxel coordinates (array indices).
This is a natural way to describe a block of data, but is practically meaningless with regard to anatomy.
NiBabel has a basic viewer that scales voxels to reflect their size and labels orientations.
_ = t1w.orthoview() # Requires matplotlib, occasionally glitchy in OSX setups
The crosshair is focused at the origin $(0, 0, 0)$.
All of this information is encoded in the affine:
print(affine)
[[ 9.99131918e-01 -5.16291820e-02  1.25127016e-02 -1.25263863e+02]
 [ 4.07721959e-02  1.29202044e+00 -9.81179178e-02 -7.31330109e+01]
 [-8.54506902e-03  1.28044292e-01  9.95096147e-01 -1.77554291e+02]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]
The affine is a 4 x 4 numpy array. This describes the transformation from the voxel space (indices $(i, j, k)$) to a reference space (coordinates $(x, y, z)$). These coordinates are, by convention, distance in mm right, anterior and superior of an origin*.
$$ \begin{bmatrix} x\\ y\\ z\\ 1\\ \end{bmatrix} = \mathbf A \begin{bmatrix} i\\ j\\ k\\ 1\\ \end{bmatrix} = \begin{bmatrix} m_{1,1} & m_{1,2} & m_{1,3} & a \\ m_{2,1} & m_{2,2} & m_{2,3} & b \\ m_{3,1} & m_{3,2} & m_{3,3} & c \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} i\\ j\\ k\\ 1\\ \end{bmatrix} $$

For an unmodified image, this reference space typically refers to an origin in the isocenter of the imaging magnet, and the directions right, anterior and superior are defined assuming a subject is lying in the scanner, face up.
* If a file format uses an alternative convention, NiBabel converts on read/write, so affines are always RAS+.
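To make the mapping concrete, here is a quick check (reusing t1w and affine from above; apply_affine is NiBabel's helper for applying affines to points) that sends the central voxel to world coordinates both by hand and with the helper:

from nibabel.affines import apply_affine  # Maps (i, j, k) points through an affine
ijk = np.array(t1w.shape) // 2            # A voxel index near the middle of the grid
print(affine @ np.append(ijk, 1))         # Homogeneous coordinates, as in the equation above
print(apply_affine(affine, ijk))          # Same result, without the bookkeeping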
An affine transformation can be decomposed into translation, rotation, scaling (zooms) and shear transformations, applied right-to-left.
T, R, Z, S = transforms3d.affines.decompose44(affine) # External library
print(f"Translation: {T}\nRotation:\n{R}\nZooms: {Z}\nMax shear: {np.abs(S).max()}")
Translation: [-125.26386261  -73.13301086 -177.55429077]
Rotation:
[[ 0.99913193 -0.03973383  0.01251451]
 [ 0.0407722   0.99433924 -0.09811785]
 [-0.00854507  0.09854292  0.99509611]]
Zooms: [0.99999999 1.2993759  1.00000002]
Max shear: 1.8087155968217092e-06
Tm, Rm, Zm, Sm = [np.eye(4) for _ in range(4)]
Tm[:3, 3] = T
Rm[:3, :3] = R
Zm[:3, :3] = np.diag(Z)
Sm[[0, 0, 1], [1, 2, 2]] = S
np.allclose(Tm @ Rm @ Zm @ Sm, affine)
True
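transforms3d can also recompose the factors directly, which makes for a handy cross-check (compose is the inverse of decompose44):

np.allclose(transforms3d.affines.compose(T, R, Z, S), affine)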
NiBabel provides functions for extracting information from affines:
print("Orientation:", nb.aff2axcodes(affine))
print("Zooms:", nb.affines.voxel_sizes(affine))
print("Obliquity:", nb.affines.obliquity(affine))
Orientation: ('R', 'A', 'S')
Zooms: [0.99999999 1.2993759  1.00000002]
Obliquity: [0.04167007 0.10645293 0.09907456]
You can also use the affine to answer specific questions. For instance, the inverse affine allows you to calculate indices from RAS coordinates and look up the image intensity value at that location.
i, j, k, _ = np.linalg.pinv(affine) @ [0, 0, 0, 1]
print(f"Center: ({int(i)}, {int(j)}, {int(k)})")
print(f"Value: ", data[int(i), int(j), int(k)])
Center: (126, 65, 171)
Value: 312.0
If you didn't follow all of the above, that's okay. Here are the important points:

- Voxel coordinates (array indices) describe locations in the data block; the affine maps them to world coordinates, which in NiBabel are always RAS+ millimeters
- The affine can be decomposed into translation, rotation, zooms and shears, and NiBabel can report orientation, voxel sizes and obliquity from it

You'll go over this again with Noah Benson in Introduction to the Geometry and Structure of the Human Brain.

Matthew Brett's Coordinate systems and affines tutorial is an excellent resource.
The image header is specific to each file format. Minimally, it contains information to construct the data and affine arrays, but some methods are useful beyond that.
- The get_zooms() method returns voxel sizes. If the data is 4D, then repetition time is included as a temporal zoom.
- The get_data_dtype() method returns the numpy data type in which the image data is stored (or will be stored when the image is saved).

print(bold.header.get_zooms())
print(bold.header.get_data_dtype())
(4.0, 4.0, 3.999975, 2.5)
int16
For images with fixed header structures, header fields are exposed through a dict-like interface.
print(header)
<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr : 348
data_type : b''
db_name : b''
extents : 0
session_error : 0
regular : b'r'
dim_info : 0
dim : [  3 256 156 256   1   1   1   1]
intent_p1 : 0.0
intent_p2 : 0.0
intent_p3 : 0.0
intent_code : none
datatype : float32
bitpix : 32
slice_start : 0
pixdim : [1. 1. 1.2993759 1. 0.009668 0. 0. 0. ]
vox_offset : 0.0
scl_slope : nan
scl_inter : nan
slice_end : 0
slice_code : unknown
xyzt_units : 10
cal_max : 0.0
cal_min : 0.0
slice_duration : 0.0
toffset : 0.0
glmax : 0
glmin : 0
descrip : b'FSL5.0'
aux_file : b''
qform_code : scanner
sform_code : scanner
quatern_b : 0.049235623
quatern_c : 0.005271982
quatern_d : 0.020155331
qoffset_x : -125.26386
qoffset_y : -73.13301
qoffset_z : -177.55429
srow_x : [ 9.9913192e-01 -5.1629182e-02  1.2512702e-02 -1.2526386e+02]
srow_y : [ 4.0772196e-02  1.2920204e+00 -9.8117918e-02 -7.3133011e+01]
srow_z : [-8.5450690e-03  1.2804429e-01  9.9509615e-01 -1.7755429e+02]
intent_name : b''
magic : b'n+1'
print(header['descrip'])
b'FSL5.0'
The MGH header is similarly accessible, but its structure is quite different:
mghheader = nb.load('/tmp/img.mgz').header
print(mghheader)
<class 'nibabel.freesurfer.mghformat.MGHHeader'> object, endian='>'
version : 1
dims : [256 156 256   1]
type : 3
dof : 0
goodRASFlag : 1
delta : [1. 1.2993759 1. ]
Mdc : [[ 0.9991319   0.0407722  -0.00854507]
 [-0.03973383  0.9943392   0.09854292]
 [ 0.0125127  -0.09811792  0.99509615]]
Pxyz_c : [  0.1995725  20.30433   -41.2883   ]
tr : 0.0
flip_angle : 0.0
te : 0.0
ti : 0.0
fov : 0.0
print(mghheader['Pxyz_c'])
print(mghheader.get_zooms())
print(mghheader.get_data_dtype())
[  0.1995725  20.30433   -41.2883   ]
(1.0, 1.2993759, 1.0)
>f4
Other formats use text key-value pairs (PAR/REC, BRIK/HEAD) or keys in NetCDF (MINC 1.0) or HDF5 (MINC 2.0) containers, and their values are accessible by other means.
Often, we're not particularly interested in the header, or even the affine. But it's important to know they're there and, especially, to remember to copy them when making new images, so that derivatives stay aligned with the original image.
Reading data will only take you so far. In many cases, you will want to use the image to compute a new image in the same space. Such a function might take the form:
def my_function(image):
orig_data = image.get_fdata()
new_data = some_operation_on(orig_data)
return image.__class__(new_data, affine=image.affine, header=image.header)
Note that image.__class__ ensures that the output image is the same type as the input. If your operation requires specific format features, you might use a specific class like nb.Nifti1Image.
For example, perhaps we want to save space and rescale our T1w image to an unsigned byte:
def rescale(img):
data = img.get_fdata()
rescaled = ((data - data.min()) * 255. / (data.max() - data.min())).astype(np.uint8)
rescaled_img = img.__class__(rescaled, affine=img.affine, header=img.header)
rescaled_img.header.set_data_dtype('uint8') # Ensure data is saved as this type
return rescaled_img
rescaled_t1w = rescale(t1w)
rescaled_t1w.to_filename('/tmp/rescaled_t1w.nii.gz')
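As a quick check (reloading the file we just wrote), the on-disk type is now a single byte per voxel:

reloaded = nb.load('/tmp/rescaled_t1w.nii.gz')
print(reloaded.header.get_data_dtype())  # uint8
print(reloaded.get_fdata().dtype)        # get_fdata() still returns floats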
Recall that the spatial image contains data, affine and header objects. When creating a new image, the data object is typically a NumPy array:
array_img = nb.Nifti1Image(np.arange(244, 268).reshape(2, 3, 4), affine=np.diag([2, 2, 2, 1]))
print(array_img.dataobj)
[[[244 245 246 247]
  [248 249 250 251]
  [252 253 254 255]]

 [[256 257 258 259]
  [260 261 262 263]
  [264 265 266 267]]]
When loading a file, an ArrayProxy is used for most image types.
array_img.to_filename('/tmp/array_img.nii')
proxy_img = nb.load('/tmp/array_img.nii')
print(proxy_img.dataobj)
<nibabel.arrayproxy.ArrayProxy object at 0x7f0e2677c5b0>
An array proxy is an object that knows how to access data, but does not read it until requested. The usual way to request data is get_fdata(), which returns all data as floating point:
proxy_img.get_fdata(dtype=np.float32) # The default is float64, but you can choose any floating point type.
memmap([[[244., 245., 246., 247.],
         [248., 249., 250., 251.],
         [252., 253., 254., 255.]],

        [[256., 257., 258., 259.],
         [260., 261., 262., 263.],
         [264., 265., 266., 267.]]], dtype=float32)
get_fdata() provides a very predictable interface. When you need more control, you'll want to work with the ArrayProxy directly.
Array proxies are designed to be one step away from a numpy array:
arr = np.asanyarray(proxy_img.dataobj) # array will create a copy; asarray passes through arrays; asanyarray passes subclasses like memmap through
print(arr.dtype)
arr
int64
memmap([[[244, 245, 246, 247],
         [248, 249, 250, 251],
         [252, 253, 254, 255]],

        [[256, 257, 258, 259],
         [260, 261, 262, 263],
         [264, 265, 266, 267]]])
Memory maps are arrays that remain on disk, rather than in RAM. This is only possible with uncompressed images.
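A quick check (reusing proxy_img and the gzipped file saved earlier) shows the difference:

print(type(np.asanyarray(proxy_img.dataobj)))                   # numpy.memmap for the uncompressed file
print(type(np.asanyarray(nb.load('/tmp/img.nii.gz').dataobj)))  # plain ndarray for gzip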
We can also cast to any type we please, however unwisely.
print(np.uint8(proxy_img.dataobj))      # Values over 255 wrap around (256 becomes 0)
print(np.complex256(proxy_img.dataobj)) # A life less ordinal
[[[244 245 246 247]
  [248 249 250 251]
  [252 253 254 255]]

 [[  0   1   2   3]
  [  4   5   6   7]
  [  8   9  10  11]]]
[[[244.+0.j 245.+0.j 246.+0.j 247.+0.j]
  [248.+0.j 249.+0.j 250.+0.j 251.+0.j]
  [252.+0.j 253.+0.j 254.+0.j 255.+0.j]]

 [[256.+0.j 257.+0.j 258.+0.j 259.+0.j]
  [260.+0.j 261.+0.j 262.+0.j 263.+0.j]
  [264.+0.j 265.+0.j 266.+0.j 267.+0.j]]]
One of the primary motivations for an array proxy is to avoid loading data unnecessarily. Accessing the array proxy with array indices or slices will return the requested values without loading other data:
print(proxy_img.dataobj[0])
print(proxy_img.dataobj[..., 1:3])
[[244 245 246 247]
 [248 249 250 251]
 [252 253 254 255]]
[[[245 246]
  [249 250]
  [253 254]]

 [[257 258]
  [261 262]
  [265 266]]]
For example, this is useful for fetching a single volume from a BOLD series:
vol0 = bold.dataobj[..., 0]
vol0.shape
(64, 64, 30)
Slicing works with compressed data, as well. Install the indexed-gzip package for significant speedups with gzipped files.
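For instance, we can pull a single slice out of the gzipped copy of the T1w image saved earlier; only the requested values are read into memory:

gz_img = nb.load('/tmp/img.nii.gz')     # The compressed copy from before
print(gz_img.dataobj[:, :, 128].shape)  # (256, 156)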
Several image types, including NIfTI, have a slope and intercept (scaling factors) in the header that allow them to extend the data range and/or precision beyond those of the on-disk type.

For example, uint16 can take the values 0-65535. If a BOLD series includes values from 500-2000, we can calculate a slope that will allow us to use the full 16 bits of precision to encode our desired values.
pr = (500, 2000) # Plausible range
nbits = 16 # 16-bits of precision
scl_slope = (pr[1] - pr[0]) / (2 ** nbits) # Resolvable difference
scl_inter = pr[0] # Minimum value
print(scl_slope, scl_inter)
"Saving space by collapsing plotting into one line."; x = np.arange(2 ** nbits); plt.step(x, x * scl_slope + scl_inter); vlim = np.array([120, 150]); plt.xlim(vlim); plt.ylim(vlim * scl_slope + scl_inter); plt.xlabel("On-disk value"); plt.ylabel('"True" value'); _ = plt.show();
0.02288818359375 500
Let's create an image from some random values in our plausible range:
float_img = nb.Nifti1Image(np.random.default_rng().uniform(500, 2000, (2, 3, 4)), # 64-bit float
affine=np.diag([2, 2, 2, 1]))
print(float_img.get_fdata())
[[[ 840.39886182 1081.43237313  835.40125694  982.33174656]
  [1573.21571784 1261.74029839  600.53166939 1131.41563477]
  [1638.7430394  1913.94671681 1364.22730446  711.51449619]]

 [[ 642.17083878 1226.3384657  1562.95952987 1962.86543456]
  [1148.47532746 1302.16166916  877.44302527 1921.37256987]
  [1301.64831329 1167.99390276 1585.47466315 1442.73636725]]]
Save as uint16 and check its values:
float_img.header.set_data_dtype(np.uint16)
float_img.to_filename("/tmp/uint16_img.nii")
uint16_img = nb.load("/tmp/uint16_img.nii")
print(uint16_img.get_fdata())
[[[ 840.40302254 1081.43848911  835.3931435   982.32187811]
  [1573.21736212 1261.73177083  600.53167725 1131.41255221]
  [1638.74075933 1913.95150033 1364.23680624  711.51816763]]

 [[ 642.16980053 1226.33001149 1562.96893737 1962.86538155]
  [1148.47940154 1302.16419709  877.44702439 1921.37277342]
  [1301.64450009 1167.9992207  1585.48221122 1442.73184046]]]
We clearly lost some precision...
np.max(np.abs(float_img.get_fdata() - uint16_img.get_fdata()))
0.009868452619912205
But what's going on?
The ArrayProxy keeps track of scaling factors:
print(f"Slope: {uint16_img.dataobj.slope}; Intercept: {uint16_img.dataobj.inter}")
print(uint16_img.dataobj.get_unscaled())
Slope: 0.020787879824638367; Intercept: 600.5316772460938
[[[11539 23134 11298 18366]
  [46791 31807     0 25538]
  [49943 63182 36738  5339]]

 [[ 2003 30104 46298 65535]
  [26359 33752 13321 63539]
  [33727 27298 47381 40514]]]
The scaling is done automatically when the data is accessed, by slice or whole.
print(np.asanyarray(uint16_img.dataobj))
print(uint16_img.dataobj[0, 0, 0])
[[[ 840.40302254 1081.43848911  835.3931435   982.32187811]
  [1573.21736212 1261.73177083  600.53167725 1131.41255221]
  [1638.74075933 1913.95150033 1364.23680624  711.51816763]]

 [[ 642.16980053 1226.33001149 1562.96893737 1962.86538155]
  [1148.47940154 1302.16419709  877.44702439 1921.37277342]
  [1301.64450009 1167.9992207  1585.48221122 1442.73184046]]]
840.4030225425959
The ArrayProxy guarantees that the data has the intended value, but the type can vary based on the on-disk type and the values of scaling factors.
print(proxy_img.dataobj[0, 0, 0].dtype) # Our earlier integer image
print(uint16_img.dataobj[0, 0, 0].dtype)
int64
float64
get_fdata() sweeps these details under the rug and always gives you the same type.
If you didn't follow all of the above, that's okay. Here are the important points:
- img.get_fdata() will fetch all of the data, and it will always be a float
- img.dataobj exists if you want to load only some data or control the data type

In the NiBabel docs, The image data array gives you an overview of both methods, and Images and memory has even more details.
Slicing array proxies is nice. Wouldn't it be nicer to keep track of the affine and header?
The slicer attribute provides an interface that allows you to apply slices to an image, and updates the affine to ensure that the spatial information matches.
Consider the T1-weighted image from earlier:
_ = t1w.orthoview()
We can use the slicer to crop unneeded voxels in the left-right and inferior-superior directions:
cropped = t1w.slicer[40:216, :, 50:226]
cropped.orthoview()
<OrthoSlicer3D: (176, 156, 176)>
Note the origin crosshair points to the same structure. The affines now differ in translation:
print(cropped.affine - t1w.affine)
[[ 0.          0.          0.         40.5909118 ]
 [ 0.          0.          0.         -3.27500805]
 [ 0.          0.          0.         49.41300459]
 [ 0.          0.          0.          0.        ]]
You can even downsample an image, and the zooms will reflect the increased distance between voxels.
cheap_downsample = cropped.slicer[2::4, 2::4, 2::4]
print(cheap_downsample.header.get_zooms())
cheap_downsample.orthoview()
(4.0, 5.1975036, 4.0)
<OrthoSlicer3D: (44, 39, 44)>
Note that this is a bad idea in most circumstances because it induces aliasing.
The better approach would be to anti-alias and then slice:
def blur(img, sigma): # Isotropic in voxel space, not world space
return img.__class__(ndi.gaussian_filter(img.dataobj, sigma), img.affine, img.header)
better_downsample = blur(cropped, sigma=1.5).slicer[2::4, 2::4, 2::4]
better_downsample.orthoview()
<OrthoSlicer3D: (44, 39, 44)>
For non-spatial dimensions, slices or indices may be used to select one or more volumes.
tp15 = bold.slicer[..., :5]
tp1 = bold.slicer[..., 0]
print(f"BOLD shape: {bold.shape}; Zooms: {bold.header.get_zooms()}")
print(f"Time pts 1-5 shape: {tp15.shape}; Zooms: {tp15.header.get_zooms()}")
print(f"Time pt 1 shape: {tp1.shape}; Zooms: {tp1.header.get_zooms()}")
np.array_equal(tp15.get_fdata(), bold.dataobj[..., :5])
BOLD shape: (64, 64, 30, 184); Zooms: (4.0, 4.0, 3.999975, 2.5)
Time pts 1-5 shape: (64, 64, 30, 5); Zooms: (4.0, 4.0, 3.999975, 2.5)
Time pt 1 shape: (64, 64, 30); Zooms: (4.0, 4.0, 3.999975)
True
Aliasing considerations apply to time series as well, so be careful with down-sampling here, too.
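A rough sketch of that (blur_time is a hypothetical helper mirroring blur() above; sigma is in units of TRs):

def blur_time(img, sigma):  # Gaussian smoothing along the last (time) axis
    data = ndi.gaussian_filter1d(np.asanyarray(img.dataobj), sigma, axis=-1)
    return img.__class__(data, img.affine, img.header)

bold_half_rate = blur_time(bold, sigma=1).slicer[..., ::2]
print(bold_half_rate.shape)  # (64, 64, 30, 92)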
Although the scanner samples data in three dimensions, some brain structures are better represented as a convoluted sheet than a volume. Data may be usefully resampled onto a cortical sheet, but in the process, it loses the intrinsic geometry of a 3D array.
To represent data on a surface, you need the following structures:

- Vertices: an array of 3D coordinates (a point set)
- Faces: an array of triples of vertex indices, describing the triangles that tile the surface
- Data arrays: values sampled at each vertex
Unlike spatial images, these components are frequently kept in separate files.
Terminological note: You are likely to encounter multiple names for each of these. Vertices may be called coordinates or a point set. Faces are often called triangles. When a data array is plotted on the surface, it might be called a texture.
The two supported surface formats at present are GIFTI and FreeSurfer geometry files. Unlike spatial images, there is not yet a common interface for working with surface-based data.
FreeSurfer encodes a great deal of information in its directory structure and file names, allowing the necessary data arrays to have relatively simple formats.
GIFTI is more of an interchange format, and so has a rich metadata structure and can store different kinds of data.
To load a surface mesh in a FreeSurfer directory, use nb.freesurfer.read_geometry:
fs_verts, fs_faces, fs_meta = nb.freesurfer.read_geometry(data_dir / 'ds005-preproc/freesurfer/sub-01/surf/lh.pial', read_metadata=True)
print(fs_verts[:2])
print(fs_faces[:2])
pprint(fs_meta)
[[ -9.01226521 -96.7989502  -31.38132668]
 [ -9.62946892 -96.77184296 -31.43518829]]
[[0 1 3]
 [4 3 1]]
OrderedDict([('head', array([ 2,  0, 20], dtype=int32)),
             ('valid', '1 # volume info valid'),
             ('filename', '../mri/filled-pretess255.mgz'),
             ('volume', array([256, 256, 256])),
             ('voxelsize', array([1., 1., 1.])),
             ('xras', array([-0.99999994,  0.        ,  0.        ])),
             ('yras', array([ 0.        ,  0.        , -0.99999994])),
             ('zras', array([0.        , 0.99999994, 0.        ])),
             ('cras', array([-0.99998474, -5.00001526, -1.00003815]))])
NiBabel does not have any viewer for surfaces, but Nilearn has plotting utilities:
_ = nlp.plot_surf((fs_verts, fs_faces))
Let's do the same thing with GIFTI. Here, the file-level metadata is minimal:
gii_pial = nb.load(data_dir / 'ds005-preproc/fmriprep/sub-01/anat/sub-01_hemi-L_pial.surf.gii')
pprint(gii_pial.meta.metadata) # .meta maps onto the XML object, its .metadata property exposes a Python dict
{'Date': 'Mon Jan 13 17:14:05 2020',
 'UserName': 'root',
 'gifticlib-version': 'gifti library version 1.09, 28 June, 2010'}
The vertices and faces are stored as separate data arrays, each with their own metadata. These can be queried by NIfTI intent code:
pointset_darray = gii_pial.get_arrays_from_intent('pointset')[0]
triangle_darray = gii_pial.get_arrays_from_intent('triangle')[0]
pprint(pointset_darray.meta.metadata)
pprint(triangle_darray.meta.metadata)
{'AnatomicalStructurePrimary': 'CortexLeft',
 'AnatomicalStructureSecondary': 'Pial',
 'GeometricType': 'Anatomical',
 'Name': '/out/freesurfer/sub-01/surf/lh.pial',
 'SurfaceCenterX': '-32.284454',
 'SurfaceCenterY': '-22.393669',
 'SurfaceCenterZ': '-2.683327',
 'VolGeomC_A': '0.000000',
 'VolGeomC_R': '0.000000',
 'VolGeomC_S': '0.000000',
 'VolGeomDepth': '256',
 'VolGeomHeight': '256',
 'VolGeomWidth': '256',
 'VolGeomX_A': '0.000000',
 'VolGeomX_R': '-1.000000',
 'VolGeomX_S': '0.000000',
 'VolGeomXsize': '1.000000',
 'VolGeomY_A': '0.000000',
 'VolGeomY_R': '0.000000',
 'VolGeomY_S': '-1.000000',
 'VolGeomYsize': '1.000000',
 'VolGeomZ_A': '1.000000',
 'VolGeomZ_R': '0.000000',
 'VolGeomZ_S': '0.000000',
 'VolGeomZsize': '1.000000'}
{'Name': '/out/freesurfer/sub-01/surf/lh.pial',
 'TopologicalType': 'Closed'}
The actual values are stored as the .data attribute:
gii_verts, gii_faces = pointset_darray.data, triangle_darray.data
print(gii_verts[:2])
print(gii_faces[:2])
[[ -10.000125 -101.81076   -32.395775]
 [ -10.617342 -101.78375   -32.44954 ]]
[[0 1 3]
 [4 3 1]]
This API can be cumbersome if all you want is the NumPy arrays. Thanks to a project started at Neurohackademy 2019, the agg_data() (aggregate data) method will find the requested data array(s) and return just the data:
gii_verts, gii_faces = gii_pial.agg_data(('pointset', 'triangle'))
print(gii_verts[:2])
print(gii_faces[:2])
_ = nlp.plot_surf((gii_verts, gii_faces))
[[ -10.000125 -101.81076   -32.395775]
 [ -10.617342 -101.78375   -32.44954 ]]
[[0 1 3]
 [4 3 1]]
For data sampled to the surface, FreeSurfer has a few data types:
# Morphometry
curv = nb.freesurfer.read_morph_data(data_dir / 'ds005-preproc/freesurfer/sub-01/surf/lh.curv')
# Annotations
labels, color_table, names = nb.freesurfer.read_annot(data_dir / 'ds005-preproc/freesurfer/sub-01/label/lh.aparc.annot')
# MGH files...
mgh = nb.load(data_dir / 'ds005-preproc/freesurfer/sub-01/surf/lh.w-g.pct.mgh')
print(curv.shape)
print(labels.shape)
print(mgh.shape)
(126142,)
(126142,)
(126142, 1, 1)
fig, axes = plt.subplots(1, 3, subplot_kw={'projection': '3d'})
_ = nlp.plot_surf((fs_verts, fs_faces), curv, axes=axes[0])
_ = nlp.plot_surf((fs_verts, fs_faces), labels, axes=axes[1])
_ = nlp.plot_surf((fs_verts, fs_faces), mgh.get_fdata(), axes=axes[2])
GIFTIs will be GIFTIs. This one is a BOLD series sampled to the fsaverage5 surface:
bold_gii = nb.load(data_dir / 'ds005-preproc/fmriprep/sub-01/func/sub-01_task-mixedgamblestask_run-1_space-fsaverage5_hemi-L_bold.func.gii')
Each time point is an individual data array with intent NIFTI_INTENT_TIME_SERIES. agg_data() will aggregate these into a single array.
data = bold_gii.agg_data('time series')
data.shape
(10242, 240)
We can plot the mean BOLD signal. This time we will use a FreeSurfer surface, which Nilearn knows what to do with:
_ = nlp.plot_surf(str(data_dir / 'ds005-preproc/freesurfer/fsaverage5/surf/lh.inflated'), data.mean(axis=1))
If you didn't follow all of the above, that's okay. Here are the important points:

- Surface geometry is described by vertices and faces; data arrays attach values to the vertices
- GIFTI and FreeSurfer files carry these same structures, but through different interfaces

Hopefully next year, we'll be able to talk about a common interface that will simplify things.
CIFTI-2 is a file format intended to cover many use cases for connectivity analysis.
Files have 2-3 dimensions and each dimension is described by one of 5 types of axis.
For example, a "parcellated dense connectivity" CIFTI-2 file has two dimensions, indexed by a brain models axis and a parcels axis, respectively. The interpretation is "connectivity from parcels to vertices and/or voxels".
On disk, the file is a NIfTI-2 file with an alternative XML header as an extension, schematized here.
NiBabel loads a header that closely mirrors this structure, and makes the NIfTI-2 header accessible as a nifti_header attribute.
cifti = nb.load(data_dir / 'ds005-preproc/fmriprep/sub-01/func/sub-01_task-mixedgamblestask_run-1_space-fsLR_den-91k_bold.dtseries.nii')
cifti_data = cifti.get_fdata(dtype=np.float32)
cifti_hdr = cifti.header
nifti_hdr = cifti.nifti_header
The Cifti2Header is useful if you're familiar with the XML structure and need to fetch an exact value or have fine control over the header that is written.
bm0 = next(cifti_hdr.matrix[1].brain_models)
print(bm0.voxel_indices_ijk)
print(list(bm0.vertex_indices)[:20])
None
[0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
Most of the time, the Axis format will be more useful:
axes = [cifti_hdr.get_axis(i) for i in range(cifti.ndim)]
axes
[<nibabel.cifti2.cifti2_axes.SeriesAxis at 0x7f0de920c400>,
 <nibabel.cifti2.cifti2_axes.BrainModelAxis at 0x7f0de920c4c0>]
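Each axis type has its own attributes. A quick peek (a sketch; attribute names are from nibabel.cifti2.cifti2_axes, and we assume the series x brainmodels layout shown above):

series, brain_models = axes
print(len(series), len(brain_models))  # Time points, grayordinates
print(brain_models.volume_shape)       # The voxel grid that volumetric structures live in
print(brain_models.volume_mask.sum(), brain_models.surface_mask.sum())  # Voxels vs. vertices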
The simplest way to get a handle on CIFTI-2 data is to use it. Let's take an axis and a data block and repackage the voxels as a regular NIfTI-1 image:
def volume_from_cifti(data, axis):
    assert isinstance(axis, nb.cifti2.BrainModelAxis)
    volmask = axis.volume_mask                    # Which indices on this axis are for voxels?
    data = data.T[volmask]                        # Assume brainmodels axis is last, move it to front
    vox_indices = tuple(axis.voxel[volmask].T)    # ([x0, x1, ...], [y0, ...], [z0, ...])
    vol_data = np.zeros(axis.volume_shape + data.shape[1:],  # Volume + any extra dimensions
                        dtype=data.dtype)
    vol_data[vox_indices] = data                  # "Fancy indexing"
    return nb.Nifti1Image(vol_data, axis.affine)  # Add affine for spatial interpretation
volume_from_cifti(cifti_data, axes[1]).orthoview()
<OrthoSlicer3D: (91, 109, 91, 240)>
Now we can extract the values at the surface vertices, this time as a simple numpy array:
def surf_data_from_cifti(data, axis, surf_name):
assert isinstance(axis, nb.cifti2.BrainModelAxis)
for name, data_indices, model in axis.iter_structures(): # Iterates over volumetric and surface structures
if name == surf_name: # Just looking for a surface
data = data.T[data_indices] # Assume brainmodels axis is last, move it to front
vtx_indices = model.vertex # Generally 1-N, except medial wall vertices
surf_data = np.zeros((vtx_indices.max() + 1,) + data.shape[1:], dtype=data.dtype)
surf_data[vtx_indices] = data
return surf_data
raise ValueError(f"No structure named {surf_name}")
_ = nlp.plot_surf(str(data_dir / "conte69/Conte69.L.inflated.32k_fs_LR.surf.gii"),
surf_data_from_cifti(cifti_data, axes[1], 'CIFTI_STRUCTURE_CORTEX_LEFT').mean(axis=1),
cmap='plasma')
Finally, combine into a function that will break a CIFTI-2 matrix into a volume and two surface components:
def decompose_cifti(img):
data = img.get_fdata(dtype=np.float32)
brain_models = img.header.get_axis(1) # Assume we know this
return (volume_from_cifti(data, brain_models),
surf_data_from_cifti(data, brain_models, "CIFTI_STRUCTURE_CORTEX_LEFT"),
surf_data_from_cifti(data, brain_models, "CIFTI_STRUCTURE_CORTEX_RIGHT"))
vol, left, right = decompose_cifti(cifti)
print(vol.shape, left.shape, right.shape)
(91, 109, 91, 240) (32492, 240) (32492, 240)
Diffusion MRI uses estimates of the directional flow of water to detect white matter tracts. These tracts are represented as a series of points in world space.
NiBabel represents tractogram files as a collection of these things:

- A header, describing the space the streamlines live in
- A tractogram: the streamlines themselves, plus per-streamline and per-point data
- An affine, relating streamline coordinates to voxels in an associated image
Files of this type are considered TractogramFiles. Like SpatialImage, the goal is to provide a uniform interface where possible.
The two currently-supported file types are TCK (MRtrix) and TRK (TrackVis). We will load some very small ones that were created for testing:
tck = nb.streamlines.load(Path(nb.testing.data_path) / 'standard.tck')
trk = nb.streamlines.load(Path(nb.testing.data_path) / 'complex.trk')
Like a spatial image, printing the tractogram file gives an overview:
print(trk)
MAGIC NUMBER: TRACK
v.2
dim: [1 1 1]
voxel_sizes: [1. 1. 1.]
origin: [0. 0. 0.]
nb_scalars: 4
scalar_names: colors 3
              fa
nb_properties: 5
property_names: mean_colors 3
                mean_curvature
                mean_torsion
vox_to_world: [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
voxel_order: RAS
image_orientation_patient: [0. 0. 0. 0. 0. 0.]
pad1:
pad2:
invert_x:
invert_y:
invert_z:
swap_xy:
swap_yz:
swap_zx:
n_count: 3
hdr_size: 1000
The overall API is as follows:
# File-level
tractogram = trk.tractogram
trk_affine = trk.affine # Describes relation between voxels and coordinates in an associated image
trk_header = trk.header
# All streamlines
streamlines = tractogram.streamlines
data_per_point = tractogram.data_per_point
data_per_streamline = tractogram.data_per_streamline
# One streamline at a time
tractogram_item = tractogram[2] # Most to look at
streamline_coords = tractogram_item.streamline
streamline_data = tractogram_item.data_for_streamline
point_data = tractogram_item.data_for_points
When looking at an individual streamline, the objects are familiar: NumPy arrays and Python dictionaries.
print(streamline_coords)
pprint(streamline_data)
pprint(point_data)
[[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  8.]
 [ 9. 10. 11.]
 [12. 13. 14.]]
{'mean_colors': array([0., 0., 1.], dtype=float32),
 'mean_curvature': array([3.11], dtype=float32),
 'mean_torsion': array([3.22], dtype=float32)}
{'colors': array([[0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.]], dtype=float32),
 'fa': array([[0.5],
       [0.6],
       [0.6],
       [0.7],
       [0.8]], dtype=float32)}
When looking at all streamlines, the objects are custom. These deal with issues arising from the fact that each streamline may have a different number of coordinates, so N-dimensional arrays are not appropriate.
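The central container is ArraySequence, which behaves like a list of arrays with ragged lengths. A minimal sketch:

from nibabel.streamlines import ArraySequence
seq = ArraySequence([np.zeros((1, 3)), np.ones((5, 3))])  # Streamlines of different lengths
print(len(seq), seq[0].shape, seq[1].shape)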
print(streamlines)
print(data_per_streamline)
print(data_per_point)
ArraySequence([array([[0., 1., 2.]], dtype=float32),
               array([[0., 1., 2.],
                      [3., 4., 5.]], dtype=float32),
               array([[ 0.,  1.,  2.],
                      [ 3.,  4.,  5.],
                      [ 6.,  7.,  8.],
                      [ 9., 10., 11.],
                      [12., 13., 14.]], dtype=float32)])
<nibabel.streamlines.tractogram.PerArrayDict object at 0x7f0df445b430>
<nibabel.streamlines.tractogram.PerArraySequenceDict object at 0x7f0df445b370>
The dictionaries operate like normal dictionaries, providing NumPy array and ArraySequence values.
print(data_per_streamline['mean_curvature'])
print(data_per_point['fa'])
[[1.11]
 [2.11]
 [3.11]]
ArraySequence([array([[0.2]], dtype=float32),
               array([[0.3],
                      [0.4]], dtype=float32),
               array([[0.5],
                      [0.6],
                      [0.6],
                      [0.7],
                      [0.8]], dtype=float32)])
Each component is also indexable by integer:
print(streamlines[2])
pprint(dict(data_per_streamline[2])); print() # For space
pprint(dict(data_per_point[2]))
[[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  8.]
 [ 9. 10. 11.]
 [12. 13. 14.]]
{'mean_colors': array([[0.],
       [0.],
       [1.]], dtype=float32),
 'mean_curvature': array([[3.11]], dtype=float32),
 'mean_torsion': array([[3.22]], dtype=float32)}

{'colors': ArraySequence([array([0., 0., 1.], dtype=float32),
                          array([0., 0., 1.], dtype=float32),
                          array([0., 0., 1.], dtype=float32),
                          array([0., 0., 1.], dtype=float32),
                          array([0., 0., 1.], dtype=float32)]),
 'fa': ArraySequence([array([0.5], dtype=float32),
                      array([0.6], dtype=float32),
                      array([0.6], dtype=float32),
                      array([0.7], dtype=float32),
                      array([0.8], dtype=float32)])}
As we can see, there are a few ways to get at the same data, based on whether you prefer to operate by streamline or by data element (such as fa).
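For example, indexing the per-point dictionary by key and then by streamline gives the same values as going through the tractogram item (objects from the cells above):

print(np.array_equal(data_per_point['fa'][2], point_data['fa']))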
You made it!
- Load and save images with nb.load, nb.save, img.to_filename(), img.to_bytes(), img.from_bytes()
- Use the SpatialImage API and identify its components: img.get_fdata() / img.dataobj, img.affine, img.header
- get_fdata() and dataobj provide tradeoffs between consistency and control