Prepare to conquer the Nth dimension! To begin the course, you'll learn how to load, build and navigate N-dimensional images using a CT image of the human chest. You'll also leverage the useful ImageIO package and hone your NumPy and matplotlib skills. This is the Summary of lecture "Biomedical Image Analysis in Python", via datacamp.
import numpy as np
import scipy
import imageio
import matplotlib.pyplot as plt
from pprint import pprint
plt.rcParams['figure.figsize'] = (10, 8)
In this chapter, we'll work with sections of a computed tomography (CT) scan from The Cancer Imaging Archive. CT uses a rotating X-ray tube to create a 3D image of the target area.
The actual content of the image depends on the instrument used: photographs measure visible light, x-ray and CT measure radiation absorbance, and MRI scanners measure magnetic fields.
To warm up, use the imageio
package to load a single DICOM image from the scan volume and check out a few of its attributes.
# Load "chest-220.dcm"
im = imageio.imread('./dataset/tcia-chest-ct-sample/chest-220.dcm')
# Print image attributes
print('Image type:', type(im))
print('Shape of image array:', im.shape)
Image type: <class 'imageio.core.util.Array'> Shape of image array: (512, 512)
imageio
is a versatile package. It can read in a variety of image data, including JPEG, PNG, and TIFF. But it's especially useful for its ability to handle DICOM files.
ImageIO reads in data as Image
objects. These are standard NumPy arrays with a dictionary of metadata.
Metadata can be quite rich in medical images and can include:
Start this exercise by reading in the chest image and listing the available fields in the meta dictionary.
# Print the available metadata fields
pprint(im.meta)
Dict([('TransferSyntaxUID', '1.2.840.10008.1.2'), ('SOPClassUID', '1.2.840.10008.5.1.4.1.1.2'), ('SOPInstanceUID', '1.3.6.1.4.1.14519.5.2.1.5168.1900.290866807370146801046392918286'), ('StudyDate', '20040529'), ('SeriesDate', '20040515'), ('ContentDate', '20040515'), ('StudyTime', '115208'), ('SeriesTime', '115254'), ('ContentTime', '115325'), ('Modality', 'CT'), ('Manufacturer', 'GE MEDICAL SYSTEMS'), ('StudyDescription', 'PET CT with registered MR'), ('SeriesDescription', 'CT IMAGES - RESEARCH'), ('PatientName', 'STS_007'), ('PatientID', 'STS_007'), ('PatientBirthDate', ''), ('PatientSex', 'F '), ('PatientWeight', 82.0), ('StudyInstanceUID', '1.3.6.1.4.1.14519.5.2.1.5168.1900.381397737790414481604846607090'), ('SeriesInstanceUID', '1.3.6.1.4.1.14519.5.2.1.5168.1900.315477836840324582280843038439'), ('SeriesNumber', 2), ('AcquisitionNumber', 1), ('InstanceNumber', 57), ('ImagePositionPatient', (-250.0, -250.0, -180.62)), ('ImageOrientationPatient', (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)), ('SamplesPerPixel', 1), ('Rows', 512), ('Columns', 512), ('PixelSpacing', (0.976562, 0.976562)), ('BitsAllocated', 16), ('BitsStored', 16), ('HighBit', 15), ('PixelRepresentation', 0), ('RescaleIntercept', -1024.0), ('RescaleSlope', 1.0), ('PixelData', b'Data converted to numpy array, raw data removed to preserve memory'), ('shape', (512, 512)), ('sampling', (0.976562, 0.976562))])
Perhaps the most critical principle of image analysis is: look at your images!
Matplotlib's imshow()
function gives you a simple way to do this. Knowing a few simple arguments will help:
cmap
controls the color mappings for each value. The "gray" colormap is common, but many others are available.vmin
and vmax
control the color contrast between values. Changing these can reduce the influence of extreme values.plt.axis('off')
removes axis and tick labels from the image.For this exercise, plot the CT scan and investigate the effect of a few different parameters.
fig, ax = plt.subplots(1, 3, figsize=(15, 10))
# Draw the image in grayscale
ax[0].imshow(im, cmap='gray');
# Draw the image with greater contrast
ax[1].imshow(im, cmap='gray', vmin=-200, vmax=200);
# Remove axis ticks and labels
ax[2].imshow(im, cmap='gray', vmin=-200, vmax=200);
ax[2].axis('off');
Image "stacks" are a useful metaphor for understanding multi-dimensional data. Each higher dimension is a stack of lower dimensional arrays.
In this exercise, we will use NumPy's stack()
function to combine several 2D arrays into a 3D volume. By convention, volumetric data should be stacked along the first dimension: vol[plane, row, col]
.
Note: performing any operations on an ImageIO Image
object will convert it to a numpy.ndarray
, stripping its metadata.
# Read in each 2D image
im1 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-220.dcm')
im2 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-221.dcm')
im3 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-222.dcm')
# Stack images into a volume
vol = np.stack([im1, im2, im3], axis=0)
print('Volume dimensions:', vol.shape)
Volume dimensions: (3, 512, 512)
ImageIO's volread()
function can load multi-dimensional datasets and create 3D volumes from a folder of images. It can also aggregate metadata across these multiple images.
For this exercise, read in an entire volume of brain data from the "./dataset/tcia-chest-ct-sample" folder, which contains 5 DICOM images.
# Load the "tcia-chest-ct" directory
vol = imageio.volread('./dataset/tcia-chest-ct-sample/')
# Print image attributes
print('Available metadata:', vol.meta.keys())
print('Shape of image array:', vol.shape)
Reading DICOM (examining files): 1/5 files (20.0%5/5 files (100.0%) Found 1 correct series. Reading DICOM (loading data): 5/5 (100.0%) Available metadata: odict_keys(['TransferSyntaxUID', 'SOPClassUID', 'SOPInstanceUID', 'StudyDate', 'SeriesDate', 'ContentDate', 'StudyTime', 'SeriesTime', 'ContentTime', 'Modality', 'Manufacturer', 'StudyDescription', 'SeriesDescription', 'PatientName', 'PatientID', 'PatientBirthDate', 'PatientSex', 'PatientWeight', 'StudyInstanceUID', 'SeriesInstanceUID', 'SeriesNumber', 'AcquisitionNumber', 'InstanceNumber', 'ImagePositionPatient', 'ImageOrientationPatient', 'SamplesPerPixel', 'Rows', 'Columns', 'PixelSpacing', 'BitsAllocated', 'BitsStored', 'HighBit', 'PixelRepresentation', 'RescaleIntercept', 'RescaleSlope', 'PixelData', 'shape', 'sampling']) Shape of image array: (5, 512, 512)
The amount of physical space covered by an image is its field of view, which is calculated from two properties:
shape
attribute.meta['sampling']
).You can draw multiple images in one figure to explore data quickly. Use plt.subplots()
to generate an array of subplots.
fig, axes = plt.subplots(nrows=2, ncols=2)
To draw an image on a subplot, call the plotting method directly from the subplot object rather than through PyPlot: axes[0,0].imshow(im)
rather than plt.imshow(im)
.
# Initialize figure and axes grid
fig, axes = plt.subplots(nrows=1, ncols=2)
# Draw an images on each subplot
axes[0].imshow(im1, cmap='gray');
axes[1].imshow(im2, cmap='gray');
# Remove ticks/labels and render
axes[0].axis('off');
axes[1].axis('off');
The simplest way to plot 3D and 4D images by slicing them into many 2D frames. Plotting many slices sequentially can create a "fly-through" effect that helps you understand the image as a whole.
To select a 2D frame, pick a frame for the first axis and select all data from the remaining two: vol[0, :, :]
# Plot the images on a subplots array
fig, axes = plt.subplots(1, 5, figsize=(15, 10))
# Loop through subplots and draw image
for ii in range(5):
im = vol[ii, :, :]
axes[ii].imshow(im, cmap='gray', vmin=-200, vmax=200)
axes[ii].axis('off')
Any two dimensions of an array can form an image, and slicing along different axes can provide a useful perspective. However, unequal sampling rates can create distorted images.
Changing the aspect ratio can address this by increasing the width of one of the dimensions.
For this exercise, plot images that slice along the second and third dimensions of vol. Explicitly set the aspect ratio to generate undistorted images.
# Select frame frol "vol"
im1 = vol[:, 256, :]
im2 = vol[:, :, 256]
# Compute aspect ratios
d0, d1, d2 = vol.meta['sampling']
asp1 = d0 / d2
asp2 = d0 / d1
# Plot the images on a subplots array
fig, axes = plt.subplots(2, 1, figsize=(15, 8))
axes[0].imshow(im1, cmap='gray', aspect=asp1);
axes[1].imshow(im2, cmap='gray', aspect=asp2);
Note: Sample dataset doesn't contain enough amount to plot whole images.