The notebook shows how to load an IDR image converted into a Zarr file with labels.
The image is referenced in the paper "NesSys: a novel method for accurate nuclear segmentation in 3D" published August 2019 in PLOS Biology: https://doi.org/10.1371/journal.pbio.3000388 and can be viewed online in the Image Data Resource.
The original image was converted into the Zarr format. The analysis results produced by the authors of the paper were converted into labels and linked to the Zarr file, which was placed in a public S3 repository.
In this notebook, the Zarr file is loaded together with its labels from the S3 storage and analyzed using Cellpose. The resulting Cellpose segmentation is then viewed side by side with the original segmentation produced by the authors of the paper, obtained from the loaded labels.
The cell below will install dependencies if you choose to run the notebook in Google Colab.
# Package to access data on S3
%pip install aiohttp==3.9.5 zarr==2.17.2
# Package required to interact with Cellpose
%pip install cellpose==3.0.7
The Zarr data is stored separately from the IDR, on an S3 object store.
The identifier is used to name the Zarr file.
image_id = 6001247
The function below returns a dask array without loading any binary data. The dimension order of the returned array is TCZYX. Data is loaded only when requested later.
ENDPOINT_URL = 'https://uk1s3.embassy.ebi.ac.uk/'
import dask.array as da
def load_binary_from_s3(name, resolution='0'):
    # Build the path to the requested resolution level; no pixel data is read here
    root = '%s/%s/' % (name, resolution)
    return da.from_zarr(ENDPOINT_URL + root)
%%time
name = 'idr/zarr/v0.1/%s.zarr' % (image_id)
data = load_binary_from_s3(name)
CPU times: user 136 ms, sys: 36.7 ms, total: 173 ms Wall time: 1.64 s
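The path handed to `da.from_zarr` is simply the endpoint, the Zarr name, and the resolution level joined together. The sketch below reproduces that string construction without any network access, which can help when checking a URL by hand. The helper name `zarr_url` and its `labels` flag are illustrative assumptions, not part of the notebook.

```python
# Hypothetical helper mirroring the string formatting used in load_binary_from_s3.
ENDPOINT = 'https://uk1s3.embassy.ebi.ac.uk/'

def zarr_url(image_id, resolution='0', labels=False):
    """Return the URL of one resolution level of an image or of its labels."""
    name = 'idr/zarr/v0.1/%s.zarr' % image_id
    if labels:
        # Labels are stored in a 'labels' group alongside the binary data
        name += '/labels'
    return '%s%s/%s/' % (ENDPOINT, name, resolution)

print(zarr_url(6001247))
print(zarr_url(6001247, labels=True))
```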
We use an existing trained model from Cellpose. The cytoplasm model in Cellpose is trained on two-channel images, where the first channel is the channel to segment and the second channel is an optional nuclear channel. Please check the Cellpose documentation and examples to load your own model.
from cellpose import models
model = models.Cellpose(gpu=False, model_type='cyto')
import dask
def analyze(z):
    t = 0
    channels = [[0, 1]]
    model = models.Cellpose(gpu=False, model_type='cyto')
    cellpose_masks, flows, styles, diams = model.eval(data[t, :, z, :, :], diameter=None, channels=channels)
    return cellpose_masks, z
We use dask.delayed to analyse a few Z-sections around the middle Z-section. This is very quick, since we only build the task graph and do not perform the analysis at this stage.
%%time
def build_task_graph(range_z):
    lazy_results = []
    middle_z = data.shape[2] // 2
    for z in range(middle_z - range_z, middle_z + range_z):
        lazy_result = dask.delayed(analyze)(z)
        lazy_results.append(lazy_result)
    return lazy_results
CPU times: user 7 µs, sys: 1 µs, total: 8 µs Wall time: 18.8 µs
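The loop in `build_task_graph` uses a half-open range, so `range_z = 2` selects four planes: two below the middle plane, the middle plane itself, and one above it. Here is a minimal sketch of that arithmetic, assuming the image has the same 257 Z-sections as the labels array whose shape is printed later in this notebook. The function name `planes_around_middle` is illustrative.

```python
def planes_around_middle(size_z, range_z):
    """Return the Z indices visited by build_task_graph for a stack of size_z planes."""
    middle_z = size_z // 2
    # range() excludes its upper bound, so middle_z + range_z is not visited
    return list(range(middle_z - range_z, middle_z + range_z))

# 257 is an assumption taken from the shape of the labels array
print(planes_around_middle(257, 2))  # [126, 127, 128, 129]
```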
%%time
# Build the task graph
lazy_results = build_task_graph(2)
print(lazy_results)
%%time
# Analyse the data in parallel
results = dask.compute(*lazy_results)
CPU times: user 43.8 s, sys: 9.46 s, total: 53.2 s Wall time: 8.96 s
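To see why building the graph is almost instantaneous while `dask.compute` takes seconds, a toy function can stand in for the Cellpose call. This is only a sketch: `slow_square` and the `calls` list are made up for illustration, and only `dask.delayed` and `dask.compute` come from the notebook.

```python
import time
import dask

calls = []

def slow_square(z):
    # Stand-in for an expensive analysis step
    time.sleep(0.1)
    calls.append(z)
    return z * z, z

# Building the graph does not call slow_square
lazy = [dask.delayed(slow_square)(z) for z in range(4)]
assert calls == []

# compute() runs the tasks (in parallel threads by default) and returns a tuple
toy_results = dask.compute(*lazy)
print(toy_results)
```

As with `analyze`, each task returns its input index alongside the result, so the output can be matched back to its plane regardless of execution order.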
import matplotlib.pyplot as plt
%matplotlib inline
from ipywidgets import *
def display_results(i=0):
    r, z = results[i]
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    plt.imshow(r)
    plt.title("z: %s" % z)
    fig.canvas.flush_events()
interact(display_results, i=widgets.IntSlider(value=0, min=0, max=len(results)-1, step=1, description="Select Plane", continuous_update=False))
On the left are the masks from Cellpose. On the right are the labels loaded from S3, representing the original analysis by the authors of the paper.
Load the labels from S3. Labels are stored alongside the binary data.
%%time
name = 'idr/zarr/v0.1/%s.zarr/labels' % image_id
labels = load_binary_from_s3(name)
CPU times: user 17.9 ms, sys: 12.1 ms, total: 30 ms Wall time: 168 ms
print(labels.shape)
(1, 1, 257, 210, 253)
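Since the arrays follow the TCZYX order, the printed shape can be read as one timepoint, one channel, 257 Z-sections, and planes of 210 by 253 pixels. Here is a small sketch that pairs each axis with its size. The `axes` default restates the dimension order given earlier, and `describe_shape` is an illustrative name.

```python
def describe_shape(shape, axes='TCZYX'):
    """Map each axis name to its size."""
    if len(shape) != len(axes):
        raise ValueError('expected %d dimensions, got %d' % (len(axes), len(shape)))
    return dict(zip(axes, shape))

sizes = describe_shape((1, 1, 257, 210, 253))
print(sizes)
```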
def display(i=0):
    r, z = results[i]
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    plt.imshow(r)
    plt.title("Cellpose z: %s" % z)
    plt.subplot(122)
    plt.imshow(labels[0, 0, z, :, :])
    plt.title("Original z: %s" % z)
    fig.canvas.flush_events()
interact(display, i=widgets.IntSlider(value=0, min=0, max=len(results)-1, step=1, description="Select Plane", continuous_update=False))
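The side-by-side view can be complemented with a simple number, for example how many objects each segmentation contains in a plane. The sketch below uses small made-up arrays and assumes that, as in Cellpose masks, 0 marks the background and every other integer identifies one object. Whether the labels from the paper follow the same convention should be checked before relying on it. `count_objects` is an illustrative name.

```python
import numpy as np

def count_objects(mask):
    """Count distinct non-zero label values in a 2D label image."""
    values = np.unique(np.asarray(mask))
    return int(np.count_nonzero(values))

# Toy planes standing in for a Cellpose mask and the original labels
cellpose_plane = np.array([[0, 1, 1], [0, 2, 2], [0, 0, 3]])
original_plane = np.array([[0, 5, 5], [0, 5, 5], [0, 0, 7]])
print(count_objects(cellpose_plane), count_objects(original_plane))
```

In the notebook, the same function could be applied to `r` and to `labels[0, 0, z, :, :]` for a given plane.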
Copyright (C) 2022-2024 University of Dundee. All Rights Reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.