# -*- coding: utf-8 -*-
# Copyright 2019 - 2024 United Kingdom Research and Innovation
# Copyright 2019 - 2022 The University of Manchester
# Copyright 2019 - 2024 Technical University of Denmark
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Authored by: Jakob S. Jørgensen (DTU)
# Gemma Fardell (UKRI-STFC)
# Margaret Duff (UKRI-STFC)
# Hannah Robarts (UKRI-STFC)
This exercise introduces Tikhonov regularisation and explains how it is implemented in CIL using the so-called block framework.
In a previous exercise, we saw how CGLS can be used to compute a reconstruction based on the least squares problem. For noisy data, the least squares solution obtained by running the algorithm to convergence is undesirable because it contains a high amount of noise. The number of iterations was seen to have a regularising effect: the smooth, low-frequency components of the image are recovered in the first iterations, while high-frequency components such as edges are recovered later. Unfortunately, noise is also amplified as the iterations proceed, and one needs to pick the number of iterations that best balances sharpness against noise. The regularising effect is thus obtained only implicitly, by choosing how many iterations to run and never actually running to convergence to the least squares solution.
Tikhonov regularisation is more explicit: a regularisation term, specifically a squared 2-norm, is added to the least squares fitting term. The resulting problem should be solved to convergence, instead of using the number of iterations as an implicit regulariser. A regularisation parameter balances the emphasis on fitting the data against enforcing regularity, and must be chosen to give the best trade-off.
Tikhonov regularisation tends to reduce the noise in the reconstruction, at the price of some blurring, as will be seen in what follows.
To set up Tikhonov problems we need to represent block matrices and concatenated data. In CIL we can do this using BlockOperator and BlockDataContainer, as demonstrated in this exercise.
Learning objectives:
- Set up and run the CGLS algorithm in CIL and recognise its implicit regularising effect.
- Formulate Tikhonov regularisation as a block least squares problem and solve it with CGLS using BlockOperator and BlockDataContainer.
- Compare identity and gradient regularisation operators and study the effect of the regularisation parameter.
First, we carry out all required imports: tools from the cil.framework and cil.optimisation modules, the test image generation tools in the tomophantom library, and standard imports such as numpy.
# CIL core components needed
from cil.framework import ImageGeometry, AcquisitionGeometry, BlockDataContainer
# CIL optimisation algorithms and linear operators
from cil.optimisation.algorithms import CGLS
from cil.optimisation.operators import BlockOperator, GradientOperator, IdentityOperator, FiniteDifferenceOperator
# CIL example synthetic test image
from cil.utilities.dataexample import SHAPES
# CIL display tools
from cil.utilities.display import show2D, show1D, show_geometry
# Forward/backprojector from CIL ASTRA plugin
from cil.plugins.astra import ProjectionOperator
# For shepp-logan test image in CIL tomophantom plugin
from cil.plugins import TomoPhantom as cilTomoPhantom
# Third-party imports
import numpy as np
import matplotlib.pyplot as plt
A 2D parallel beam case will be simulated. We start by creating a test image, the classic Shepp-Logan phantom, with 256 x 256 pixels on the square domain $x:[-1,1]$, $y:[-1,1]$. We set up the ImageGeometry to specify the dimensions and pixel size of the image:
# Set up image geometry
n = 256
ig = ImageGeometry(voxel_num_x=n,
voxel_num_y=n,
voxel_size_x=2/n,
voxel_size_y=2/n)
print(ig)
Using the CIL tomophantom plugin we can create a CIL ImageData holding the Shepp-Logan image of the desired size:
phantom2D = cilTomoPhantom.get_ImageData(num_model=1, geometry=ig)
show2D(phantom2D)
Next, we specify the acquisition parameters and store them in an AcquisitionGeometry object. We use a parallel-beam geometry with 180 projections and a detector with the same number and size of pixels as the image:
num_angles = 180
ag = AcquisitionGeometry.create_Parallel2D() \
.set_angles(np.linspace(0, 180, num_angles, endpoint=False)) \
.set_panel(n, 2/n)
print(ag)
We illustrate the geometry:
show_geometry(ag)
To simulate a sinogram we set up a ProjectionOperator, using the ASTRA plugin with GPU acceleration:
device = "gpu"
A = ProjectionOperator(ig, ag, device)
The ideal, noise-free sinogram is created by forward-projecting the phantom:
sinogram = A.direct(phantom2D)
The generated test image and sinogram are displayed as images:
plots = [phantom2D, sinogram]
titles = ["Ground truth", "sinogram"]
show2D(plots, titles)
Next, Poisson noise will be applied to this noise-free sinogram. The severity of the noise can be adjusted by changing the background_counts variable.
# Incident intensity: lower counts will increase the noise
background_counts = 5000
# Convert the simulated absorption sinogram to transmission values using Lambert-Beer.
# Use as mean for Poisson data generation.
# Convert back to absorption sinogram.
counts = background_counts * np.exp(-sinogram.as_array())
noisy_counts = np.random.poisson(counts)
sino_out = -np.log(noisy_counts/background_counts)
# Create new AcquisitionData object with same geometry and fill with noisy data.
sinogram_noisy = ag.allocate()
sinogram_noisy.fill(sino_out)
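One caveat (not part of the original recipe): for small values of background_counts, the Poisson draw can return zero counts, which makes the logarithm infinite. A minimal guard, clipping the counts from below, could look like this:
# Guard against zero photon counts, which would make the log below infinite.
# Clipping to a minimum of 1 count is a pragmatic choice for very noisy simulations.
noisy_counts = np.maximum(np.random.poisson(counts), 1)
sino_out = -np.log(noisy_counts / background_counts)
sinogram_noisy.fill(sino_out)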
The simulated clean and noisy sinograms are displayed side by side as images:
plots = [sinogram, sinogram_noisy]
titles = ["sinogram", "sinogram noisy"]
show2D(plots, titles)
Before describing Tikhonov regularisation, we recall the problem solved by CGLS:
$$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}A u - b\end{Vmatrix}^2_2$$
where
$A$ is the projection operator,
$b$ is the acquired data, and
$u$ is the unknown image to be determined.
In the solution provided by CGLS, the low-frequency components tend to converge faster than the high-frequency components, so the number of iterations must be controlled carefully to select the best solution.
Set up the CGLS algorithm, including specifying its initial point to start from:
initial = ig.allocate(0)
cgls_simple = CGLS(initial=initial, operator=A, data=sinogram_noisy)
Once set up, we can run the algorithm for a specified number of iterations:
cgls_simple.run(5, verbose=True)
Display the resulting image from CGLS, along with its difference image with the original ground truth image:
plots = [cgls_simple.solution, cgls_simple.solution - phantom2D]
titles = ["CGLS reconstruction","Difference from ground truth" ]
show2D(plots, titles, fix_range=[(-0.2,1.2),(-0.2,0.2)])
Plot central vertical line profile of CGLS and ground truth:
show1D([phantom2D, cgls_simple.solution],
slice_list=[("horizontal_y",int(n/2))],
dataset_labels=["Ground Truth","CGLS"],
line_colours=['black','dodgerblue'],
line_styles=['solid','dashed'])
Exercise 1: Try running fewer and more iterations to see how the image and line profile change. Try also with noisier data by specifying a smaller value of background_counts. Remember you can change the number of iterations to run between outputs. Note that if run is called again, the algorithm continues from the point where it stopped; to start from the beginning, the algorithm needs to be re-initialised. Try to stop the algorithm before the solution starts to diverge.
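For instance, the difference between continuing and restarting looks like this (a sketch reusing the objects defined above):
# Continue from the current iterate: five further iterations
cgls_simple.run(5, verbose=True)
# Restart from scratch: re-create the algorithm object with a fresh initial point
cgls_simple = CGLS(initial=ig.allocate(0), operator=A, data=sinogram_noisy)
cgls_simple.run(10, verbose=True)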
Noisy datasets are problematic for an ill-posed problem such as tomographic reconstruction: if we try to solve it with CGLS alone, we end up with an unstable solution. Regularisation adds prior information that allows us to solve the problem stably.
We can add a regularisation term to the problem solved by CGLS; this gives a minimisation problem of the following form, known as Tikhonov regularisation: $$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}A u - b \end{Vmatrix}^2_2 + \alpha^2\|Lu\|^2_2$$
where
$A$ is the projection operator
$b$ is the acquired data
$u$ is the unknown image to be solved for
$\alpha$ is the regularisation parameter
$L$ is a regularisation operator
The first term measures the fidelity of the solution to the data. The second term measures the fidelity to the prior knowledge imposed on the system through the operator $L$, and $\alpha$ controls the trade-off between the two terms. $L$ is often chosen as the identity, or as a smoothing operator such as a gradient operator; in either case the penalty is the squared L2-norm.
This can be re-written equivalently in the block matrix form:
$$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\binom{A}{\alpha L} u - \binom{b}{0}\end{Vmatrix}^2_2$$
With the definitions:
$\tilde{A} = \binom{A}{\alpha L}$
$\tilde{b} = \binom{b}{0}$
this can now be recognised as a least squares problem:
$$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\tilde{A} u - \tilde{b}\end{Vmatrix}^2_2$$
Being a least squares problem, it can be solved using CGLS with $\tilde{A}$ as operator and $\tilde{b}$ as data.
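To make the equivalence concrete, here is a tiny NumPy sketch, entirely separate from the tomography problem above, using a made-up 3x2 system, showing that stacking $\alpha L$ under $A$ and zeros under $b$ turns Tikhonov into an ordinary least squares problem:
# Tiny illustration: Tikhonov regularisation as a stacked least squares problem
A_small = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b_small = np.array([1.0, 2.0, 3.0])
alpha_demo = 0.5
L_small = np.eye(2)                                    # identity regularisation operator
A_tilde = np.vstack([A_small, alpha_demo * L_small])   # stacked block operator
b_tilde = np.concatenate([b_small, np.zeros(2)])       # stacked right-hand side
u_tikh, *_ = np.linalg.lstsq(A_tilde, b_tilde, rcond=None)
print(u_tikh)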
We can construct $\tilde{A}$ and $\tilde{b}$ using the BlockFramework in CIL.
$\tilde{A}$ is a (column) BlockOperator of size 2x1 and can be set up by
BlockOperator(op0,op1)
The right hand side $\tilde{b}$ is a BlockDataContainer and can be set up by
BlockDataContainer(DataContainer0, DataContainer1)
The simplest form of Tikhonov regularisation uses the identity matrix as the regularisation operator. With the identity as the regularisation operator we penalise the magnitude of the solution $u$, which tends to reduce the pixel values of $u$.
L = IdentityOperator(ig)
alpha = 0.1
operator_block = BlockOperator(A, alpha*L)
In the formulation of Tikhonov as a least squares problem, we need to set up the right-hand side $\tilde{b}$ holding both $b$ and a zero-filled ImageData of the right size, matching the range of the regularising operator. The operator allows us to query the geometry of its range and allocate a zero-filled ImageData of that geometry. We combine both into a BlockDataContainer:
zero_data = L.range.allocate(0)
data_block = BlockDataContainer(sinogram_noisy, zero_data)
Run CGLS as before, but passing the BlockOperator and BlockDataContainer:
# Set up CGLS with the BlockOperator and BlockDataContainer
initial = ig.allocate(0)
cgls_tikh = CGLS(initial=initial, operator=operator_block, data=data_block, update_objective_interval=10)
# Run the algorithm
cgls_tikh.run(100)
Display results as images and plot central vertical line profile of the Tikhonov with Identity:
plots = [cgls_tikh.solution, cgls_tikh.solution - phantom2D]
titles = ["Tikhonov with Identity regularisation","Difference from ground truth" ]
show2D(plots, titles, fix_range=[(-0.2,1.2),(-0.2,0.2)])
Let's compare the reconstructions from CGLS and Tikhonov with identity regularisation.
plots = [cgls_simple.solution, cgls_tikh.solution]
titles = ["CGLS", "Tikhonov with Identity regularisation" ]
show2D(plots, titles, fix_range=(-0.2,1.2))
show1D([phantom2D, cgls_simple.solution, cgls_tikh.solution],
slice_list=[("horizontal_y", int(n/2))],
dataset_labels=["Ground Truth","CGLS","Tikhonov with Identity regularisation"],
line_colours=['black','dodgerblue','firebrick'],
line_styles=['solid','dashed','dotted'])
Exercise 2: Try running Tikhonov with a range of $\alpha$ values from very small to very large; display the reconstruction and line profile and describe the effect of $\alpha$. Find the value of $\alpha$ that gives you the best solution. Then change how much noise is added to the data by going back to the cell where background_counts is set and running through the notebook again. Try background_counts set to 5000, 10000 and 1000, remembering to find an appropriate value of $\alpha$ for each run.
With Tikhonov regularisation the problem should now be solved to convergence, instead of using the number of iterations as an implicit regulariser. The regularisation parameter $\alpha$ balances the emphasis on fitting the data against enforcing regularity: a low value of $\alpha$ gives essentially the CGLS solution, while a higher value reduces the noise in the reconstruction at the cost of some blurring.
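One way to explore this systematically is a small sweep over $\alpha$ values, collecting the reconstructions for side-by-side display (a sketch reusing the objects defined above; the values are only examples):
# Sweep a few regularisation parameters and collect the reconstructions
alphas = [0.001, 0.01, 0.1, 1.0]
solutions = []
for a in alphas:
    operator_block_a = BlockOperator(A, a * L)
    data_block_a = BlockDataContainer(sinogram_noisy, L.range.allocate(0))
    cgls_a = CGLS(initial=ig.allocate(0), operator=operator_block_a,
                  data=data_block_a, update_objective_interval=10)
    cgls_a.run(100, verbose=0)
    solutions.append(cgls_a.solution)
show2D(solutions, [f"alpha = {a}" for a in alphas], fix_range=(-0.2, 1.2), num_cols=4)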
The basic Tikhonov method with the identity operator provided perhaps a bit of improvement compared to plain CGLS, but there was still a lot of noise in the reconstruction and the pixel values had been reduced. Using the identity as the regularising operator means we penalise non-zero pixel values, which may not be what we want. Instead, we want to encourage similar values in neighbouring pixels to smooth out the noise. This can be achieved by using the gradient as the regularising operator.
To do that we will again need to use the BlockFramework, which is now demonstrated in a bit more detail.
A discrete gradient operator (using finite differences) can be constructed using BlockOperators.
The direct gradient operator $\nabla$ acts on an image $u$ and returns a BlockDataContainer $\textbf{w}$, holding the finite differences in the $x$ and $y$ directions:
$$ \nabla(u) = \begin{bmatrix} \nabla_x\\ \nabla_y \end{bmatrix} u = \begin{bmatrix} \nabla_x u\\ \nabla_y u \end{bmatrix} = \begin{bmatrix} w_{x}\\ w_{y} \end{bmatrix} = \textbf{w}$$
The adjoint gradient operator $\nabla^*$ acts on a BlockDataContainer $\textbf{w}$ and returns an image $\rho$:
$$ \nabla^*(\textbf{w}) = \begin{bmatrix} \nabla^*_x & \nabla^*_y \end{bmatrix} \begin{bmatrix} w_{x}\\ w_{y} \end{bmatrix} = \nabla^*_x w_x + \nabla^*_y w_y = \rho$$
We load a test image to demonstrate how the gradient operator works:
shapes = SHAPES.get()
show2D(shapes, "shapes")
The finite difference operator can be called from the framework. It returns the difference between each pair of neighbouring pixels along one direction.
We need to initialise it with the image geometry, the direction of differencing and the boundary conditions to use:
FiniteDifferenceOperator(gm_domain, direction, bnd_cond='Neumann' or 'Periodic')
# Define the FiniteDifferenceOperator - it needs the image geometry, the direction and the boundary conditions
fdx = FiniteDifferenceOperator(shapes.geometry, direction='horizontal_x', bnd_cond='Neumann')
#run it over the input image
image_2D_dx = fdx.direct(shapes)
# Plot the results
show2D(image_2D_dx, "dx")
Note how all the vertical edges (and their signs) have been picked up by applying this operator, which takes finite differences in the horizontal direction.
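Likewise, finite differences in the vertical direction pick up the horizontal edges:
# Finite differences along the vertical direction highlight horizontal edges
fdy = FiniteDifferenceOperator(shapes.geometry, direction='horizontal_y', bnd_cond='Neumann')
image_2D_dy = fdy.direct(shapes)
show2D(image_2D_dy, "dy")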
To set up a gradient in both the $x$ and $y$ directions, we can create a BlockOperator containing a finite difference operator for each direction. We can apply it (using its direct method) to the test image and visualise the result.
# Define the x and y operators
fdx = FiniteDifferenceOperator(shapes.geometry, direction='horizontal_x', bnd_cond='Neumann')
fdy = FiniteDifferenceOperator(shapes.geometry, direction='horizontal_y', bnd_cond='Neumann')
# Construct the BlockOperator combining the two operators
FD = BlockOperator(fdx, fdy)
#run it on the test image
fd_out = FD.direct(shapes)
Display output:
plots = [fd_out.get_item(0), fd_out.get_item(1)]
titles = ["dx", "dy"]
show2D(plots, titles, fix_range=(-1,1))
To see what is going on, we take a closer look at data types.
First, the input is an ImageData and its shape is a 2-element vector with the number of pixels in each direction:
print(type(shapes))
print(shapes)
The output, however, is a BlockDataContainer: essentially a list (with additional functionality) holding two ImageData elements, one for each direction in which we have taken finite differences. We can pick out each element of the BlockDataContainer, see that they are indeed ImageData, and print their shapes (number of pixels in each direction):
# The output is a BlockDataContainer
print(type(fd_out))
print(fd_out.shape)
print("\tDataContainer 0")
print(type(fd_out.get_item(0)))
print(fd_out.get_item(0))
print("\tDataContainer 1")
print(type(fd_out.get_item(1)))
print(fd_out.get_item(1))
The BlockFramework provides basic algebra between BlockDataContainers, numpy arrays, lists of numbers, DataContainers, subclasses and scalars, provided the shapes of the containers are compatible.
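For example, scalar multiplication and addition act block-wise (a small sketch using fd_out from above):
# Algebra on BlockDataContainers acts block-wise
doubled = 2 * fd_out        # scalar * BlockDataContainer
summed = fd_out + fd_out    # BlockDataContainer + BlockDataContainer
diff = summed - doubled     # should be zero in every block
print(diff.get_item(0).as_array().max(), diff.get_item(1).as_array().max())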
The BlockOperator is a special kind of Operator, and as an Operator it should have an adjoint method. This is provided automatically from the adjoints of the constituent operators. In the present case our BlockOperator takes a BlockDataContainer as input to its adjoint and returns an ImageData, as visualised below:
# Run the adjoint method
adjoint_output = FD.adjoint(fd_out)
show2D(adjoint_output, "adjoint gradient")
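We can also sanity-check that the adjoint is consistent with the direct operator via the inner product identity $\langle Fu, \textbf{w}\rangle = \langle u, F^*\textbf{w}\rangle$ for random $u$ and $\textbf{w}$. Recent CIL versions provide LinearOperator.dot_test for exactly this check (if your version lacks it, the same test can be coded by hand):
from cil.optimisation.operators import LinearOperator
# dot_test draws random inputs and checks <F u, w> against <u, F* w>
print(LinearOperator.dot_test(FD))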
A BlockDataContainer holds DataContainers as a column vector:
$$\textbf{x} = \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} \qquad \textbf{y} = \begin{bmatrix} y_{1}\\ y_{2}\\ y_{3} \end{bmatrix}$$
A BlockOperator is a matrix of operators, for example the 3x2 BlockOperator $K$:
$$ K\textbf{x} = \begin{bmatrix} A_{1} & A_{2} \\ A_{3} & A_{4} \\ A_{5} & A_{6} \end{bmatrix}_{(3,2)} \underbrace{\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}_{(2,1)}}_{\textbf{x}} = \begin{bmatrix} A_{1}x_{1} + A_{2}x_{2}\\ A_{3}x_{1} + A_{4}x_{2}\\ A_{5}x_{1} + A_{6}x_{2} \end{bmatrix}_{(3,1)} = \begin{bmatrix} y_{1}\\ y_{2}\\ y_{3} \end{bmatrix}_{(3,1)} = \textbf{y}$$
Operators in the same column share the same domain, and operators in the same row share the same range.
Now we go back to our Tikhonov reconstruction, this time using the gradient operator in the regulariser:
$$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\binom{A}{\alpha \nabla} u - \binom{b}{0}\end{Vmatrix}^2_2$$
With the definitions:
$\tilde{A} = \binom{A}{\alpha \nabla}$
$\tilde{b} = \binom{b}{0}$
And solve using CGLS: $$\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\tilde{A} u - \tilde{b}\end{Vmatrix}^2_2$$
We'll use the framework's GradientOperator - this is an optimised form of the finite difference block over the spatial dimensions (or space + channels in the case of multichannel data).
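As a quick check that GradientOperator behaves like our hand-built finite difference block, we can apply it to the test image (a sketch; the order of the two components follows the geometry's dimension labels):
# GradientOperator returns a BlockDataContainer with one component per dimension
grad = GradientOperator(shapes.geometry)
grad_out = grad.direct(shapes)
show2D([grad_out.get_item(0), grad_out.get_item(1)],
       ["component 0", "component 1"], fix_range=(-1, 1))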
Exercise 3: Set up the BlockOperator $\tilde{A}$ and the BlockDataContainer $\tilde{b}$ as before, but with the Gradient operator. Outline code to be completed is given in the next two code cells. Once set up, run the following cells to execute CGLS with these as input. Run the Tikhonov reconstruction using gradient regularisation for a range of $\alpha$ values from very small to very large, visualise the resulting images and central line profiles, and describe the effect of the regularisation parameter choice. Find the $\alpha$ that (visually) gives you the best solution.
L = GradientOperator(ig)
alpha = 0.01
#operator_block = BlockOperator( ... )
#define the data b
#data_block = BlockDataContainer( ... )
Uncomment the following line and run the cell to see the solution. To execute the loaded lines you'll need to run the cell a second time.
# %load ./snippets/02_ex3.py
# Set up CGLS with the BlockOperator and BlockDataContainer
initial = ig.allocate(0)
cgls_tikh_g = CGLS(initial=initial, operator=operator_block, data=data_block, update_objective_interval=10)
# Run the algorithm
cgls_tikh_g.run(200, verbose=True)
# Plot the results
plots = [cgls_tikh_g.solution, cgls_tikh_g.solution - phantom2D]
titles = ["Tikhonov with gradient regularisation", "Difference from ground truth" ]
show2D(plots, titles, fix_range=[(-0.2,1.2),(-0.2,0.2)])
Central vertical line profiles of ground truth and Tikhonov with Gradient operator:
show1D([phantom2D, cgls_tikh_g.solution],
slice_list=[("horizontal_y",int(n/2))],
dataset_labels=["Ground Truth","Tikhonov with Gradient regularisation"],
line_colours=['black','purple'],
line_styles=['solid','solid'])
To wrap up we compare the reconstructions produced by all reconstruction methods considered in this notebook: Simple CGLS, Tikhonov with Identity regularisation and Tikhonov with Gradient regularisation, along with the ground truth image. We display images and central vertical line profiles:
plots = [phantom2D, cgls_simple.solution, cgls_tikh.solution, cgls_tikh_g.solution]
titles = ["Ground truth", "CGLS simple", "Tikhonov with Identity regularisation", "Tikhonov with gradient regularisation" ]
show2D(plots, titles, fix_range=(-0.2,1.2), num_cols=4)
show1D([phantom2D, cgls_simple.solution, cgls_tikh.solution, cgls_tikh_g.solution],
slice_list=[("horizontal_y",int(n/2))],
dataset_labels=["Ground Truth","CGLS","Tikhonov with Identity regularisation", "Tikhonov with Gradient regularisation"],
line_colours=['black','dodgerblue','firebrick','purple'],
line_styles=['solid','dashed','dotted','solid'])
As can be seen from the images and line profiles, Tikhonov with Gradient regularisation reduces the noise in the reconstruction substantially. However, we may pay a price in terms of blurred edges.
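To put a number on the comparison, a simple root-mean-square error against the ground truth can be computed (a minimal sketch; RMSE is only one of many possible figures of merit):
# Root-mean-square error of each reconstruction against the ground truth
for label, recon in [("CGLS", cgls_simple.solution),
                     ("Tikhonov (Identity)", cgls_tikh.solution),
                     ("Tikhonov (Gradient)", cgls_tikh_g.solution)]:
    rmse = np.sqrt(np.mean((recon.as_array() - phantom2D.as_array())**2))
    print(f"{label}: RMSE = {rmse:.4f}")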
Learning objectives:
After having worked through this notebook, we have now seen how to:
- set up and run CGLS for a least squares reconstruction problem and observe its implicit regularisation;
- formulate Tikhonov regularisation as a block least squares problem and solve it with CGLS;
- construct and use BlockOperator and BlockDataContainer, including finite difference and gradient operators;
- study the effect of the regularisation parameter $\alpha$ and of the choice of regularising operator.
This completes the present exercise on Tikhonov regularisation with the CGLS algorithm. In other exercises we will see how to use the CIL framework for real data reconstruction and how to do regularisation based on non-smooth optimisation, to help preserve edges better, while reducing the noise.