Important: Please read the installation page for details about how to install the toolboxes. $\newcommand{\dotp}[2]{\langle #1, #2 \rangle}$ $\newcommand{\enscond}[2]{\lbrace #1, #2 \rbrace}$ $\newcommand{\pd}[2]{ \frac{ \partial #1}{\partial #2} }$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\umax}[1]{\underset{#1}{\max}\;}$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\uargmin}[1]{\underset{#1}{argmin}\;}$ $\newcommand{\norm}[1]{\|#1\|}$ $\newcommand{\abs}[1]{\left|#1\right|}$ $\newcommand{\choice}[1]{ \left\{ \begin{array}{l} #1 \end{array} \right. }$ $\newcommand{\pa}[1]{\left(#1\right)}$ $\newcommand{\diag}[1]{{diag}\left( #1 \right)}$ $\newcommand{\qandq}{\quad\text{and}\quad}$ $\newcommand{\qwhereq}{\quad\text{where}\quad}$ $\newcommand{\qifq}{ \quad \text{if} \quad }$ $\newcommand{\qarrq}{ \quad \Longrightarrow \quad }$ $\newcommand{\ZZ}{\mathbb{Z}}$ $\newcommand{\CC}{\mathbb{C}}$ $\newcommand{\RR}{\mathbb{R}}$ $\newcommand{\EE}{\mathbb{E}}$ $\newcommand{\Zz}{\mathcal{Z}}$ $\newcommand{\Ww}{\mathcal{W}}$ $\newcommand{\Vv}{\mathcal{V}}$ $\newcommand{\Nn}{\mathcal{N}}$ $\newcommand{\NN}{\mathcal{N}}$ $\newcommand{\Hh}{\mathcal{H}}$ $\newcommand{\Bb}{\mathcal{B}}$ $\newcommand{\Ee}{\mathcal{E}}$ $\newcommand{\Cc}{\mathcal{C}}$ $\newcommand{\Gg}{\mathcal{G}}$ $\newcommand{\Ss}{\mathcal{S}}$ $\newcommand{\Pp}{\mathcal{P}}$ $\newcommand{\Ff}{\mathcal{F}}$ $\newcommand{\Xx}{\mathcal{X}}$ $\newcommand{\Mm}{\mathcal{M}}$ $\newcommand{\Ii}{\mathcal{I}}$ $\newcommand{\Dd}{\mathcal{D}}$ $\newcommand{\Ll}{\mathcal{L}}$ $\newcommand{\Tt}{\mathcal{T}}$ $\newcommand{\si}{\sigma}$ $\newcommand{\al}{\alpha}$ $\newcommand{\la}{\lambda}$ $\newcommand{\ga}{\gamma}$ $\newcommand{\Ga}{\Gamma}$ $\newcommand{\La}{\Lambda}$ $\newcommand{\si}{\sigma}$ $\newcommand{\Si}{\Sigma}$ $\newcommand{\be}{\beta}$ $\newcommand{\de}{\delta}$ $\newcommand{\De}{\Delta}$ $\newcommand{\phi}{\varphi}$ $\newcommand{\th}{\theta}$ $\newcommand{\om}{\omega}$ $\newcommand{\Om}{\Omega}$
This numerical tour uses several orthogonal bases to perform non-linear image approximation.
from __future__ import division
import numpy as np
import scipy as scp
import pylab as pyl
import matplotlib.pyplot as plt
from nt_toolbox.general import *
from nt_toolbox.signal import *
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
%load_ext autoreload
%autoreload 2
This tours makes use of an orthogonal base $ \Bb = \{ \psi_m \}_{m=0}^{N-1} $ of the space $\RR^N$ of the images with $N$ pixels.
The best $M$-term approximation of $f$ is obtained by a non-linear thresholding
$$ f_M = \sum_{ \abs{\dotp{f}{\psi_m}}>T } \dotp{f}{\psi_m} \psi_m, $$where the value of $T>0$ should be carefully selected so that only $M$ coefficients are not thresholded, i.e.
$$ \abs{ \enscond{m}{ \abs{\dotp{f}{\psi_m}}>T } } = M. $$The goal is to use an ortho-basis $ \Bb $ so that the error $ \norm{f-f_M} $ decays as fast as possible when $M$ increases, for a large class of images.
This tour studies several different orthogonal bases: Fourier, wavelets (which is at the heart of JPEG-2000), cosine, local cosine (which is at the heart of JPEG).
First we load an image of $ N = n \times n $ pixels.
n = 512
f = rescale(load_image("nt_toolbox/data/hibiscus.bmp", n))
Display it.
plt.figure(figsize = (5,5))
imageplot(f)
The discrete 2-D Fourier atoms are defined as: $$ \psi_m(x) = \frac{1}{\sqrt{N}} e^{ \frac{2i\pi}{n} ( x_1 m_1 + x_2 m_2 ) }, $$ where $ 0 \leq m_1,m_2 < n $ indexes the frequency.
The set of inner products $ \{ \dotp{f}{\psi_m} \}_m $ is computed in $O(N \log(N))$ operations with the 2-D Fast Fourier Transform (FFT) algorithm (the pylab function is fft2).
Compute the Fourier transform using the FFT algorithm. Note the normalization by $1/\sqrt{N}$ to ensure orthogonality (energy conservation) of the transform.
fF = pyl.fft2(f)/n
Display its magnitude (in log scale). We use the pylab function fftshift to put the low frequency in the center.
plt.figure(figsize = (5,5))
imageplot(np.log(1e-5 + abs(pyl.fftshift(fF))))
An image is recovered from a set of coefficients $c_m$ using the inverse Fourier Transform (pylab function ifft2)) that implements the formula
$$ f_M = \sum_m c_m \psi_m. $$Perform a thresholding.
T = .3
c = np.multiply(fF,(abs(fF) > T))
Inverse the Fourier transform.
fM = np.real(pyl.ifft2(c)*n)
Display the approximation.
plt.figure(figsize = (5,5))
imageplot(clamp(fM))
Exercise 1
Compute a best $M$-term approximation in the Fourier basis of $f$, for $M \in \{N/100, N/20\}$. Compute the approximation using a well chosen hard threshold value $T$.
run -i nt_solutions/coding_1_approximation/exo1
## Insert your code here.
The best $M$-term approximation error is computed using the conservation of energy as
$$ \epsilon[M]^2 = \norm{f-f_M}^2 = \sum_{ \abs{\dotp{f}{\psi_m}} \leq T } \abs{\dotp{f}{\psi_m}}^2. $$If one denotes by $ \{ c_R[k] \}_{k=0}^{N-1} $ the set of coefficients magnitudes $ \abs{\dotp{f}{\psi_m}} $ ordered by decaying magnitudes, then this error is easily computed as $$ \epsilon[M]^2 = \sum_{k=M}^{N-1} c_R[k]^2 = \norm{f}^2 - \sum_{k=0}^{M-1} c_R[k]^2. $$ This means that $\epsilon^2$ is equal to $\norm{f}^2$ minus the discrete primitive of $ c_R^2 $.
Exercise 2
Compute and display in log scales the ordered coefficients $c_R$. Hint: a discrete primitive can be computed using the numpy function cumsum.
run -i nt_solutions/coding_1_approximation/exo2
## Insert your code here.
Exercise 3
Compute and display in log-scale the non-linear approximation error $\epsilon[M]^2$. Store the values of $\epsilon[M]^2$ in a vector $err\_fft$.
run -i nt_solutions/coding_1_approximation/exo3
## Insert your code here.
The Wavelet basis of continuous 2-D functions is defined by by scaling and translating three mother atoms $ \{\psi^H,\psi^V,\psi^D\} $: $$ \psi_{j,n}^k(x) = \frac{1}{2^j}\psi^k\pa{\frac{x-2^j n}{2^j}} $$
Non-linear wavelet approximation is a the heart of the JPEG-2000 compression standard.
The set of inner products $ \{ \dotp{f}{\psi_m} \}_m $ is computed in $O(N)$ operations with the 2-D Fast Wavelet Transform algorithm.
Perform a wavelet transform. Here we use a daubechies wavelet transform.
from numpy import linalg
from nt_toolbox.compute_wavelet_filter import *
Jmin = 1
h = compute_wavelet_filter("Daubechies",10)
fW = perform_wavortho_transf(f, Jmin, + 1, h)
Display the coefficients.
plt.figure(figsize = (8,8))
plot_wavelet(fW, Jmin)
plt.title('Wavelet coefficients')
plt.show()
Exercise 4
Compute a best $M$-term approximation in the wavelet basis of $f$, for $M \in \{N/100, N/20\}$. Compute the approximation using a well chosen hard threshold value $T$. Note that the inverse wavelet transform is obtained by replacing the +1 by a -1 in the definition of the transform.
run -i nt_solutions/coding_1_approximation/exo4
## Insert your code here.
Exercise 5
Compute and display in log-scale the non-linear approximation error $\epsilon[M]^2$. Compares the Fourier and wavelets approximations. Store the values of $\epsilon[M]^2$ in a vector $err\_wav$.
run -i nt_solutions/coding_1_approximation/exo5
## Insert your code here
The discrete cosine approximation (DCT) is similar to the Fourier approximation, excepted that it used symmetric boundary condition instead of periodic boundary condition, and is thus more useful to approximate image.
A 1-D cosine atom of $n$ sample is defined as $$ \bar\psi_m(x) = \frac{1}{\sqrt{N}} \cos\pa{ \frac{2\pi}{N} (x-1/2) m } $$ A 2-D cosine atom is obtained by tensor product of 1-D atoms $$ \psi_{m_1,m_2}(x_1,x_2) = \bar\psi_{m_1}(x_1) \bar\psi_{m_2}(x_2). $$ On the contrary to the Fourier 2-D atoms, these 2-D DCT atoms are not oriented (they contains 4 Fourier frequencies).
The set of inner products $ \{ \dotp{f}{\psi_m} \}_m $ is computed in $O(N \log(N))$ operations with the 2-D Fast Cosine Transform algorithm.
from scipy import fftpack
def dct2(f):
return np.transpose(fftpack.dct(np.transpose(fftpack.dct(f, norm = "ortho")), norm = "ortho"))
def idct2(f):
return np.transpose(fftpack.idct(np.transpose(fftpack.idct(f, norm = "ortho")), norm = "ortho"))
fC = dct2(f)
Display the magnitude of the DCT coefficients. Note that the low frequencies are in the upper-left corner.
plt.figure(figsize = (5,5))
imageplot(np.log(1e-5 + abs(fC)))
Exercise 6
Compute a best $M$-term approximation in the wavelet basis of $f$, for $M \in \{N/100, N/20\}$. Compute the approximation using a well chosen hard threshold value $T$. Note that the inverse DCT transform is obtained with the function idct.
run -i nt_solutions/coding_1_approximation/exo6
## Insert your code here.
Exercise 7
Compute and display in log-scale the non-linear approximation error $\epsilon[M]^2$. Compares the Fourier and DCT approximations. Store the values of $\epsilon[M]^2$ in a vector $err\_dct$.
run -i nt_solutions/coding_1_approximation/exo7
## Insert your code here.
To improve the global DCT approximation, one can approximate independantly small patches in the image. This corresponds to a decomposition in a local cosine basis, which is at the heart of the JPEG image compression standard.
The only parameter of the transform is the size of the square.
w = 16
Initialize at zero the transformed image in the local DCT basis.
fL = np.zeros([n, n])
Example of patch index.
i = 5
j = 7
For a given path index $(i,j)$, we extract a $(w,w)$ patch.
P = f[(i-1)*w: i*w, (j-1)*w: j*w]
Compute the Cosine transform of the patch using the fast DCT algorithm.
fL[(i-1)*w: i*w, (j-1)*w: j*w] = dct2(P)
Display the patch and its coefficients. We removed the low frequency of $P$ for display purpose only.
plt.figure(figsize = (8,8))
imageplot(P, 'Patch', [1, 2, 1])
imageplot(dct2(P-np.mean(P)), 'DCT', [1, 2, 2])
Exercise 8
Compute the local DCT transform $f_L$ by transforming each patch.
run -i nt_solutions/coding_1_approximation/exo8
## Insert your code here.
Display the coefficients.
plt.figure(figsize = (5,5))
imageplot(np.clip(abs(fL),0,.005*w*w))
Exercise 9
Compute the inverse local DCT transform of the coefficients $f_L$ by inverse transforming each patch using the function idct2.
run -i nt_solutions/coding_1_approximation/exo9
Error |f-f1|/|f| = 2.0643856717379472e-16
## Insert your code here.
Exercise 10
Compute a few best $M$-term approximations in the Local DCT basis of $f$.
run -i nt_solutions/coding_1_approximation/exo10
## Insert your code here.
Exercise 11
Compute and display in log-scale the non-linear approximation error $\epsilon[M]^2$. Store the values of $\epsilon[M]^2$ in a vector |err_ldct|. Compares the Fourier, Wavelets, DCT and local-DCT approximations.
run -i nt_solutions/coding_1_approximation/exo11
## Insert your code here.
An image is more complicated than an other one for a given orthogonal basis if its approximation error decays more slowly.
First load several high resolution images.
n = 512
fList = np.zeros([n,n,4])
fList[: , : , 0] = rescale(load_image("nt_toolbox/data/regular3.bmp", n))
fList[: , : , 1] = rescale(load_image("nt_toolbox/data/phantom.bmp", n))
fList[: , : , 2] = rescale(load_image("nt_toolbox/data/lena.bmp", n))
fList[: , : , 3] = rescale(load_image("nt_toolbox/data/mandrill.bmp", n))
Display them.
plt.figure(figsize = (7,7))
for i in range(4):
imageplot(fList[: , : , i], '', [2,2,i+1])
Exercise 12
Compare the approximation error decay for those images. Display $ \log_{10}(\norm{f-f_M}) $ as a function of $\log_{10}(M)$.
run -i nt_solutions/coding_1_approximation/exo12
## Insert your code here.