Notebook

Image colorization¶

$$ \newcommand{\eg}{{\it e.g.}} \newcommand{\ie}{{\it i.e.}} \newcommand{\argmin}{\operatornamewithlimits{argmin}} \newcommand{\mc}{\mathcal} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathbf} \newcommand{\minimize}{{\text{minimize}}} \newcommand{\argmin}{{\text{argmin}}} \newcommand{\diag}{{\text{diag}}} \newcommand{\cond}{{\text{cond}}} \newcommand{\rank}{{\text{rank }}} \newcommand{\range}{{\mathcal{R}}} \newcommand{\null}{{\mathcal{N}}} \newcommand{\tr}{{\text{trace}}} \newcommand{\dom}{{\text{dom}}} \newcommand{\dist}{{\text{dist}}} \newcommand{\E}{\mathbf{E}} \newcommand{\var}{\mathbf{var}} \newcommand{\R}{\mathbf{R}} \newcommand{\B}{\mathbf{B}} \newcommand{\SM}{\mathbf{S}} \newcommand{\ball}{\mathcal{B}} \newcommand{\bmat}[1]{\begin{bmatrix}#1\end{bmatrix}} $$

EE786: Optimization applications, Kyung Hee University.

Jong-Han Kim (jonghank@khu.ac.kr)

A color image is represented by an $m\times n \times 3$ matrix of RGB intensities, so each pixel $p_{ij}$ can be represented by a three dimensional vector,

$$ p_{ij}=\bmat{r_{ij} \\ g_{ij} \\ b_{ij} }\in\R^3 $$

A natural grayscale intensity $w_{ij}$ can be obtained from the $r_{ij}$, $g_{ij}$, $b_{ij}$ intensities by the following relations, i.e., the grayscale intensity can be interpreted as a linear combination of the R/G/B intensity values.

\begin{align*} w_{ij} &= 0.299r_{ij} + 0.587g_{ij} + 0.114b_{ij} \\ &= \bmat{0.299 & 0.587 & 0.114}p_{ij} \end{align*}

In this problem you will be given an $m\times n \times 3$ image, most of whose color information is lost, and you will work on reconstructing the lost color information.

The following describes how those full color information or grayscale information are represented: Each pixel in a color image consists of a three dimensional vector whose components describes the R/G/B intensities, however in this mostly colorless image, approximately 2% pixels contain those color information,

$$ p_{ij} = \bmat{r_{ij} \\ g_{ij} \\ b_{ij}} $$

while the rest 98% pixels contain three dimensional vectors whose three components are the grayscale intensity.

$$ p_{ij} = \bmat{w_{ij} \\ w_{ij} \\ w_{ij}} $$

So by looking at the three components in each pixel, you can identify whether the corresponding pixel contains the correct color information, or not. If a pixel contains a three dimensional vector with all different components, that means the pixel contains the correct color information. If a pixel contains a three dimensional vector with identical components, that means the pixel's color information is lost.

Your job is to reconstruct the color information for those 98% pixels.

The following loads an $m\times n \times 3$ mostly colorless PNG image.

In [ ]:

import numpy as np
import matplotlib.pyplot as plt

khu = plt.imread('http://jonghank.github.io/ee786/files/khu_x_2.png')
m, n = khu.shape[0:2]

khu_r = khu[:,:,0]
khu_g = khu[:,:,1]
khu_b = khu[:,:,2]

plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(khu)
plt.axis('off')
plt.title('Loaded image')
plt.show()

The image pretty much looks like a grayscale image. However if you zoom in, you will see some of them actually contain the color information.

In [ ]:

plt.figure()
plt.imshow(khu[110:150, 20:60, :])
plt.axis('off')
plt.title('Image zoomed in')
plt.show()

Identify the pixels with full color information, by comparing R/G/B intensities. In other words, find $m_\text{known}$, the row indices, and $n_\text{known}$, the column indices for the pixels with full color information, so that

$$ p_{ij} = \bmat{r_{ij} \\ g_{ij} \\ b_{ij}}, \qquad \text{if } i\in m_\text{known} \text{ and } j\in n_\text{known} $$

and $p_{ij} = \bmat{w_{ij} & w_{ij} & w_{ij}}^T$ otherwise.

In [ ]:

# your code here

diff_rgb = np.abs(khu_r - khu_g) + np.abs(khu_g - khu_b)
m_known, n_known = np.where(diff_rgb>1e-3)

Known_pixels = np.zeros((m,n,3))

Known_pixels[m_known, n_known, 0] = khu_r[m_known, n_known]
Known_pixels[m_known, n_known, 1] = khu_g[m_known, n_known]
Known_pixels[m_known, n_known, 2] = khu_b[m_known, n_known]

plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(Known_pixels)
plt.axis('off')
plt.title('Pixels with full color information')
plt.show()

Now find the grayscale intensities for all the pixels, i.e., compute $w_{ij}$ for all $i$ and $j$.

In [ ]:

# your code here

khu_gray = 0.299*khu_r + 0.587*khu_g + 0.114*khu_b

plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(khu_gray, cmap='gray')
plt.axis('off')
plt.title('Full grayscale image')
plt.show()

Now paint the image with appropriate colors, by reconstructing the lost color information. So more formally, find

$$ p_{ij} = \bmat{r_{ij} \\ g_{ij} \\ b_{ij}}, \qquad \text{for } i\notin m_\text{known} \text{ or } j\notin n_\text{known} $$

that minimizes the $\ell_2$ total variation of the image defined by

$$ \text{tv}(p) = \sum_i^{m-1} \sum_j ^{n-1} \left\| \bmat{p_{i+1,j}-p_{ij} \\ p_{i,j+1}-p_{ij} } \right\|_2 $$

You may refer to the TV inpainting example page from the cvxpy tutorials at https://www.cvxpy.org/examples/applications/tv_inpainting.html, for handling color images for cvxpy.tv() function.

Also, your choice of objective function is not limited to the above total variation function. You can change it, or you can add regularizers, or you can do whatever you want. The job is just to find a natural color image that (somehow) matches the given information.

In [ ]:

# your code here

import cvxpy as cp

Xr = cp.Variable((m,n))
Xg = cp.Variable((m,n))
Xb = cp.Variable((m,n))

variables = [Xr, Xg, Xb]

obj = cp.tv(*variables) + 1000*cp.sum_squares(0.299*Xr+0.587*Xg+0.114*Xb-khu_gray)
obj = cp.Minimize(obj)
constraints = [
    Xr[m_known, n_known] == khu_r[m_known, n_known] ,
    Xg[m_known, n_known] == khu_g[m_known, n_known] , 
    Xb[m_known, n_known] == khu_b[m_known, n_known] 
]

prob = cp.Problem(obj, constraints)
prob.solve(verbose=True, solver=cp.SCS)

Xrec = np.zeros((m, n, 3))
Xrec[:,:,0] = np.clip(Xr.value, 0, 1)
Xrec[:,:,1] = np.clip(Xg.value, 0, 1)
Xrec[:,:,2] = np.clip(Xb.value, 0, 1)

plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(khu)
plt.title('Grayscale with 2% color pixels')
plt.axis('off')
plt.show()
plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(Xrec)
plt.title('Colorized')
plt.axis('off')
plt.show()

----------------------------------------------------------------------------
	SCS v2.1.0 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
----------------------------------------------------------------------------
Lin-sys: sparse-direct, nnz in A = 2479991
eps = 1.00e-05, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
acceleration_lookback = 10, rho_x = 1.00e-03
Variables n = 619570, constraints m = 1244178
Cones:	primal zero / dual free vars: 9192
	linear vars: 1
	soc vars: 1234985, soc blks: 154270
Setup time: 3.66e+01s
----------------------------------------------------------------------------
 Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
----------------------------------------------------------------------------
     0| 1.08e+22  1.44e+22  1.00e+00 -1.26e+27  8.94e+25  9.19e+24  6.74e-01 
   100| 8.73e-05  1.53e-04  7.19e-06  1.05e+04  1.05e+04  2.05e-12  6.11e+01 
   200| 1.10e-05  1.94e-05  3.74e-07  1.05e+04  1.05e+04  3.33e-12  1.21e+02 
   260| 4.29e-06  8.24e-06  1.44e-07  1.05e+04  1.05e+04  1.19e-11  1.57e+02 
----------------------------------------------------------------------------
Status: Solved
Timing: Solve time: 1.57e+02s
	Lin-sys: nnz in L factor: 43611055, avg solve time: 2.72e-01s
	Cones: avg projection time: 3.32e-03s
	Acceleration: avg step time: 2.66e-01s
----------------------------------------------------------------------------
Error metrics:
dist(s, K) = 2.2204e-16, dist(y, K*) = 2.2204e-16, s'y/|s||y| = 2.7560e-16
primal res: |Ax + s - b|_2 / (1 + |b|_2) = 4.2940e-06
dual res:   |A'y + c|_2 / (1 + |c|_2) = 8.2387e-06
rel gap:    |c'x + b'y| / (1 + |c'x| + |b'y|) = 1.4387e-07
----------------------------------------------------------------------------
c'x = 10529.0138, -b'y = 10529.0108
============================================================================

Compare your painting with the original color image which can be downloaded from http://jonghank.github.io/ee786/files/khu_entrance.png. The original file is provided just for comparison; you are not allowed to use anything from it.

In [ ]:

# your code here

khu_color = plt.imread('http://jonghank.github.io/ee786/files/khu_entrance.png')

plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(khu)
plt.title('Grayscale with 2% color pixels')
plt.axis('off')
plt.subplot(122)
plt.imshow(Known_pixels)
plt.axis('off')
plt.title('Pixels with full color information')
plt.show()
plt.figure(figsize=(18,9))
plt.subplot(121)
plt.imshow(Xrec)
plt.title('Colorized')
plt.axis('off')
plt.subplot(122)
plt.imshow(khu_color)
plt.title('Original color image')
plt.axis('off')
plt.show()

In [ ]: