conda env create -f eden_transfer_learning.yml
Note: If you find any issues while executing the notebook, don't hesitate to open an issue on GitHub. We will reply to you as soon as possible.
This notebook offers a first hands-on experience with simple image processing techniques, to give a better understanding of what an "image" is from a computer vision perspective. Image (pre-)processing is an integral step in most artificial intelligence pipelines. There are many different techniques, such as histogram equalization, median filtering, and negative image transformation. In this notebook we will focus on histogram equalization. Other techniques can be found in the notebooks published on the Eden Library platform. Take a look!
Contrast is a parameter that expresses how bright and dark pixels are distributed in the image. Because of complex, dynamic lighting conditions or wrong camera configurations, the bright and dark areas of some images can blend together, producing images with a large number of either very dark or very bright pixels, which makes distinguishing certain relevant features significantly harder. Consequently, this problem can even reduce the performance of a final machine learning classifier. As shown in the figure, the pixel values of such an image are gathered around certain peaks of the histogram (see the left side of the figure).
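One rough way to spot such an image is to measure how many pixels fall into the darkest and brightest parts of the intensity range. The sketch below is only an illustration: the helper name `tail_fractions` and the thresholds 50 and 205 are assumptions chosen for this example, not values used elsewhere in the notebook.

```python
import numpy as np

def tail_fractions(gray, dark_max=50, bright_min=205):
    """Return the fraction of very dark and very bright pixels in a grayscale image.

    The thresholds are hypothetical defaults for illustration only.
    """
    gray = np.asarray(gray)
    dark = float(np.mean(gray <= dark_max))
    bright = float(np.mean(gray >= bright_min))
    return dark, bright

# Synthetic "overexposed" image: 80% of the pixels are very bright.
overexposed = np.full((10, 10), 240, dtype=np.uint8)
overexposed[:2, :] = 120

dark, bright = tail_fractions(overexposed)
print(dark, bright)  # 0.0 0.8
```

A large value in either tail suggests that the histogram is concentrated around a peak and that the image may benefit from contrast adjustment.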
Histogram equalization is a basic image processing technique that adjusts the global contrast of an image by spreading out the pixel intensity distribution of its histogram. As a result, areas of low contrast obtain higher contrast in the output image. Summing up: the result of applying histogram equalization is an image with higher global contrast (see the right side of the figure).
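The idea can be sketched in plain NumPy, independently of OpenCV: build the cumulative distribution function (CDF) of the 8-bit histogram and use it as a lookup table that stretches the occupied intensity range over 0 to 255. The function name `equalize_histogram` and the synthetic test image are illustrative assumptions, and the output may differ slightly from `cv2.equalizeHist` because of rounding.

```python
import numpy as np

def equalize_histogram(gray):
    """Equalize an 8-bit single-channel image with a CDF-based lookup table."""
    hist = np.bincount(gray.ravel(), minlength=256)  # pixel count per intensity
    cdf = hist.cumsum()                              # cumulative pixel count
    cdf_min = cdf[cdf > 0][0]                        # CDF at the darkest intensity present
    total = cdf[-1]                                  # total number of pixels
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return gray.copy()
    # Map the darkest intensity present to 0 and the brightest to 255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Synthetic low-contrast image: all values lie between 100 and 140.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 141, size=(64, 64), dtype=np.uint8)
equalized = equalize_histogram(low_contrast)

print(low_contrast.min(), low_contrast.max())
print(equalized.min(), equalized.max())
```

After equalization the darkest pixel value present maps to 0 and the brightest to 255, so the narrow input range is spread over the full 8-bit range.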
From a computer vision point of view, an image can be described as a multi-layer 2D matrix. Each cell of a layer represents the respective pixel in the image. These pixel values are the values recorded by the sensor (e.g. a camera) for a specific position and spectral band. The most common form of imagery is RGB, which consists of three layers for the Red, Green and Blue spectra.
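As a small illustration of this representation, the sketch below builds a tiny RGB image as a NumPy array of shape (height, width, 3) and reads back one layer and one pixel. The image size and pixel values are arbitrary choices made for this example.

```python
import numpy as np

# A 4x6 RGB image: height x width x 3 layers, 8 bits per value.
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)

image[:, :, 0] = 255         # fill the red layer everywhere
image[1, 2] = (0, 255, 0)    # overwrite one pixel with pure green

print(image.shape)           # (4, 6, 3)

red_layer = image[:, :, 0]   # a single layer is an ordinary 2D matrix
print(red_layer.shape)       # (4, 6)

print(image[1, 2].tolist())  # [0, 255, 0]
```

Slicing the last axis selects one spectral band, while indexing the first two axes selects the values recorded for one pixel position.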
import matplotlib.pyplot as plt
import numpy as np
import cv2
from tqdm import tqdm
from glob import glob
from pathlib import Path
import random
def plot_sample(X):
    """Plot a 3x3 grid of images drawn at random from X."""
    nb_rows = 3
    nb_cols = 3
    fig, axs = plt.subplots(nb_rows, nb_cols, figsize=(6, 6))
    for i in range(nb_rows):
        for j in range(nb_cols):
            axs[i, j].xaxis.set_ticklabels([])
            axs[i, j].yaxis.set_ticklabels([])
            # randrange excludes the upper bound, so every index is valid and equally likely.
            axs[i, j].imshow(X[random.randrange(X.shape[0])])
def read_data(path_list, im_size=(128, 128)):
    """Read every image found under the given paths and return them as one RGB array."""
    X = []
    for path in path_list:
        for im_file in tqdm(glob(path + '*/*')):
            try:
                im = cv2.imread(im_file)
                # Resize to the target dimensions. You can try different interpolation methods.
                im = cv2.resize(im, im_size, interpolation=cv2.INTER_LINEAR)
                # OpenCV reads images in BGR order by default; convert to RGB.
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
                X.append(im)
            except Exception:
                # Files that are not images (e.g. annotations or metadata) end up here.
                print("Not a picture")
    X = np.array(X)  # Convert list to numpy array.
    return X
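def plot_channel_histogram(image, ax):
    """Plot the histogram of a single-channel image with its CDF overlaid."""
    hist, bins = np.histogram(image.flatten(), 256, [0, 256])
    cdf = hist.cumsum()
    # Rescale the CDF so that its maximum matches the tallest histogram bar.
    cdf_normalized = cdf * float(hist.max()) / cdf.max()
    ax.plot(cdf_normalized, color='b')
    ax.hist(image.flatten(), bins=256, range=(0, 256), color='r')
    ax.legend(('cdf', 'histogram'), loc='upper left')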
def plot_image_histogram(image, ax):
    """Plot one histogram per channel of an RGB image."""
    color = ('r', 'g', 'b')
    for i, col in enumerate(color):
        histr = cv2.calcHist([image], [i], None, [256], [0, 256])
        ax.plot(histr, color=col)
IM_SIZE = (128, 128)
# Datasets' paths we want to work on.
PATH_LIST = ['eden_data/Black nightsade-220519-Weed-zz-V1-20210225102034']
Some of the pictures inside this dataset are overexposed. Therefore, we will use histogram equalization to improve the image quality.
for i, path in enumerate(PATH_LIST):
    # Define paths in an OS-agnostic way, relative to the parent of the working directory.
    PATH_LIST[i] = str(Path.cwd().parents[0].joinpath(path))
X = read_data(PATH_LIST, IM_SIZE)
plot_sample(X)
100%|██████████| 123/123 [00:26<00:00, 4.64it/s]
SAMPLE_IMAGE_INDEX = 30
im_sample = X[SAMPLE_IMAGE_INDEX]
plt.imshow(im_sample)
plt.show()
In this first example, we will work with a grayscale version of the picture. In other words, only 1 channel is going to be equalized.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12,4))
fig.suptitle('Overexposed picture')
# Transform RGB image into a grayscale version
gray = cv2.cvtColor(im_sample, cv2.COLOR_RGB2GRAY)
ax1.imshow(gray, cmap="gray")
plot_channel_histogram(gray, ax2)
plt.show()
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12,4))
fig.suptitle('Overexposed picture AFTER Histogram Equalization')
# Equalizing grayscale image
equ = cv2.equalizeHist(gray)
ax1.imshow(equ, cmap="gray")
plot_channel_histogram(equ, ax2)
plt.show()
In this second example, we will work with the original RGB version of the picture. For that reason, each of the 3 channels will be equalized separately.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12,4))
fig.suptitle('Overexposed picture')
ax1.imshow(im_sample)
plot_image_histogram(im_sample, ax2)
plt.show()
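To perform histogram equalization on a multi-channel image it is necessary to:
1. Split the image into its individual channels.
2. Equalize the histogram of each channel separately.
3. Merge the equalized channels back into a single image.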
NUM_CHANNELS = 3
# Algorithm: Steps 1 and 2
eqs = [cv2.equalizeHist(im_sample[:,:,i])[:,:,np.newaxis] for i in range(NUM_CHANNELS)]
# Algorithm: Step 3
equalized_image = cv2.merge((eqs[0], eqs[1], eqs[2]))
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12,4))
fig.suptitle('Overexposed picture AFTER Histogram Equalization')
ax1.imshow(equalized_image)
plot_image_histogram(equalized_image, ax2)
plt.show()
We can also rearrange the order in which the bands are displayed. This technique is used by many applications that work with multispectral data and is widely referred to as "pseudo-color", or simply as false color composites.
plt.imshow(cv2.merge((eqs[1], eqs[0], eqs[2])))
plt.show()