In this project we'll build a dog and cat classifier from scratch using Logistic Regression from the scikit-learn library. For some supplementary information about this project, refer to the 4th article of the Demystifying Machine Learning series.
In this tutorial we'll focus on:
- How to gather data
- How to pre-process data for training
- How to label the images as `cat` or `dog` for training
- How to create training and testing sets
Finally, we'll use our pre-processed data to train our logistic regression model. I'll show you how easy it is to train machine learning algorithms using the scikit-learn library, which already contains wrappers for many ML algorithms.
- `requests`: for downloading the compressed (zipped) dataset
- `zipfile`: for extracting the dataset
- `os`: for getting paths and file names from the system
- `PIL`: for applying image transformations
- `matplotlib`: for displaying images and graphs
- `numpy`: for array manipulation
- `sklearn`: library containing wrappers for ML algorithms
- `seaborn`: for plotting heatmaps

# importing useful libraries
import requests
import zipfile
import os
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
import seaborn as sns
np.random.seed(42)
Below are some helper functions that make the work in this notebook easier.
- `download`: downloads the compressed dataset directly from this notebook
- `extract_zip`: extracts the zipped file
- `center_crop`: resizes an image to roughly the given size and crops a `size` x `size` square around its center

def download(url, file_name):
    """
    Downloads the dataset for the project
    input:
        url (string): url of the dataset
        file_name (string): file name in which the downloaded dataset will be stored
    """
    response = requests.get(url, stream=True)
    # the file is only written when the server answers with 200 (OK)
    if response.status_code == 200:
        with open(file_name, 'wb') as f:
            f.write(response.raw.read())

def extract_zip(s_path, d_path):
    """
    Extract (unzip) the compressed dataset
    input:
        s_path (string): path of the zipped dataset
        d_path (string): path to store the unzipped dataset
    """
    with zipfile.ZipFile(s_path, 'r') as zip_ref:
        zip_ref.extractall(d_path)

def center_crop(image_path, size):
    """
    Resize the image to (size+1) x (size+1) and crop a size x size square from its center
    input:
        image_path (string): path of the image
        size (int): side length of the cropped square
    """
    img = Image.open(image_path)
    img = img.resize((size+1,size+1))
    x_center = img.width/2
    y_center = img.height/2
    size = size/2
    cr = img.crop((x_center-size, y_center-size, x_center+size, y_center+size))
    return cr
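Note that `center_crop` resizes the whole image to a square first, so non-square photos get squashed. If preserving the aspect ratio matters, one option is to cut out the central square before resizing. The sketch below illustrates that idea on a plain NumPy array; `central_square` is a hypothetical helper and not part of the original notebook.

```python
import numpy as np

def central_square(arr):
    """Return the largest centered square from an (H, W, C) image array."""
    h, w = arr.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return arr[top:top + side, left:left + side]

# a fake 4x6 RGB "image" whose red channel stores the column index
demo = np.zeros((4, 6, 3), dtype=np.uint8)
demo[:, :, 0] = np.arange(6)

square = central_square(demo)
print(square.shape)               # (4, 4, 3)
print(square[0, :, 0].tolist())   # [1, 2, 3, 4]
```

The resulting array could then be turned back into an image with `Image.fromarray(square)` and resized to 100x100 before flattening.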
download("https://s3.amazonaws.com/content.udacity-data.com/nd089/Cat_Dog_data.zip", "images.zip")
extract_zip("images.zip", "images")
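To see what `extract_zip` does without downloading the full dataset, here is a small sketch that builds a throwaway archive in a temporary directory and extracts it. The file names inside the archive are made up to mimic the dataset layout, and the file contents are placeholders, not real images.

```python
import os
import tempfile
import zipfile

def extract_zip(s_path, d_path):
    with zipfile.ZipFile(s_path, 'r') as zip_ref:
        zip_ref.extractall(d_path)

with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "demo.zip")
    # build a tiny archive with a folder layout similar to the dataset
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("Cat_Dog_data/train/cat/cat.0.jpg", b"placeholder")
        zf.writestr("Cat_Dog_data/train/dog/dog.0.jpg", b"placeholder")

    out_dir = os.path.join(tmp, "images")
    extract_zip(zip_path, out_dir)

    train_dir = os.path.join(out_dir, "Cat_Dog_data", "train")
    extracted = sorted(os.listdir(train_dir))
    print(extracted)  # ['cat', 'dog']
```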
- `Image.open()`: takes the path of an image and returns a `JpegImageFile` object, which we can then plot
- `plt.imshow()`: takes that `JpegImageFile` object and plots the image

img = Image.open("./images/Cat_Dog_data/train/cat/cat.0.jpg")
img.format, type(img)
('JPEG', PIL.JpegImagePlugin.JpegImageFile)
plt.imshow(img)
<matplotlib.image.AxesImage at 0x7fa5e3294220>
In this section, we collect the file names of all cat and dog images and save them in two separate lists.
The `os.listdir()` function takes the path of a folder and returns a list containing the names of the files and directories inside that folder.
Here we take the names of all the training images of cats and dogs. Training images are inside the folder `Cat_Dog_data/train`.
cat_list = os.listdir("./images/Cat_Dog_data/train/cat")
dog_list = os.listdir("./images/Cat_Dog_data/train/dog")
To keep the computation time short for this tutorial, we take only 5000 images from each category for training. You can use all of them if you want.
Using the entire dataset may not help much, though, because logistic regression does not cope well with such a large number of features. In the future we'll learn about neural networks, which give remarkable results even with a much larger number of features.
cat_list = cat_list[:5000]
dog_list = dog_list[:5000]
len(cat_list), len(dog_list)
(5000, 5000)
Here we create a new list containing the names of both cats and dogs so that we can shuffle them, instead of keeping the cat images at the top and the dog images at the bottom.
- `.extend()`: a list method that appends the items of an entire list to the end of another list
- `np.random.shuffle()`: takes a list/array and randomly shuffles its elements in place

train_list = []
train_list.extend(cat_list)
train_list.extend(dog_list)
np.random.shuffle(train_list)
## shuffled version of `train_list`
train_list[:10]
['dog.5906.jpg', 'cat.8133.jpg', 'cat.1388.jpg', 'cat.11968.jpg', 'cat.5833.jpg', 'dog.1962.jpg', 'cat.5248.jpg', 'dog.2138.jpg', 'dog.4173.jpg', 'cat.416.jpg']
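The combine-then-shuffle step can be tried on a handful of made-up file names. This is only a sketch: the names below are invented, and the exact shuffled order is not shown because it depends on the random state.

```python
import numpy as np

np.random.seed(42)

cats = ["cat.0.jpg", "cat.1.jpg", "cat.2.jpg"]
dogs = ["dog.0.jpg", "dog.1.jpg", "dog.2.jpg"]

names = []
names.extend(cats)   # names now holds the three cat entries
names.extend(dogs)   # ...followed by the three dog entries
before = list(names)

np.random.shuffle(names)  # shuffles in place and returns None

print(len(names))                       # 6
print(sorted(names) == sorted(before))  # True
```

Shuffling changes only the order; every file name is still present exactly once.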
In this section we'll use `train_list`, which contains the names of all training images, and convert those images into NumPy arrays, because logistic regression works only on numbers, not on `.jpg` files. We store them in `train_data` (a NumPy array) so that our model can learn from them.
We took 5000 images each of cats and dogs, so we have 10,000 images in total. These images have different heights and widths, so we need to bring them to a fixed size: the model learns one weight per pixel value, so every sample must have the same number of features.
We'll crop each image to 100x100, and since they are RGB images, the final shape of each image will be 100x100x3.
Step 1: create an array named `train_data` filled with zeros; later we'll assign one image to each row of this array.
`np.zeros((10000, 100*100*3))`: creates an array of shape (10000, 30000) filled with 0s
train_data = np.zeros((10000,100*100*3))
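It is worth checking how much memory this array needs, since `np.zeros` uses `float64` by default. The arithmetic below is a rough sketch of the footprint; the `float32` alternative is shown only for comparison and is an assumption on my part, not something the original notebook uses.

```python
import numpy as np

n_samples = 10000
n_features = 100 * 100 * 3

# bytes = number of elements * bytes per element
bytes_f64 = n_samples * n_features * np.dtype(np.float64).itemsize
bytes_f32 = n_samples * n_features * np.dtype(np.float32).itemsize

print(bytes_f64 / 1e9)  # 2.4 (GB)
print(bytes_f32 / 1e9)  # 1.2 (GB)
```

If memory is tight, passing `dtype=np.float32` to `np.zeros` would halve the footprint, at the cost of lower numerical precision.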
Step 2: iterate through `train_list` and crop each image from the center to 100x100, then convert the `PIL` object into a NumPy array and flatten it with `.reshape(-1)` so that we can store it in a row of `train_data`.
All the 100x100x3 values act as features, and the number of images is the total number of samples in the dataset.
for i, image_name in enumerate(train_list):
    # the file name prefix tells us which folder the image lives in
    if image_name.split(".")[0] == "dog":
        path = "./images/Cat_Dog_data/train/dog"
    else:
        path = "./images/Cat_Dog_data/train/cat"
    image_path = f'{path}/{image_name}'
    crp_img = center_crop(image_path,100)
    crp_arr = np.array(crp_img).reshape(-1)
    train_data[i] = crp_arr
train_data[0]
array([200., 198., 201., ..., 148., 83., 49.])
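To make the flattening step concrete, here is a small sketch with a made-up 2x2 RGB image. `.reshape(-1)` lays the values out row by row, pixel by pixel, with the three colour channels of each pixel next to each other.

```python
import numpy as np

# a 2x2 RGB image: each pixel holds (R, G, B)
tiny = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
], dtype=np.uint8)

flat = tiny.reshape(-1)

print(tiny.shape)          # (2, 2, 3)
print(flat.shape)          # (12,)
print(flat[:6].tolist())   # [255, 0, 0, 0, 255, 0]
```

A 100x100x3 image is flattened the same way into a vector of 30000 values.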
Step 3: scale the array values to between 0 and 1 so that our algorithm can converge nicely.
Each pixel value ranges from 0 to 255, so dividing all the elements by 255 scales them to the range 0-1.
train_data = train_data/255
train_data[0]
array([0.78431373, 0.77647059, 0.78823529, ..., 0.58039216, 0.3254902 , 0.19215686])
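The same scaling can be checked on a few hand-picked pixel values. This is a minimal sketch with made-up numbers, showing that 0 maps to 0.0 and 255 maps to 1.0.

```python
import numpy as np

pixels = np.array([0, 51, 102, 255], dtype=np.uint8)
scaled = pixels / 255   # true division produces floats

print(scaled.dtype)     # float64
print(scaled.tolist())  # [0.0, 0.2, 0.4, 1.0]
```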
Now our training data is ready. The only thing that remains is to provide labels for our training set so that our algorithm can learn which image is a cat and which one is a dog.
We'll denote:
- `cat` as 0
- `dog` as 1
We can loop through `train_list` and check whether each name contains the word `cat`; if so we mark that image as a cat, otherwise as a dog.
For example, take a sample image name: `cat.8133.jpg`
If we grab the word before the first '.' and check whether it is `cat` or `dog`, we can label that specific image.
`split()`: a string method that takes a separator as input and splits the string at every occurrence of it, returning a list of strings

print("printing the name of some image")
print("-> ",train_list[0])
print("Splitting the image from all . characters into a list")
print("-> ",train_list[0].split("."))
print("selecting the 0th element of splitted list")
print("-> ",train_list[0].split(".")[0])
printing the name of some image
-> dog.5906.jpg
Splitting the image from all . characters into a list
-> ['dog', '5906', 'jpg']
selecting the 0th element of splitted list
-> dog
Now we can use the same technique to check whether a name starts with the word `cat` or `dog` and label that sample as 0 or 1 respectively.
# cat: 0
# dog: 1
train_labels = np.array([0 if name.split(".")[0]=="cat" else 1 for name in train_list])
train_labels.shape
(10000,)
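One thing to keep in mind is that the one-liner labels every file that does not start with `cat` as a dog, including any stray file that might sit in the folder. A slightly stricter variant could look like the sketch below; `label_from_name` is a hypothetical helper, not part of the original notebook.

```python
def label_from_name(file_name):
    """Map 'cat.*' to 0 and 'dog.*' to 1; reject anything else."""
    prefix = file_name.split(".")[0]
    mapping = {"cat": 0, "dog": 1}
    if prefix not in mapping:
        raise ValueError(f"unexpected file name: {file_name!r}")
    return mapping[prefix]

names = ["dog.5906.jpg", "cat.8133.jpg", "cat.1388.jpg"]
labels = [label_from_name(n) for n in names]
print(labels)  # [1, 0, 0]
```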
In this section we'll use `LogisticRegression` from the `sklearn` library to train a model, and you'll see how easy it is.
The `LogisticRegression` class lives in `sklearn.linear_model`; you can learn more about it from the docs. Now create an instance of it, passing hyperparameters such as:
- `max_iter`: maximum number of iterations the solver is allowed to run
- `solver`: which optimization algorithm to use
- `n_jobs`: number of CPU cores to use for parallel computation
For the tutorial and for fast computation I'm setting `max_iter = 100` (which is also the default), but you can set it to a larger value for better results.
model = LogisticRegression(max_iter=100, n_jobs=-1)
Now train the model by calling the `.fit()` method on `model`, passing `train_data` and `train_labels`.
model.fit(train_data, train_labels)
/Users/swayam/Desktop/demystifying_machine_learning/venv/lib/python3.8/site-packages/sklearn/linear_model/_logistic.py:814: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
LogisticRegression(n_jobs=-1)
You can print the accuracy of your model by calling the `.score()` method on `model`, passing the data and labels.
model.score(train_data, train_labels)
0.7836
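For a classifier, `.score()` returns the mean accuracy, i.e. the fraction of samples whose predicted label matches the true label. The sketch below computes the same quantity by hand on made-up labels.

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

# comparing element-wise gives booleans; their mean is the accuracy
accuracy = np.mean(y_pred == y_true)
print(accuracy)  # 0.8
```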
Let's open an arbitrary test image from the `Cat_Dog_data/test` folder and see how the model classifies it.
test_img = "./images/Cat_Dog_data/test/cat/cat.10038.jpg"
img = Image.open(test_img)
plt.imshow(img)
<matplotlib.image.AxesImage at 0x7fa5c1184e80>
Note: Before predicting, you need to bring the test image to the same size we used for the training images, i.e. 100x100x3.
im = center_crop(test_img,100) # cropping image
X = np.array(im).reshape(-1) # flattening the image to pass in model for prediction
X = X/255 # scale the pixels in 0-1 range
Use the `.predict()` method on `model`, passing the array of the test image, to get the prediction. If the output is:
- 0: the model predicts `cat`
- 1: the model predicts `dog`
model.predict([X])
array([0])
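Conceptually, a binary logistic regression turns a weighted sum of the features into a probability with the sigmoid function and predicts class 1 when that probability reaches 0.5. The sketch below uses made-up weights on three features purely for illustration; for the trained model the real values would live in `model.coef_` and `model.intercept_`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# made-up weights and bias for a 3-feature toy problem (not the trained model)
w = np.array([0.5, -1.0, 2.0])
b = -0.25

x = np.array([0.2, 0.9, 0.1])

p_dog = sigmoid(w @ x + b)   # estimated probability of class 1 (dog)
label = int(p_dog >= 0.5)    # 0 = cat, 1 = dog

print(label)  # 0
```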
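A confusion matrix tells us how many samples are classified correctly and how many incorrectly.
We import `confusion_matrix` from `sklearn.metrics`. Its signature is `confusion_matrix(y_true, y_pred)`: it takes the true labels and the predictions as input and returns a matrix. In the call below the predictions are passed first, so the rows of the result correspond to predicted classes and the columns to true classes.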
train_pred = model.predict(train_data)
cm = confusion_matrix(train_pred, train_labels)
cm
array([[3964, 1128],
       [1036, 3872]])
We can better visualize the confusion matrix with a heatmap. The Seaborn package makes it easy to plot heatmaps by passing the matrix.
After plotting you can read it as:
- row 0, column 0: number of samples that are 0 and also predicted as 0
- row 0, column 1: number of samples that are 1 but predicted as 0
- row 1, column 0: number of samples that are 0 but predicted as 1
- row 1, column 1: number of samples that are 1 and also predicted as 1
sns.heatmap(cm, annot=True)
<AxesSubplot:>
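To see how the cells are filled, the matrix can be built by hand on a few made-up labels. The sketch below uses the same orientation as the call `confusion_matrix(train_pred, train_labels)` in this notebook, where the row is the predicted class and the column is the true class.

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 0])

# rows index the predicted class, columns index the true class
cm_demo = np.zeros((2, 2), dtype=int)
for p, t in zip(y_pred, y_true):
    cm_demo[p, t] += 1

print(cm_demo.tolist())  # [[2, 1], [2, 1]]
```

Reading the first row: of the three samples predicted as 0, two really are 0 and one is actually 1. Passing the arguments in scikit-learn's documented order (true labels first) would give the transpose of this matrix.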
The same steps we followed to prepare the training data are repeated here; only the folder the images come from is different: `Cat_Dog_data/test/`
test_data_cat = os.listdir("./images/Cat_Dog_data/test/cat")
test_data_dog = os.listdir("./images/Cat_Dog_data/test/dog")
We take only 200 test cat images and 200 test dog images.
test_data_cat = test_data_cat[:200]
test_data_dog = test_data_dog[:200]
test_list = []
test_list.extend(test_data_cat)
test_list.extend(test_data_dog)
len(test_list)
400
test_data = np.zeros((400,100*100*3))
for i, image_name in enumerate(test_list):
    if image_name.split(".")[0] == "dog":
        path = "./images/Cat_Dog_data/test/dog"
    else:
        path = "./images/Cat_Dog_data/test/cat"
    image_path = f'{path}/{image_name}'
    crp_img = center_crop(image_path,100)
    crp_arr = np.array(crp_img).reshape(-1)
    test_data[i] = crp_arr
test_data = test_data/255
test_labels = np.array([0 if name.split(".")[0]=="cat" else 1 for name in test_list])
pred = model.predict(test_data)
cm = confusion_matrix(pred, test_labels)
print(cm)
sns.heatmap(cm, annot=True)
[[109  93]
 [ 91 107]]
<AxesSubplot:>
test_acc = model.score(test_data, test_labels)
print("Accuracy on test set: ", test_acc)
Accuracy on test set: 0.54
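The accuracy values reported by `.score()` can be cross-checked against the two confusion matrices printed in this notebook: correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total. `accuracy_from_cm` is a hypothetical helper written for this check.

```python
import numpy as np

# matrices copied from the outputs shown in this notebook
cm_train = np.array([[3964, 1128], [1036, 3872]])
cm_test = np.array([[109, 93], [91, 107]])

def accuracy_from_cm(cm):
    # correct predictions are on the diagonal
    return np.trace(cm) / cm.sum()

print(accuracy_from_cm(cm_train))  # 0.7836
print(accuracy_from_cm(cm_test))   # 0.54
```

Both numbers match the scores above. The gap between them suggests that the model fits the training images much better than images it has not seen.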
In this section, we'll pass our own custom images of dogs and cats to the model and let it predict them.
Here we have 2 helper functions:
- `show_image`: takes the path of an image as input and displays it
- `predict_custom_image`: takes `model` and `img_path` as input and returns the predicted class name

def show_image(img_path):
    img = Image.open(img_path)
    plt.imshow(img)

def predict_custom_image(model, img_path):
    crp_img = center_crop(img_path,100)
    # flatten to a single row and scale to 0-1, exactly as done for the training data
    crp_arr = np.array(crp_img).reshape(1,-1)/255
    pred = model.predict(crp_arr)
    if pred[0] == 0:
        return "Cat"
    return "Dog"
test_img_path = "test.jpg" # provide path to your custom image (make sure it's either jpg or jpeg)
show_image(test_img_path)
predict_custom_image(model, test_img_path)
'Dog'