Classifying pets with neural networks

Introduction

In this Jupyter notebook you will learn to classify pet images automatically, using the power of convolutional neural networks, the state-of-the-art technique behind the deep learning boom.

Loading the data

We assume the Oxford Pets dataset has been downloaded from https://www.robots.ox.ac.uk/~vgg/data/pets/, with its images extracted into an images folder and its annotations into an annotations folder. The following cell organizes the files into two classes (dogs and cats) and two subsets (training and test) to simplify the later steps:

In [1]:
import os

images_path = "images"
annotations_path = "annotations"

# Each annotation line has the form "<image name> <class id> <species> <breed id>",
# where species is 1 for cats and 2 for dogs.
with open(os.path.join(annotations_path, "trainval.txt")) as f:
    trainval = f.readlines()
with open(os.path.join(annotations_path, "test.txt")) as f:
    test = f.readlines()

os.makedirs(os.path.join(images_path, "train", "cats"), exist_ok=True)
os.makedirs(os.path.join(images_path, "train", "dogs"), exist_ok=True)
os.makedirs(os.path.join(images_path, "test", "cats"), exist_ok=True)
os.makedirs(os.path.join(images_path, "test", "dogs"), exist_ok=True)

def classify_image(line, subset):
    # Move the image into images/<subset>/<cats|dogs>/
    basename = line.split(" ")[0]
    species = line.split(" ")[2]
    subfolder = "cats" if species == "1" else "dogs"
    oldpath = os.path.join(images_path, f"{basename}.jpg")
    newpath = os.path.join(images_path, subset, subfolder, f"{basename}.jpg")
    if os.path.isfile(oldpath):
        os.rename(oldpath, newpath)

for line in trainval:
    classify_image(line, "train")

for line in test:
    classify_image(line, "test")
In [2]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: yields (images, one-hot labels) batches; images are
# resized to the default target size of 256x256.
generador_entrenamiento = ImageDataGenerator()
datos_entrenamiento = generador_entrenamiento.flow_from_directory("images/train")
# Test generator: class_mode=None makes it yield images only, without labels.
generador_test = ImageDataGenerator()
datos_test = generador_test.flow_from_directory("images/test", class_mode=None)
algunas_imagenes = next(datos_test)
Found 3680 images belonging to 2 classes.
Found 3669 images belonging to 2 classes.
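
Note that neither generator rescales the images, so the network receives raw 0-255 pixel intensities (the /255. divisions later in the notebook are only for display with matplotlib). A common alternative, not used in this notebook, is to let the generator normalize the data, e.g. ImageDataGenerator(rescale=1./255); the effect of that factor, sketched with NumPy:

```python
import numpy as np

# A fake 2x2 RGB "image" with raw 0-255 intensities
image = np.array([[[0, 128, 255]] * 2] * 2, dtype="float32")

# rescale=1./255 in ImageDataGenerator multiplies every pixel by this factor
rescaled = image * (1.0 / 255.0)

print(rescaled.min(), rescaled.max())  # values now lie in [0, 1]
```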

Visualization

We can display a couple of sample images below:

In [3]:
from matplotlib import pyplot as plt
plt.imshow(algunas_imagenes[0]/255.)
plt.axis('off')
plt.show()
plt.imshow(algunas_imagenes[1]/255.)
plt.axis('off')
plt.show()

Loading the model

Our goal is to build a model that answers the question "Does this image show a cat or a dog?". Instead of designing a new neural network from scratch, we can load an existing architecture and, better still, its parameters optimized on the ImageNet dataset of all kinds of images, so that the network arrives already "prepared" to recognize images and we do not start from zero when training. This strategy is known as transfer learning.

We will import the InceptionV3 network from TensorFlow's library of pretrained models. This network is built around a component called the "Inception block": it chains several of these blocks together to extract information from the image.

In [4]:
from tensorflow.keras import applications
# include_top=False drops the original ImageNet classification head so we can
# attach our own; the default weights="imagenet" loads the pretrained parameters.
inception = applications.InceptionV3(include_top=False, input_shape=(256, 256, 3))

Adjusting the model

In the next cell we attach a couple of layers to the InceptionV3 network that turn the information it extracts from the image into a prediction.

In [5]:
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Sequential

# Small classification head: flatten the InceptionV3 feature maps, then map
# them to two probabilities (cat / dog) with a softmax.
predictor = Sequential([
    Flatten(),
    Dense(128, activation="relu"),
    Dense(2, activation="softmax")
])
modelo = Sequential([inception, predictor])
modelo.compile(optimizer="adam", loss="categorical_crossentropy")
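
A common transfer-learning refinement, not applied in this notebook, is to freeze the pretrained base (inception.trainable = False before compiling) so that only the new head is trained and the ImageNet features are preserved. The mechanism, sketched on a tiny stand-in model with hypothetical layer sizes:

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# Tiny stand-in for a pretrained base such as InceptionV3 (hypothetical sizes).
base = Sequential([Input(shape=(8,)), Dense(4, activation="relu")])
head = Sequential([Dense(2, activation="softmax")])

# Freezing the base keeps its weights fixed during fit();
# only the head keeps trainable weights.
base.trainable = False

model = Sequential([base, head])
model.compile(optimizer="adam", loss="categorical_crossentropy")
_ = model(np.zeros((1, 8), dtype="float32"))  # force the model to build

# The base's kernel and bias are now non-trainable; the head's remain trainable.
print(len(model.trainable_weights), len(model.non_trainable_weights))
```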

Training

Now that the model has its final structure for answering the cat-or-dog question, we fit its parameters to the set of training images (the new head's weights start out random, while the InceptionV3 base is already pretrained):

In [6]:
modelo.fit(datos_entrenamiento, epochs=50)
Epoch 1/50
115/115 [==============================] - 109s 461ms/step - loss: 1.1375
Epoch 2/50
115/115 [==============================] - 43s 371ms/step - loss: 0.4231
Epoch 3/50
115/115 [==============================] - 32s 280ms/step - loss: 0.1612
Epoch 4/50
115/115 [==============================] - 31s 270ms/step - loss: 0.0736
Epoch 5/50
115/115 [==============================] - 32s 279ms/step - loss: 0.0439
Epoch 6/50
115/115 [==============================] - 33s 283ms/step - loss: 0.0357
Epoch 7/50
115/115 [==============================] - 32s 280ms/step - loss: 0.0344
Epoch 8/50
115/115 [==============================] - 32s 280ms/step - loss: 0.0357
Epoch 9/50
115/115 [==============================] - 33s 284ms/step - loss: 0.0395
Epoch 10/50
115/115 [==============================] - 32s 278ms/step - loss: 0.0394
Epoch 11/50
115/115 [==============================] - 33s 288ms/step - loss: 0.0155
Epoch 12/50
115/115 [==============================] - 32s 279ms/step - loss: 0.0302
Epoch 13/50
115/115 [==============================] - 32s 278ms/step - loss: 0.0486
Epoch 14/50
115/115 [==============================] - 31s 271ms/step - loss: 0.0389
Epoch 15/50
115/115 [==============================] - 31s 271ms/step - loss: 0.0221
Epoch 16/50
115/115 [==============================] - 31s 272ms/step - loss: 0.0352
Epoch 17/50
115/115 [==============================] - 35s 303ms/step - loss: 0.0231
Epoch 18/50
115/115 [==============================] - 57s 492ms/step - loss: 0.0516
Epoch 19/50
115/115 [==============================] - 33s 287ms/step - loss: 0.0279
Epoch 20/50
115/115 [==============================] - 34s 292ms/step - loss: 0.0171
Epoch 21/50
115/115 [==============================] - 33s 283ms/step - loss: 0.0230
Epoch 22/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0316
Epoch 23/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0178
Epoch 24/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0012
Epoch 25/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0198
Epoch 26/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0137
Epoch 27/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0415
Epoch 28/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0333
Epoch 29/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0057
Epoch 30/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0257
Epoch 31/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0314
Epoch 32/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0314
Epoch 33/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0223
Epoch 34/50
115/115 [==============================] - 31s 269ms/step - loss: 0.0203
Epoch 35/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0158
Epoch 36/50
115/115 [==============================] - 31s 268ms/step - loss: 0.0242
Epoch 37/50
115/115 [==============================] - 31s 268ms/step - loss: 0.3281
Epoch 38/50
115/115 [==============================] - 31s 268ms/step - loss: 0.4651
Epoch 39/50
115/115 [==============================] - 31s 268ms/step - loss: 0.3649
Epoch 40/50
115/115 [==============================] - 31s 268ms/step - loss: 0.2986
Epoch 41/50
115/115 [==============================] - 31s 266ms/step - loss: 0.2128
Epoch 42/50
115/115 [==============================] - 31s 266ms/step - loss: 0.1812
Epoch 43/50
115/115 [==============================] - 31s 266ms/step - loss: 0.1332
Epoch 44/50
115/115 [==============================] - 31s 266ms/step - loss: 0.1165
Epoch 45/50
115/115 [==============================] - 31s 266ms/step - loss: 0.0956
Epoch 46/50
115/115 [==============================] - 31s 266ms/step - loss: 0.1310
Epoch 47/50
115/115 [==============================] - 31s 266ms/step - loss: 0.1147
Epoch 48/50
115/115 [==============================] - 31s 266ms/step - loss: 0.0718
Epoch 49/50
115/115 [==============================] - 31s 266ms/step - loss: 0.0744
Epoch 50/50
115/115 [==============================] - 31s 266ms/step - loss: 0.0772
Out[6]:
<tensorflow.python.keras.callbacks.History at 0x1bed8fb0cd0>
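
The compile call above requested no metrics and the test generator carries no labels (class_mode=None), so training reports only the loss. Given predicted probabilities and ground-truth class indices (both hypothetical arrays below), accuracy can be computed with NumPy; alternatively, compiling with metrics=["accuracy"] and using a labeled test generator would let modelo.evaluate report it directly. A minimal sketch:

```python
import numpy as np

# Hypothetical softmax outputs for six images (each row sums to 1)
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.7, 0.3],
                  [0.4, 0.6],
                  [0.1, 0.9],
                  [0.8, 0.2]])
# Hypothetical ground-truth class indices (0 = cat, 1 = dog)
labels = np.array([0, 1, 0, 1, 1, 1])

predicted = np.argmax(probs, axis=-1)    # most probable class per image
accuracy = np.mean(predicted == labels)  # fraction of correct predictions
print(accuracy)                          # 5 of 6 correct
```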

Prediction

Our model is ready. In the next cell we take a batch of images from the test subset (images the neural network has never seen) and check the model's predictions: will it get every dog and cat right?

In [7]:
import numpy as np

lote_test = next(datos_test)

probs = modelo.predict(lote_test)
clase = np.argmax(probs, -1)  # most probable class per image: 0 = cat, 1 = dog
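
The 0 = cat, 1 = dog convention comes from flow_from_directory, which assigns indices to the class folders in alphabetical order ("cats" before "dogs"); it can be verified with datos_entrenamiento.class_indices. A minimal sketch of turning a softmax row into a label, with a made-up probability vector:

```python
import numpy as np

# flow_from_directory sorts class folders alphabetically, so
# "cats" -> 0 and "dogs" -> 1.
class_names = ["cats", "dogs"]

probs = np.array([[0.15, 0.85]])  # hypothetical softmax output for one image
clase = np.argmax(probs, -1)      # index of the most probable class
print(class_names[clase[0]])      # -> dogs
```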
In [8]:
mostrar_imagenes = 10

for i in range(mostrar_imagenes):
    plt.imshow(lote_test[i]/255.)
    plt.axis('off')
    plt.show()
    print("Prediction:", "dog" if clase[i] else "cat")
Prediction: dog
Prediction: dog
Prediction: cat
Prediction: dog
Prediction: cat
Prediction: dog