In this Jupyter notebook you will learn to classify pet images automatically, using the power of convolutional neural networks, the state-of-the-art technique behind the Deep Learning boom.
We assume the Oxford Pets dataset has been downloaded from https://www.robots.ox.ac.uk/~vgg/data/pets/ and its images extracted to an images folder. The next cell organizes the files into two classes (dogs and cats) and two subsets (training and test) to simplify the later steps:
import os

images_path = "images"
annotations_path = "annotations"

# Each line of the annotation lists has the form
# "<basename> <class_id> <species> <breed_id>", where species is 1 for cats and 2 for dogs
with open(os.path.join(annotations_path, "trainval.txt")) as f:
    trainval = f.readlines()
with open(os.path.join(annotations_path, "test.txt")) as f:
    test = f.readlines()

os.makedirs(os.path.join(images_path, "train", "cats"), exist_ok=True)
os.makedirs(os.path.join(images_path, "train", "dogs"), exist_ok=True)
os.makedirs(os.path.join(images_path, "test", "cats"), exist_ok=True)
os.makedirs(os.path.join(images_path, "test", "dogs"), exist_ok=True)

def classify_image(line, subset):
    # Move the image into images/<subset>/<cats|dogs>/ according to its species
    basename = line.split(" ")[0]
    species = line.split(" ")[2]
    subfolder = "cats" if species == "1" else "dogs"
    oldpath = os.path.join(images_path, f"{basename}.jpg")
    newpath = os.path.join(images_path, subset, subfolder, f"{basename}.jpg")
    if os.path.isfile(oldpath):
        os.rename(oldpath, newpath)

for line in trainval:
    classify_image(line, "train")
for line in test:
    classify_image(line, "test")
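After moving the files it can help to confirm that the layout is what we expect. A quick sanity check, assuming the folder structure created above (the count_images helper is illustrative, not part of the original notebook):

```python
import os

def count_images(root):
    """Count .jpg files in each subset/class subfolder under root (hypothetical helper)."""
    counts = {}
    for subset in ("train", "test"):
        for cls in ("cats", "dogs"):
            folder = os.path.join(root, subset, cls)
            if os.path.isdir(folder):
                # Count only the image files, ignoring any stray entries
                counts[f"{subset}/{cls}"] = sum(
                    1 for f in os.listdir(folder) if f.endswith(".jpg")
                )
    return counts

print(count_images("images"))
```

If everything went well, the four counts should roughly match the sizes of the trainval and test annotation lists.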
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# flow_from_directory resizes every image to 256x256 by default and
# infers each image's class from the subfolder it sits in
generador_entrenamiento = ImageDataGenerator()
datos_entrenamiento = generador_entrenamiento.flow_from_directory("images/train")
generador_test = ImageDataGenerator()
# class_mode=None: the test generator yields only images, without labels
datos_test = generador_test.flow_from_directory("images/test", class_mode=None)
algunas_imagenes = next(datos_test)
Found 3680 images belonging to 2 classes.
Found 3669 images belonging to 2 classes.
We can visualize a couple of examples below:
from matplotlib import pyplot as plt

# Pixel values arrive in the range [0, 255]; imshow expects floats in [0, 1]
plt.imshow(algunas_imagenes[0]/255.)
plt.axis('off')
plt.show()

plt.imshow(algunas_imagenes[1]/255.)
plt.axis('off')
plt.show()
Our goal is to build a model that can answer the question "Does this image show a cat or a dog?". Instead of designing a new neural network from scratch, we can load an existing architecture and, better still, its parameters already optimized on ImageNet, a dataset containing all kinds of images. Our network thus comes "pre-trained" to recognize images, so we do not start from zero when training. This strategy is known as transfer learning.
We will import the InceptionV3 network from TensorFlow's library of pre-trained models. This network is built around a component called the "Inception block": it chains several of these blocks to extract information from the image.
from tensorflow.keras import applications

# include_top=False discards the final ImageNet classification layers,
# keeping only the convolutional feature extractor
inception = applications.InceptionV3(include_top=False, input_shape=(256, 256, 3))
In the next cell we add a couple of layers on top of InceptionV3 that turn the information it extracts from the image into a prediction.
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Sequential

predictor = Sequential([
    Flatten(),
    Dense(128, activation="relu"),
    Dense(2, activation="softmax")  # one probability per class (cat, dog)
])
modelo = Sequential([inception, predictor])
modelo.compile(optimizer="adam", loss="categorical_crossentropy")
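The loss chosen in compile, categorical cross-entropy, measures for each image the negative log of the probability the softmax assigns to the true class, averaged over the batch. A minimal NumPy sketch of that computation (the function and the example values are illustrative, not part of the notebook):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # -sum(y_true * log(y_pred)) per sample, averaged over the batch;
    # eps guards against log(0), much as Keras does internally
    return float(np.mean(-np.sum(y_true * np.log(y_pred + eps), axis=-1)))

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])  # one-hot labels: cat, dog
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])  # softmax outputs
loss = categorical_crossentropy(y_true, y_pred)
```

Confident, correct predictions (probability near 1 on the true class) drive the loss toward 0, which is exactly the trend we want to see during training.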
With the model now having its final structure for answering the cat-or-dog question, we fit its parameters (those of the new layers start out random) to the set of images we will use for training:
modelo.fit(datos_entrenamiento, epochs=50)
Epoch 1/50  115/115 [==============================] - 109s 461ms/step - loss: 1.1375
Epoch 2/50  115/115 [==============================] - 43s 371ms/step - loss: 0.4231
Epoch 3/50  115/115 [==============================] - 32s 280ms/step - loss: 0.1612
[... output truncated: the loss stays mostly below 0.05 from epoch 5 onward, apart from a transient spike around epochs 37-40 (up to 0.47) that then subsides ...]
Epoch 50/50 115/115 [==============================] - 31s 266ms/step - loss: 0.0772
Our model is now ready. In the next cell we take some images from the test subset (images the network has never seen) and check the model's predictions: will it get every dog and cat right?
lote_test = next(datos_test)
probs = modelo.predict(lote_test)

import numpy as np
clase = np.argmax(probs, -1)  # index of the most probable class (0 = cat, 1 = dog)

mostrar_imagenes = 10
for i in range(mostrar_imagenes):
    plt.imshow(lote_test[i]/255.)
    plt.axis('off')
    plt.show()
    print("Prediction:", "dog" if clase[i] else "cat")
Prediction: dog
Prediction: dog
Prediction: cat
Prediction: dog
Prediction: cat
Prediction: dog
Prediction: dog
Prediction: cat
Prediction: dog
Prediction: dog
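The decoding step above relies on np.argmax picking, for each image, the index of the larger of the two softmax outputs; flow_from_directory assigns class indices alphabetically, so "cats" is 0 and "dogs" is 1. A self-contained sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax outputs for three images; columns are (cats, dogs)
probs = np.array([[0.1, 0.9],
                  [0.8, 0.2],
                  [0.3, 0.7]])
clase = np.argmax(probs, axis=-1)  # most probable class index per row
labels = ["cat" if c == 0 else "dog" for c in clase]
print(labels)  # prints ['dog', 'cat', 'dog']
```

The same mapping is what makes the `"dog" if clase[i] else "cat"` expression in the prediction loop work: a class index of 0 is falsy and decodes to "cat".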