To run any of Eden's notebooks, please check the guides on our Wiki page.
There you will find instructions on how to deploy the notebooks on your local system, on Google Colab, or on MyBinder, as well as other useful links, troubleshooting tips, and more.
Note: If you find any issues while executing the notebook, don't hesitate to open an issue on GitHub. We will try to reply as soon as possible.
In this notebook, we are going to cover a technique called Transfer Learning, in which a machine learning model trained on one problem is reused in some way on a second, related problem (Bengio, 2012). In deep learning, this typically means starting from a pre-trained network and training only some of its layers. The promise is that training will be more efficient and, in the best case, performance will be better than that of a model trained from scratch.
Although the choice of architecture is an important decision, other hyperparameters, such as the optimization method, can play an equally critical role in deep learning. Optimizers modify the weights of the network given the gradients and, depending on the optimizer, possibly additional state. Most optimizers are based on gradient descent: iteratively decreasing the loss function by following its gradient. A gradient-descent update can be as simple as subtracting the scaled gradients from the weights, or considerably more sophisticated, and the choice of optimizer can dramatically influence the performance of the model.
In this notebook, we are going to compare (i) Stochastic Gradient Descent (SGD), a simple but powerful and widely used optimizer, and (ii) Adaptive Moment Estimation (Adam), a more recent one used in many current research papers. The key difference is that SGD subtracts the gradient, scaled by a single global learning rate, from the weights, while Adam maintains running moments of the gradients and computes an adaptive learning rate for each parameter. Although theoretically more powerful, Adam introduces two new hyperparameters (the decay rates of its moment estimates), which complicates hyperparameter tuning.
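The two update rules can be sketched in NumPy (illustrative helper functions, not the notebook's code; the default β1, β2, and ε values follow the original Adam paper):

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    # SGD: subtract the gradient scaled by one global learning rate.
    return w - lr * grad

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: keep exponential moving averages of the gradient (1st moment)
    # and the squared gradient (2nd moment), then scale the step per parameter.
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.5])
state = {"t": 0, "m": np.zeros_like(w), "v": np.zeros_like(w)}

w_sgd = sgd_step(w, grad)          # -> [0.995, -1.995]
w_adam = adam_step(w, grad, state)  # first-step size ~lr regardless of gradient scale
```

Note how the first Adam step has magnitude close to `lr` for every parameter, independent of the gradient's scale; SGD's step is directly proportional to the gradient.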
This notebook builds on the previous Eden notebooks:
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import cv2
import os
import csv
import gc
import random
import matplotlib.pyplot as plt
from tqdm import tqdm
from glob import glob
from pathlib import Path
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.applications import MobileNetV3Small, MobileNetV3Large
from tensorflow.keras import layers
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.callbacks import ReduceLROnPlateau
import tensorflow.keras.backend as K
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.keras.models import Sequential
from tensorflow.keras.applications.mobilenet_v3 import preprocess_input
Check the docstrings for more information.
# Function for plotting images.
def plot_sample(X):
"""
    Given the array of images <X>, it plots a random subsample of 9 images.
Parameters:
X (ndarray): The array with all the images.
"""
# Plotting 9 sample images
nb_rows = 3
nb_cols = 3
fig, axs = plt.subplots(nb_rows, nb_cols, figsize=(6, 6))
for i in range(0, nb_rows):
for j in range(0, nb_cols):
axs[i, j].xaxis.set_ticklabels([])
axs[i, j].yaxis.set_ticklabels([])
axs[i, j].imshow(X[random.randint(0, X.shape[0] - 1)])
def read_data(path_list, im_size=(224, 224)):
"""
    Given the list of paths where the images are stored <path_list>,
    and the target size for image resizing <im_size>, it returns 2 Numpy Arrays
with the images and labels; and a dictionary with the mapping between
classes and folders. This will be used later for displaying the predicted
labels.
Parameters:
path_list (List[String]): The list of paths to the images.
im_size (Tuple): The height and width values.
Returns:
X (ndarray): Images
y (ndarray): Labels
tag2idx (dict): Map between labels and folders.
"""
X = []
y = []
    # Extract the folder names of the datasets we read and create a label dictionary.
tag2idx = {tag.split(os.path.sep)[-1]: i for i, tag in enumerate(path_list)}
for path in path_list:
for im_file in tqdm(glob(path + "*/*")): # Read all files in path
try:
                # os.path.sep is OS agnostic (either '/' or '\'); [-2] grabs the folder name.
label = im_file.split(os.path.sep)[-2]
im = cv2.imread(im_file, cv2.IMREAD_COLOR)
                # By default OpenCV reads images in BGR format; convert back to RGB.
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
                # Resize to the target dimensions. You can try different interpolation methods.
# im = quantize_image(im)
im = cv2.resize(im, im_size, interpolation=cv2.INTER_AREA)
X.append(im)
y.append(tag2idx[label]) # Append the label name to y
            except Exception as e:
                # Skip annotation/metadata files that are not images.
                print(f"Not a picture: {im_file} ({e})")
X = np.array(X) # Convert list to numpy array.
y = np.eye(len(np.unique(y)))[y].astype(np.uint8)
return X, y
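A small aside on the last line of `read_data`: `np.eye(k)` builds a k×k identity matrix, and indexing it with the integer label array picks one row (one one-hot vector) per sample. A minimal illustration with made-up labels:

```python
import numpy as np

# Integer class labels for five hypothetical samples, two classes.
y_int = np.array([0, 1, 1, 0, 1])

# Same one-hot trick as in read_data: identity-matrix rows indexed by label.
y_onehot = np.eye(len(np.unique(y_int)))[y_int].astype(np.uint8)
# y_onehot:
# [[1 0]
#  [0 1]
#  [0 1]
#  [1 0]
#  [0 1]]
```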
# Callbacks are used for saving the best weights and
# early stopping.
def get_callbacks(weights_file, patience):
"""
Callbacks are used for saving the best weights and early stopping.
Given some configuration parameters, it creates the callbacks that
will be used by Keras after each epoch.
Parameters:
weights_file (String): File name for saving the best model weights.
patience (Integer): Number of epochs without improvement to wait.
Returns:
callbacks (List[Callbacks]): Configured callbacks ready to use.
"""
return [
# Only save the weights that correspond to the maximum validation accuracy.
ModelCheckpoint(
filepath=weights_file,
monitor="val_accuracy",
mode="max",
save_best_only=True,
save_weights_only=True,
),
# If val_loss doesn't improve for a number of epochs set with 'patience' var
# training will stop to avoid overfitting.
EarlyStopping(monitor="val_loss", mode="min", patience=patience, verbose=1),
]
# Plot the validation-accuracy learning curve of each method.
def plot_performances(performances):
"""
    Given the list <performances> of (method name, validation accuracies) tuples,
    it plots how the validation accuracy progressed during training/validation.
Parameters:
performances (List[Tuple]): The list of method-performance tuples.
"""
plt.figure(figsize=(14, 8))
plt.title("Validation Accuracy vs. Number of Training Epochs")
plt.xlabel("Training Epochs")
plt.ylabel("Validation Accuracy")
for performance in performances:
plt.plot(
range(1, len(performance[1]) + 1), performance[1], label=performance[0]
)
plt.ylim((0.25, 1.05))
plt.xticks(np.arange(1, NUM_EPOCHS + 1, 1.0))
plt.legend()
plt.show()
INPUT_SHAPE = (224, 224, 3)
IM_SIZE = (224, 224)
NUM_EPOCHS = 30
BATCH_SIZE = 4
TEST_SPLIT = 0.2
VAL_SPLIT = 0.2
RANDOM_STATE = 2021
WEIGHTS_FILE = "weights.h5" # File that stores updated weights
# Datasets' paths we want to work on.
PATH_LIST = [
"eden_library_datasets/Orange tree-060521-K deficiency-zz-V1-20210721140920",
"eden_library_datasets/Orange tree-060521-Mg deficiency-zz-V1-20210721140926",
]
tf.random.set_seed(RANDOM_STATE)
np.random.seed(RANDOM_STATE)
# Define paths in an OS-agnostic way.
PATH_LIST = [str(Path.cwd().parents[0].joinpath(path)) for path in PATH_LIST]
X, y = read_data(PATH_LIST, IM_SIZE)
100%|██████████| 33/33 [00:03<00:00, 8.99it/s]
100%|██████████| 45/45 [00:05<00:00, 8.93it/s]
plot_sample(X)
img_augmentation = Sequential(
[
preprocessing.RandomRotation(factor=0.15),
preprocessing.RandomTranslation(height_factor=0.1, width_factor=0.1),
preprocessing.RandomFlip(),
preprocessing.RandomContrast(factor=0.1),
],
name="img_augmentation",
)
IMAGE_IX = 10
image = tf.expand_dims(X[IMAGE_IX], axis=0)
plt.figure(figsize=(8, 8))
for i in range(9):
ax = plt.subplot(3, 3, i + 1)
aug_img = img_augmentation(image)
plt.imshow(aug_img[0].numpy().astype("uint8"))
plt.axis("off")
plt.show()
def get_architecture(y, mobilenet_size, optimizer, learning_rate):
"""
Given the parameters, it returns a compiled architecture (MobileNetV3)
ready for training.
"""
inputs = layers.Input(shape=INPUT_SHAPE)
input_aug = img_augmentation(inputs)
    input_norm = layers.Lambda(preprocess_input)(input_aug)  # MobileNetV3's preprocess_input is a pass-through (rescaling is built into the model)
if mobilenet_size == "small":
feature_extractor = MobileNetV3Small(
weights="imagenet", include_top=False, input_tensor=input_norm
)
    elif mobilenet_size == "large":
        feature_extractor = MobileNetV3Large(
            weights="imagenet", include_top=False, input_tensor=input_norm
        )
    else:
        raise ValueError(f"Unknown mobilenet_size: {mobilenet_size}")
    # Create the new classification head on top.
    features = layers.GlobalAveragePooling2D(name="pool")(
        feature_extractor.output
    )  # Flattening layer.
    fully = layers.Dense(units=64, activation="relu")(features)  # Fully connected layer.
    fully = layers.Dropout(0.3)(fully)  # Regularize with dropout.
    # Classifier with one output unit per training class.
    out = layers.Dense(units=y.shape[1], activation="softmax")(fully)
# This is the final model.
model = Model(inputs, out)
    # Configure the optimizer with the given learning rate.
    if optimizer == "adam":
        optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    elif optimizer == "sgd":
        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    else:
        raise ValueError(f"Unknown optimizer: {optimizer}")
model.compile(
loss="categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"]
)
# model.summary()
return model
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=TEST_SPLIT, shuffle=True, stratify=y, random_state=RANDOM_STATE
)
X_train, X_val, y_train, y_val = train_test_split(
X_train,
y_train,
test_size=VAL_SPLIT,
shuffle=True,
stratify=y_train,
random_state=RANDOM_STATE,
)
%%time
model = get_architecture(
y, mobilenet_size="small", optimizer="adam", learning_rate=1e-2
)
history_v3Small_adam_lr2 = model.fit(
X_train, # train data
y_train, # labels
batch_size=BATCH_SIZE,
epochs=NUM_EPOCHS,
validation_data=(X_val, y_val),
callbacks=get_callbacks(WEIGHTS_FILE, NUM_EPOCHS // 2),
)
Epoch 1/30 13/13 [==============================] - 6s 94ms/step - loss: 5.3706 - accuracy: 0.5510 - val_loss: 70998.1484 - val_accuracy: 0.6154
...
Epoch 30/30 13/13 [==============================] - 0s 34ms/step - loss: 0.5017 - accuracy: 0.7551 - val_loss: 989.4794 - val_accuracy: 0.6923
CPU times: user 22.2 s, sys: 545 ms, total: 22.8 s
Wall time: 19.8 s
model.load_weights(WEIGHTS_FILE)
final_accuracy_adam_lr2 = model.evaluate(X_test, y_test, batch_size=1, verbose=0)[1]
print("*" * 50)
print(f"Final MobileNetV3-Small-Adam-LR=0.01 Accuracy: {final_accuracy_adam_lr2}")
print("*" * 50)
print()
**************************************************
Final MobileNetV3-Small-Adam-LR=0.01 Accuracy: 0.875
**************************************************
model = get_architecture(
y, mobilenet_size="small", optimizer="adam", learning_rate=1e-4
)
history_v3Small_adam_lr4 = model.fit(
X_train, # train data
y_train, # labels
batch_size=BATCH_SIZE,
epochs=NUM_EPOCHS,
validation_data=(X_val, y_val),
callbacks=get_callbacks(WEIGHTS_FILE, NUM_EPOCHS // 2),
)
Epoch 1/30 13/13 [==============================] - 5s 89ms/step - loss: 0.9737 - accuracy: 0.4286 - val_loss: 0.7421 - val_accuracy: 0.6154
...
Epoch 30/30 13/13 [==============================] - 0s 33ms/step - loss: 0.1437 - accuracy: 0.9388 - val_loss: 0.3937 - val_accuracy: 0.6923
model.load_weights(WEIGHTS_FILE)
final_accuracy_adam_lr4 = model.evaluate(X_test, y_test, batch_size=1, verbose=0)[1]
print("*" * 50)
print(f"Final MobileNetV3-Small-Adam-LR=0.0001 Accuracy: {final_accuracy_adam_lr4}")
print("*" * 50)
print()
**************************************************
Final MobileNetV3-Small-Adam-LR=0.0001 Accuracy: 0.5625
**************************************************
%%time
model = get_architecture(y, mobilenet_size="large", optimizer="sgd", learning_rate=1e-2)
history_v3Small_sgd_lr2 = model.fit(
X_train, # train data
y_train, # labels
batch_size=BATCH_SIZE,
epochs=NUM_EPOCHS,
validation_data=(X_val, y_val),
callbacks=get_callbacks(WEIGHTS_FILE, NUM_EPOCHS // 2),
)
Epoch 1/30 13/13 [==============================] - 6s 106ms/step - loss: 0.8922 - accuracy: 0.4490 - val_loss: 0.6835 - val_accuracy: 0.7692
...
Epoch 30/30 13/13 [==============================] - 0s 39ms/step - loss: 0.3304 - accuracy: 0.8980 - val_loss: 0.0971 - val_accuracy: 1.0000
CPU times: user 24.6 s, sys: 559 ms, total: 25.1 s
Wall time: 22.5 s
model.load_weights(WEIGHTS_FILE)
final_accuracy_sgd_lr2 = model.evaluate(X_test, y_test, batch_size=1, verbose=0)[1]
print("*" * 50)
print(f"Final MobileNetV3-Large-SGD-LR=0.01 Accuracy: {final_accuracy_sgd_lr2}")
print("*" * 50)
print()
**************************************************
Final MobileNetV3-Large-SGD-LR=0.01 Accuracy: 0.875
**************************************************
model = get_architecture(y, mobilenet_size="large", optimizer="sgd", learning_rate=1e-4)
history_v3Small_sgd_lr4 = model.fit(
X_train, # train data
y_train, # labels
batch_size=BATCH_SIZE,
epochs=NUM_EPOCHS,
validation_data=(X_val, y_val),
callbacks=get_callbacks(WEIGHTS_FILE, NUM_EPOCHS // 2),
)
Epoch 1/30 13/13 [==============================] - 5s 106ms/step - loss: 0.6751 - accuracy: 0.5102 - val_loss: 0.6752 - val_accuracy: 0.6154
...
Epoch 16/30 13/13 [==============================] - 1s 39ms/step - loss: 0.7252 - accuracy: 0.5102 - val_loss: 0.6935 - val_accuracy: 0.6923
Epoch 00016: early stopping
model.load_weights(WEIGHTS_FILE)
final_accuracy_sgd_lr4 = model.evaluate(X_test, y_test, batch_size=1, verbose=0)[1]
print("*" * 50)
print(f"Final MobileNetV3-Large-SGD-LR=0.0001 Accuracy: {final_accuracy_sgd_lr4}")
print("*" * 50)
print()
**************************************************
Final MobileNetV3-Large-SGD-LR=0.0001 Accuracy: 0.375
**************************************************
# Preparing performances for being plotted
performances = [
(
f"Adam-LR=0.01 ({round(final_accuracy_adam_lr2, 3)})",
history_v3Small_adam_lr2.history["val_accuracy"],
),
(
f"Adam-LR=0.0001 ({round(final_accuracy_adam_lr4, 3)})",
history_v3Small_adam_lr4.history["val_accuracy"],
),
(
f"SGD-LR=0.01 ({round(final_accuracy_sgd_lr2, 3)})",
history_v3Small_sgd_lr2.history["val_accuracy"],
),
(
f"SGD-LR=0.0001 ({round(final_accuracy_sgd_lr4, 3)})",
history_v3Small_sgd_lr4.history["val_accuracy"],
),
]
plot_performances(performances)
As these results show, the right choice of optimizer and learning rate makes a real difference in final performance.
Bengio, Y., 2012. Deep Learning of Representations for Unsupervised and Transfer Learning. In: Journal of Machine Learning Research; 17–37.
Wang, G., Sun, Y., Wang, J., (2017). Automatic Image-Based Plant Disease Severity Estimation Using Deep Learning. Computational Intelligence and Neuroscience; 2017:8.
Mehdipour-Ghazi, M., Yanikoglu, B.A., & Aptoula, E. (2017). Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing, 235, 228-235.
Suh, H.K., IJsselmuiden, J., Hofstee, J.W., van Henten, E.J., (2018). Transfer learning for the classification of sugar beet and volunteer potato under field conditions. Biosystems Engineering; 174:50–65.
Kounalakis T., Triantafyllidis G. A., Nalpantidis L., (2019). Deep learning-based visual recognition of rumex for robotic precision farming. Computers and Electronics in Agriculture.
Too, E.C., Yujian, L., Njuki, S., & Ying-chun, L. (2019). A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric., 161, 272-279.
Espejo-Garcia, B., Mylonas, N., Athanasakos, L., & Fountas, S., (2020). Improving Weeds Identification with a Repository of Agricultural Pre-trained Deep Neural Networks. Computers and Electronics in Agriculture; 175 (August).
Sandler, M., Howard, A.G., Zhu, M., Zhmoginov, A., & Chen, L. (2018). MobileNetV2: Inverted Residuals and Linear Bottlenecks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4510-4520.
Howard, A.G., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., & Adam, H. (2019). Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 1314-1324.
https://medium.com/geekculture/a-2021-guide-to-improving-cnns-optimizers-adam-vs-sgd-495848ac6008