
Weeds Identification-AutoML-3¶

Instructions¶

To run any of Eden's notebooks, please check the guides on our Wiki page.
There you will find instructions on how to deploy the notebooks on your local system, on Google Colab, or on MyBinder, as well as other useful links, troubleshooting tips, and more.
For this notebook you will need to download the Cotton-100619-Healthy-zz-V1-20210225102300, Velvet leaf-220519-Weed-zz-V1-20210225104123, Tomato-240519-Healthy-zz-V1-20210225103740 and Black nightsade-220519-Weed-zz-V1-20210225102034 datasets from Eden Library, and you may want to use the eden_autokeras.yml file to recreate a suitable conda environment.

Note: If you find any issues while executing the notebook, don't hesitate to open an issue on GitHub. We will try to reply as soon as possible.

Background¶

In this notebook, we cover Automated Machine Learning (AutoML). AutoML systems have emerged in recent years to let computers automatically find the most suitable Machine Learning (ML) pipeline for a specific task and dataset. They can provide insights to ML engineers, resulting in better models deployed in a shorter period of time. They are considered meta-level ML algorithms that use different components as building blocks for finding the optimal ML pipeline structures (Feurer et al., 2015; Kotthoff et al., 2017). These systems automatically evaluate multiple pipeline configurations, trying to improve the global performance iteratively. Several open-source tools have raised awareness of the strengths and limitations of AutoML systems, e.g. AutoKeras, AutoSklearn, Auto-WEKA, H2O AutoML, TPOT, autoxgboost, and OBOE. Additionally, IT firms now offer AutoML cloud solutions, such as Google Cloud AutoML Vision, Microsoft Azure Machine Learning, and Apple Create ML; they offer user-friendly interfaces and require little expertise in Machine Learning to train models. In this notebook, we will be using AutoKeras. If you want to check out AutoSklearn, please have a look at our previous notebooks.

In the agricultural domain, several research studies have used AutoML in the past few years to process time series as well as proximal and satellite images. In Hayashi et al. (2019), the authors tested whether AutoML was a useful tool for identifying pest insect species, using three aphid species. They trained models in Google Cloud AutoML Vision on photographs of those species taken under various conditions and compared their identification accuracies. Since the rates of correct identification were over 96% when the models were trained with 400 images per class, they considered AutoML to be useful for pest species identification. In Montellano (2019), the author used AutoML through the same platform to classify images of different types of butterflies, fruits, and larval host plants; the average accuracy was around 97.1%. In Hsieh et al. (2019), AutoML was combined with neural network algorithms to classify whether rice blast disease conditions were exacerbated or relieved, using five years of climatic data. Although the experiments showed 72% accuracy on average, the model obtained an accuracy of 89% in the exacerbation case, which the authors took as confirmation of the effectiveness of the proposed classification model, a combination of multiple machine learning models. In Kiala et al. (2020), an AutoML approach was applied to map the Parthenium weed. The authors built models using AutoML technology and 16 other classifiers trained on Sentinel-2 and Landsat 8 satellite images. The AutoML model achieved a higher overall classification accuracy of 88.15% using Sentinel-2 and 74% using Landsat 8, results that confirmed the value of AutoML for mapping Parthenium weed infestations from satellite imagery. In Koh et al. (2020), the authors used wheat lodging assessment with UAV images as a case study for high-throughput plant phenotyping, comparing AutoKeras with transfer learning techniques in image classification and regression tasks. Finally, in Espejo-Garcia et al. (2021), the authors integrated AutoSklearn and AutoKeras to address the problem of weed identification in two different datasets.

In agriculture, weeds compete with crops for space, light, and nutrients, so they are an important problem that can lead to a poorer harvest. To avoid this, weeds should be removed at every growth stage, but especially at the initial stages. For that reason, identifying weeds accurately through Deep Learning has become an important objective.

Library Imports¶

In [2]:

import numpy as np
import matplotlib.pyplot as plt
import random
import os
from glob import glob
from tqdm import tqdm
import cv2
from pathlib import Path

import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.callbacks import ReduceLROnPlateau

from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

import autokeras as ak
from tensorflow.keras.models import load_model

Check whether GPU device is available for training¶

In [3]:

from tensorflow.python.client import device_lib

print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 9690741608741746477
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 7113641824
locality {
  bus_id: 1
  links {
  }
}
incarnation: 15558626628765919098
physical_device_desc: "device: 0, name: GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5"
]

Auxiliary functions¶

In [4]:

# Callbacks are used for saving the best weights and early stopping


def get_callbacks(weights_file, patience):
    return [
        # Only save the weights that correspond to the maximum validation accuracy
        ModelCheckpoint(
            filepath=weights_file,
            monitor="val_accuracy",
            mode="max",
            save_best_only=True,
            save_weights_only=True,
        ),
        # Stop training when val_accuracy has not improved for 'patience' epochs,
        # which helps to limit overfitting
        EarlyStopping(monitor="val_accuracy", mode="max", patience=patience, verbose=1),
    ]
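
The patience rule used by the early-stopping callback above can be illustrated with a small, framework-free sketch. It is a simplified model of the behaviour in "max" mode (no minimum delta, no weight restoring), and the function name and the accuracy history are hypothetical values chosen for illustration.

```python
def early_stopping_epoch(val_accuracies, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified 'max' mode logic: the wait counter resets whenever the
    monitored value strictly improves, and training stops as soon as the
    counter reaches `patience`.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(val_accuracies, start=1):
        if value > best:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation accuracies, one value per epoch
history = [0.70, 0.82, 0.90, 0.89, 0.90, 0.88, 0.91, 0.90]

print(early_stopping_epoch(history, patience=3))   # stops at epoch 6
print(early_stopping_epoch(history, patience=10))  # None: never triggered
```

With a patience of 3 the run stops before the later improvement to 0.91 is reached, which is why a larger patience (here EPOCHS // 5) is a common compromise between training time and missed improvements.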

In [5]:

def plot_sample(X):
    nb_rows = 3
    nb_cols = 3
    fig, axs = plt.subplots(nb_rows, nb_cols, figsize=(6, 6))

    for i in range(0, nb_rows):
        for j in range(0, nb_cols):
            axs[i, j].xaxis.set_ticklabels([])
            axs[i, j].yaxis.set_ticklabels([])
            axs[i, j].imshow(X[random.randint(0, X.shape[0] - 1)])

In [6]:

def read_data(path_list, im_size=(224, 224)):

    X = []
    y = []

    # Extract the folder names of the datasets we read and create a label dictionary.
    tag2idx = {tag.split(os.path.sep)[-1]: i for i, tag in enumerate(path_list)}

    for path in path_list:
        for im_file in tqdm(glob(path + "*/*")):  # Read all files in path
            try:
                # os.path.sep is OS agnostic (either '/' or '\'); [-2] grabs the folder name.
                label = im_file.split(os.path.sep)[-2]
                im = cv2.imread(im_file)
                # Resize to the target dimensions. You can try different interpolation methods.
                im = cv2.resize(im, im_size, interpolation=cv2.INTER_LINEAR)
                # OpenCV reads images in BGR order by default; convert back to RGB.
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
                X.append(im)
                y.append(tag2idx[label])  # Append the integer class index to y
            except Exception:
                # cv2.imread returns None for non-image files (annotations, metadata),
                # which makes cv2.resize raise; such files are skipped.
                print("Not a picture")

    X = np.array(X)  # Convert list to numpy array.
    y = np.array(y, dtype=np.uint8)
    return X, y
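
To make the labelling logic of read_data easier to follow, here is a minimal sketch that reproduces only the path handling, without reading any image. The helper names and the image file name are hypothetical; the dataset folder names are the ones used in this notebook.

```python
import os


def build_label_index(path_list):
    # The last path component (the dataset folder name) becomes the class name
    return {path.split(os.path.sep)[-1]: i for i, path in enumerate(path_list)}


def label_of(image_path):
    # The parent folder of an image file is its class name
    return image_path.split(os.path.sep)[-2]


dataset_names = [
    "Cotton-100619-Healthy-zz-V1-20210225102300",
    "Velvet leaf-220519-Weed-zz-V1-20210225104123",
    "Tomato-240519-Healthy-zz-V1-20210225103740",
    "Black nightsade-220519-Weed-zz-V1-20210225102034",
]
folders = [os.path.join("eden_library_datasets", name) for name in dataset_names]
tag2idx = build_label_index(folders)

# Hypothetical image file inside the second dataset folder
sample = os.path.join(folders[1], "img_001.jpg")
print(label_of(sample), "->", tag2idx[label_of(sample)])
```

Note that the class index depends only on the order of the folders in the list, so the same DATASETS order must be used whenever the labels are interpreted later.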

Reading Data (4 classes) and displaying some random samples from each of them¶

Warning: In case the datasets contain non-image files (annotations/metadata/licenses/OS hidden files, etc.), a "Not a picture" message will appear; this does NOT affect the technique implemented.

In [7]:

# Path to the desired dataset.
DATASETS = [
    "Cotton-100619-Healthy-zz-V1-20210225102300",
    "Velvet leaf-220519-Weed-zz-V1-20210225104123",
    "Tomato-240519-Healthy-zz-V1-20210225103740",
    "Black nightsade-220519-Weed-zz-V1-20210225102034",
]
IM_SIZE = (128, 128)  # Dimensions to resize to.
path_list = []

for name in DATASETS:
    # Define paths in an OS agnostic way.
    path_list.append(str(Path.cwd().parent / "eden_library_datasets" / name))

X, y = read_data(path_list, IM_SIZE)

100%|██████████| 47/47 [00:09<00:00,  4.74it/s]
100%|██████████| 129/129 [00:25<00:00,  5.08it/s]
100%|██████████| 201/201 [00:58<00:00,  3.45it/s]
100%|██████████| 123/123 [00:24<00:00,  4.97it/s]
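
The progress bars above suggest that the four classes are not equally represented. A quick way to check the class balance is np.bincount; the sketch below rebuilds a label vector from the file counts shown in the progress bars, assuming every file was a readable image. In the notebook itself, np.bincount(y) could be applied directly to the labels returned by read_data.

```python
import numpy as np

# File counts taken from the progress bars (assumption: all files are images)
counts_per_class = [47, 129, 201, 123]
y_demo = np.repeat(np.arange(4), counts_per_class).astype(np.uint8)

counts = np.bincount(y_demo, minlength=4)
shares = counts / counts.sum()

for idx, (n, share) in enumerate(zip(counts, shares)):
    print(f"class {idx}: {n} images ({share:.1%})")
```

An imbalance like this one is a reason to stratify the dataset split and to look at macro-averaged metrics in addition to accuracy.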

In [8]:

# Class 0
plot_sample(X[y == 0])

In [9]:

# Class 1
plot_sample(X[y == 1])

In [10]:

# Class 2
plot_sample(X[y == 2])

In [11]:

# Class 3
plot_sample(X[y == 3])

Experimental Constants¶

In [12]:

EPOCHS = 50  # How many epochs each architecture is going to be trained for
MAX_TRIALS = 15  # Maximum trials for finding the best performing architecture
TUNER = "bayesian"  # Select between 'greedy', 'bayesian', 'hyperband' or 'random'
BATCH_SIZE = 24  # How many images are used for computing loss
PERFORMANCE_OBJECTIVE = "val_accuracy"  # Metric to be optimized

# Dataset Split Setting
TEST_SPLIT = 0.25
VAL_SPLIT = 0.15

RANDOM_STATE = 2021  # Set Seed for reproducibility purposes
WEIGHTS_FILE = "weights.h5"  # File to store the optimal model weights

Data preprocessing and dataset splitting among train-val-test sets¶

In [13]:

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=TEST_SPLIT, stratify=y, random_state=RANDOM_STATE
)

X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=VAL_SPLIT, stratify=y_train, random_state=RANDOM_STATE
)
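
Because the validation split is taken from what remains after the test split, VAL_SPLIT is a fraction of the remaining data, not of the whole dataset. The sketch below works out the resulting set sizes. It assumes 500 loaded images (the sum of the progress bars above) and that a fractional hold-out size is rounded up, which is how scikit-learn treats a float test_size; the helper name is hypothetical.

```python
import math


def split_sizes(n_samples, test_split, val_split):
    # First split: hold out the test set from the full dataset
    n_test = math.ceil(test_split * n_samples)
    n_remaining = n_samples - n_test
    # Second split: hold out the validation set from the remaining samples
    n_val = math.ceil(val_split * n_remaining)
    n_train = n_remaining - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(500, test_split=0.25, val_split=0.15)
print(n_train, n_val, n_test)  # 318 57 125
print(f"effective validation share: {n_val / 500:.1%}")
```

Under these assumptions the validation set holds 57 images, so each validation image accounts for roughly 1.75 percentage points of val_accuracy.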

First approach¶

In this first example, we use the ImageClassifier class, which searches over its full default space of architecture combinations, as guided by the search algorithm used. This approach has the disadvantage that the search may include options that are not suitable for the application at hand. For instance, flipping an image vertically would not make sense for some image-capturing applications. This issue is addressed in the second example.

Creating the ImageClassifier object for searching the best architecture¶

Warning: Check experimental constants for more information.

In [14]:

# Initialize the image classifier.
clf = ak.ImageClassifier(
    overwrite=True, tuner=TUNER, objective=PERFORMANCE_OBJECTIVE, max_trials=MAX_TRIALS
)

Searching for the best architecture¶

In [15]:

# Feed the image classifier with training data.
clf.fit(
    X_train,
    y_train,
    callbacks=get_callbacks(WEIGHTS_FILE, EPOCHS // 5),
    batch_size=BATCH_SIZE,
    validation_data=(X_val, y_val),
    epochs=EPOCHS,
)

Trial 15 Complete [00h 01m 44s]
val_accuracy: 0.9824561476707458

Best val_accuracy So Far: 0.9824561476707458
Total elapsed time: 00h 09m 19s
INFO:tensorflow:Oracle triggered exit
INFO:tensorflow:Assets written to: ./image_classifier/best_model/assets
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.momentum
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

Evaluating model with default Accuracy metric¶

In [16]:

# Evaluate the best model with testing data
print(clf.evaluate(X_test, y_test))

4/4 [==============================] - 1s 93ms/step - loss: 0.1783 - accuracy: 0.9200
[0.17830902338027954, 0.9200000166893005]

Displaying the best architecture found by AutoKeras (after N trials)¶

Disclaimer: This is the best architecture found after the number of trials specified in the MAX_TRIALS variable. If this value is increased, the architecture (and final performance) may differ.

In [17]:

model = clf.export_model()
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 128, 128, 3)]     0         
_________________________________________________________________
cast_to_float32 (CastToFloat (None, 128, 128, 3)       0         
_________________________________________________________________
xception (Functional)        (None, 4, 4, 2048)        20861480  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
dropout (Dropout)            (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 4)                 8196      
_________________________________________________________________
classification_head_1 (Softm (None, 4)                 0         
=================================================================
Total params: 20,869,676
Trainable params: 20,815,148
Non-trainable params: 54,528
_________________________________________________________________

Saving model for future use without need for retraining¶

In [18]:

try:
    model.save("model_autokeras", save_format="tf")
except Exception:
    model.save("model_autokeras.h5")

INFO:tensorflow:Assets written to: model_autokeras/assets

Reloading the previously stored model¶

In [19]:

loaded_model = load_model("model_autokeras", custom_objects=ak.CUSTOM_OBJECTS)

Checking final performance on test set¶

Instead of using the basic accuracy metric, we compute the F₁-score with two averaging approaches, micro and macro. Check out their advantages and disadvantages.
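
As a rough illustration of how the two averaging approaches differ, the following NumPy sketch computes both from per-class true positives, false positives, and false negatives. It is a simplified re-implementation for illustration, not the scikit-learn code, and the labels are made up. Micro averaging pools the counts over all classes, while macro averaging gives every class the same weight regardless of its size.

```python
import numpy as np


def f1_micro_macro(y_true, y_pred, n_classes):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.zeros(n_classes)
    fp = np.zeros(n_classes)
    fn = np.zeros(n_classes)
    for c in range(n_classes):
        tp[c] = np.sum((y_pred == c) & (y_true == c))
        fp[c] = np.sum((y_pred == c) & (y_true != c))
        fn[c] = np.sum((y_pred != c) & (y_true == c))

    # Micro: pool the counts of all classes, then compute a single F1
    micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())

    # Macro: compute F1 per class (0 where undefined), then average
    denom = 2 * tp + fp + fn
    per_class = np.divide(2 * tp, denom, out=np.zeros(n_classes), where=denom > 0)
    macro = per_class.mean()
    return micro, macro


# Made-up, imbalanced example: class 0 dominates
y_true_demo = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

micro, macro = f1_micro_macro(y_true_demo, y_pred_demo, n_classes=3)
print("Micro-F1: %.3f" % micro)  # 0.800, equal to the accuracy (8 of 10 correct)
print("Macro-F1: %.3f" % macro)  # 0.730, pulled down by the two small classes
```

For single-label problems such as this one, the micro-averaged F₁ always coincides with the accuracy, so the macro average is the more informative addition when the classes are imbalanced.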

In [20]:

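# Take the argmax over the float class probabilities. Casting them to uint8 first
# would truncate nearly every probability to 0 and make argmax return class 0
# for (nearly) every sample.
predicted_y = np.argmax(loaded_model.predict(X_test), axis=-1)
print("Micro-F1: %.3f" % f1_score(y_test, predicted_y, average="micro"))
print("Macro-F1: %.3f" % f1_score(y_test, predicted_y, average="macro"))

Micro-F1: 0.096
Macro-F1: 0.044

Note: the scores above were recorded while the probabilities were cast to uint8 before the argmax, which assigns (nearly) every test image to class 0; they do not reflect the quality of the model. For single-label classification the micro-averaged F₁ equals the accuracy, so the code shown here should report a Micro-F1 close to the 0.92 test accuracy obtained above.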

Second approach under a more constrained search space¶

Sometimes it is possible to speed up the hyper-parameter search by fixing some architecture settings based on previous evidence and scientific research. In AutoKeras, this is done by using AutoModel instead of the ImageClassifier used in the previous example. Different blocks, such as the Xception or the ResNet ones, can be used directly and even merged for integrated feature extraction. Pre-processing steps can also be configured with their respective blocks (Normalization and ImageAugmentation). Check the AutoKeras documentation to explore all the available blocks.

Creating the AutoModel object for searching the best architecture¶

In [21]:

input_node = ak.ImageInput()
prep_node = ak.Normalization()(input_node)
prep_node = ak.ImageAugmentation(vertical_flip=False)(prep_node)
featExt_node1 = ak.XceptionBlock()(prep_node)
featExt_node2 = ak.ResNetBlock(version="v2")(prep_node)
out_node = ak.Merge()([featExt_node1, featExt_node2])
out_node_c = ak.ClassificationHead()(out_node)
clf = ak.AutoModel(
    inputs=input_node,
    outputs=[out_node_c],
    overwrite=True,
    tuner=TUNER,
    objective=PERFORMANCE_OBJECTIVE,
    max_trials=MAX_TRIALS,
)

Searching for the best architecture¶

In [22]:

# Feed the image classifier with training data
clf.fit(
    X_train,
    y_train,
    callbacks=get_callbacks(WEIGHTS_FILE, EPOCHS // 5),
    batch_size=BATCH_SIZE,
    validation_data=(X_val, y_val),
    epochs=EPOCHS,
)

Trial 15 Complete [00h 02m 23s]
val_accuracy: 0.9824561476707458

Best val_accuracy So Far: 1.0
Total elapsed time: 00h 25m 17s
INFO:tensorflow:Oracle triggered exit
INFO:tensorflow:Assets written to: ./auto_model/best_model/assets
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

Evaluating model with default Accuracy metric¶

In [23]:

# Evaluate the best model with testing data
print(clf.evaluate(X_test, y_test))

4/4 [==============================] - 2s 176ms/step - loss: 0.1503 - accuracy: 0.9680
[0.1503141224384308, 0.9679999947547913]

Displaying the best architecture found by AutoKeras (after N trials)¶

Disclaimer: This is the best architecture found after the number of trials specified in the MAX_TRIALS variable. If this value is increased, the architecture (and final performance) may differ.

In [24]:

model = clf.export_model()
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 128, 128, 3) 0                                            
__________________________________________________________________________________________________
cast_to_float32 (CastToFloat32) (None, 128, 128, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
normalization (Normalization)   (None, 128, 128, 3)  7           cast_to_float32[0][0]            
__________________________________________________________________________________________________
random_translation (RandomTrans (None, 128, 128, 3)  0           normalization[0][0]              
__________________________________________________________________________________________________
random_flip (RandomFlip)        (None, 128, 128, 3)  0           random_translation[0][0]         
__________________________________________________________________________________________________
random_contrast (RandomContrast (None, 128, 128, 3)  0           random_flip[0][0]                
__________________________________________________________________________________________________
resizing (Resizing)             (None, 224, 224, 3)  0           random_contrast[0][0]            
__________________________________________________________________________________________________
xception (Functional)           (None, None, None, 2 20861480    random_contrast[0][0]            
__________________________________________________________________________________________________
resnet50v2 (Functional)         (None, 7, 7, 2048)   23564800    resizing[0][0]                   
__________________________________________________________________________________________________
flatten (Flatten)               (None, 32768)        0           xception[0][0]                   
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 100352)       0           resnet50v2[0][0]                 
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 133120)       0           flatten[0][0]                    
                                                                 flatten_1[0][0]                  
__________________________________________________________________________________________________
dropout (Dropout)               (None, 133120)       0           concatenate[0][0]                
__________________________________________________________________________________________________
dense (Dense)                   (None, 4)            532484      dropout[0][0]                    
__________________________________________________________________________________________________
classification_head_1 (Softmax) (None, 4)            0           dense[0][0]                      
==================================================================================================
Total params: 44,958,771
Trainable params: 44,858,796
Non-trainable params: 99,975
__________________________________________________________________________________________________
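
The sizes in the summary above can be cross-checked with a little arithmetic. The sketch below assumes that the Xception branch outputs a 4 × 4 × 2048 feature map for 128 × 128 inputs (the shape is truncated in the summary, but it is consistent with the 32768 flattened features); all other numbers are read directly from the table.

```python
def dense_params(n_inputs, n_outputs):
    # One weight per input-output pair plus one bias per output
    return n_inputs * n_outputs + n_outputs


xception_features = 4 * 4 * 2048  # flatten: 32768
resnet_features = 7 * 7 * 2048    # flatten_1: 100352
merged_features = xception_features + resnet_features  # concatenate: 133120

head_params = dense_params(merged_features, 4)  # dense: 532484

# xception + resnet50v2 + normalization + dense
total_params = 20_861_480 + 23_564_800 + 7 + head_params

print(merged_features, head_params, total_params)
```

Flattening both feature maps makes the classification head alone hold more than half a million weights, whereas the first approach used global average pooling and needed only 8,196 parameters in its dense layer.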

Checking final performance on test set¶

In [25]:

predicted_y = clf.predict(X_test).astype(np.uint8)
predicted_y = predicted_y.reshape(predicted_y.shape[0])
print("Micro-F1: %.3f" % f1_score(y_test, predicted_y, average="micro"))
print("Macro-F1: %.3f" % f1_score(y_test, predicted_y, average="macro"))

4/4 [==============================] - 0s 96ms/step
Micro-F1: 0.968
Macro-F1: 0.963

Possible Extensions¶

Use a different tuner: greedy, hyperband, or random search are available.
Change the number of epochs and the batch size.
Change the image size.
Try other customized search spaces in the second example.

Bibliography¶

Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.; Blum, M.; Hutter, F. Efficient and robust automated machine learning. Adv. Neural. Inf. Process. Syst. 2015, 28, 2962–2970.

Kotthoff, L.; Thornton, C.; Hoos, H.H.; Hutter, F.; Leyton-Brown, K. Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. J. Mach. Learn. Res. 2017, 18, 826–830.

Hsieh, J.Y.; Huang, W.; Yang, H.T.; Lin, C.C.; Fan, Y.C.; Chen, H. Building the Rice Blast Disease Prediction Model Based on Machine Learning and Neural Networks; EasyChair: Manchester, UK, 2019.

Kiala, Z.; Mutanga, O.; Odindi, J.; Peerbhay, K.Y.; Slotow, R. Automated classification of a tropical landscape infested by Parthenium weed (Parthenium hyterophorus). Int. J. Remote Sens. 2020, 41, 8497–8519.

Koh, J.C.; Spangenberg, G.; Kant, S. Automated Machine Learning for High-Throughput Image-Based Plant Phenotyping. bioRxiv 2020.

Espejo-Garcia, B.; Malounas, I.; Vali, E.; Fountas, S. Testing the Suitability of Automated Machine Learning for Weeds Identification. AI 2021.

https://autokeras.com/