In this post, we will learn how to build custom loss functions, both as standalone functions and as classes. This is a summary of the lecture "Custom Models, Layers and Loss functions with Tensorflow" from DeepLearning.AI.
import tensorflow as tf
from tensorflow.keras.utils import plot_model
from tensorflow.keras import backend as K
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
In this section, we'll walk through how to create custom loss functions. In particular, we'll code the Huber Loss and use that in training the model.
Our dummy dataset is just a pair of arrays xs and ys defined by the relationship $y = 2x - 1$. xs are the inputs while ys are the labels.
# inputs
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
# labels
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
plt.scatter(xs, ys);
Let's build a simple model and train it using a built-in loss function like the mean_squared_error.
model = tf.keras.Sequential([
tf.keras.layers.Dense(1, input_shape=[1])
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)
<tensorflow.python.keras.callbacks.History at 0x7f19f83427d0>
y_mse = model.predict([10.0])
y_mse
array([[18.978544]], dtype=float32)
plt.scatter(xs, ys)
plt.scatter(10.0, y_mse, c='r');
Now let's see how we can use a custom loss. We first define a function that accepts the ground truth labels (y_true) and model predictions (y_pred) as parameters. We then compute and return the loss value in the function definition.
The definition of Huber Loss is like this:
$$ L_{\delta}(a) = \begin{cases} \frac{1}{2} a^2 \quad & \text{ for } \vert a \vert \le \delta, \\ \delta (\vert a \vert - \frac{1}{2} \delta) \quad & \text{ otherwise} \end{cases} $$

where $a = y - f(x)$ is the error between the label and the prediction, and $\delta$ is the threshold.

def my_huber_loss(y_true, y_pred):
    threshold = 1.
    error = y_true - y_pred
    # boolean mask: True where the error falls within the threshold
    is_small_error = tf.abs(error) <= threshold
    # quadratic loss for small errors, linear loss for large errors
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - threshold / 2)
    return tf.where(is_small_error, small_error_loss, big_error_loss)
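Before plugging this into a model, we can sanity-check the piecewise behaviour by calling the function directly on a couple of hand-picked values (the numbers below are arbitrary and purely illustrative): with a threshold of 1, an error of 0.5 should land in the quadratic branch and an error of 3.0 in the linear branch.
# quick sanity check with hand-picked values (illustrative only):
# errors are [0.5, 3.0] -> quadratic branch gives 0.5**2 / 2 = 0.125,
# linear branch gives 1 * (3.0 - 0.5) = 2.5
print(my_huber_loss(tf.constant([0.5, 3.0]), tf.constant([0.0, 0.0])))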
Using the loss function is as simple as specifying it in the loss argument of model.compile().
model = tf.keras.Sequential([
tf.keras.layers.Dense(units=1, input_shape=[1,])
])
model.compile(optimizer='sgd', loss=my_huber_loss)
model.fit(xs, ys, epochs=500, verbose=0)
<tensorflow.python.keras.callbacks.History at 0x7f19ec25ba10>
y_hl = model.predict([10.0])
y_hl
array([[18.722095]], dtype=float32)
plt.scatter(xs, ys);
plt.scatter(10.0, y_mse, label='mse');
plt.scatter(10.0, y_hl, label='huber_loss');
plt.grid()
plt.legend();
As before, the models are trained on the xs and ys defined above, where the relationship is $y = 2x - 1$. Thus, when we test for x=10, whichever version of the model gets the closest answer to 19 will be deemed more accurate.
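Since both models have already been trained, a quick way to quantify this is to print how far each prediction is from 19:
# distance of each prediction from the true value y = 2*10 - 1 = 19
print('mse   :', abs(19.0 - y_mse[0, 0]))
print('huber :', abs(19.0 - y_hl[0, 0]))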
The loss argument in model.compile() only accepts functions that accept two parameters: the ground truth (y_true) and the model predictions (y_pred). If we want to include a hyperparameter that we can tune, then we can define a wrapper function that accepts this hyperparameter.
# wrapper function that accepts the hyperparameter
def my_huber_loss_with_threshold(threshold):
    # function that accepts the ground truth and predictions
    def my_huber_loss(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = threshold * (tf.abs(error) - (threshold / 2))
        return tf.where(is_small_error, small_error_loss, big_error_loss)
    # return the inner function tuned by the hyperparameter
    return my_huber_loss
We can now specify the loss as the wrapper function above. Notice that we can now set the threshold value. Try varying this value and see the results you get.
model = tf.keras.Sequential([
tf.keras.layers.Dense(1, input_shape=[1])
])
model.compile(optimizer='sgd', loss=my_huber_loss_with_threshold(threshold=1.2))
model.fit(xs, ys, epochs=500, verbose=0)
<tensorflow.python.keras.callbacks.History at 0x7f197c738510>
y_hlt = model.predict([10.0])
y_hlt
array([[18.618975]], dtype=float32)
plt.scatter(xs, ys);
plt.scatter(10.0, y_mse, label='mse');
plt.scatter(10.0, y_hl, label='huber_loss');
plt.scatter(10.0, y_hlt, label='huber_loss_1.2')
plt.grid()
plt.legend();
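To take the suggestion of varying the threshold a bit further, a small sweep like the sketch below can compare the predictions at x=10 for a few candidate values (the thresholds here are arbitrary picks for illustration):
# sketch: train a fresh model per candidate threshold and compare predictions at x=10
for t in [1.0, 1.5, 2.0]:
    m = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=[1])])
    m.compile(optimizer='sgd', loss=my_huber_loss_with_threshold(threshold=t))
    m.fit(xs, ys, epochs=500, verbose=0)
    print('threshold', t, '->', m.predict([10.0]))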
We can also implement our custom loss as a class. It inherits from the Keras Loss class and the syntax and required methods are shown below.
from tensorflow.keras.losses import Loss
class MyHuberLoss(Loss):
    # initialize instance attributes
    def __init__(self, threshold=1):
        super(MyHuberLoss, self).__init__()
        self.threshold = threshold

    # Compute loss
    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - self.threshold / 2)
        return tf.where(is_small_error, small_error_loss, big_error_loss)
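As a quick aside, a Loss subclass is itself callable. Note that invoking the object directly applies Keras's default reduction (an average over the batch), so unlike the plain function above it returns a single scalar. A minimal sketch, using arbitrary hand-picked tensors:
# minimal sketch: calling the loss object applies the default reduction (mean over the batch)
sample = MyHuberLoss(threshold=1)(tf.constant([0.5, 3.0]), tf.constant([0.0, 0.0]))
print(sample)  # mean of the per-element Huber values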
You can specify the loss by instantiating an object from your custom loss class.
model = tf.keras.Sequential([
tf.keras.layers.Dense(1, input_shape=[1,])
])
model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=1.02))
model.fit(xs, ys, epochs=500, verbose=0)
<tensorflow.python.keras.callbacks.History at 0x7f19f831af50>
y_hltc = model.predict([10.0])
y_hltc
array([[18.58202]], dtype=float32)
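As with the earlier models, we can add this last prediction to the comparison plot:
plt.scatter(xs, ys)
plt.scatter(10.0, y_mse, label='mse')
plt.scatter(10.0, y_hl, label='huber_loss')
plt.scatter(10.0, y_hlt, label='huber_loss_1.2')
plt.scatter(10.0, y_hltc, label='huber_loss_class_1.02')
plt.grid()
plt.legend();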