An autoencoder is a type of artificial neural network used to learn efficient encodings of data in an unsupervised manner. It is trained to reconstruct its own input through a narrow bottleneck layer, so the network learns a compressed representation (encoding) of the data, typically for dimensionality reduction, while discarding signal “noise”.
Used for: dimensionality reduction and compression, unsupervised feature learning, and denoising (all demonstrated below).
Example (from https://blog.keras.io/building-autoencoders-in-keras.html)
#!pip install keras==2.8
from keras.layers import Input, Dense
from keras.models import Model
# this is the size of our encoded representations
encoding_dim = 32 # 32 floats -> compression of factor 24.5, assuming the input is 784 floats
# this is our input placeholder
input_img = Input(shape=(784,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(784, activation='sigmoid')(encoded)
# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)
from keras.utils.vis_utils import plot_model
plot_model(autoencoder, show_shapes=True, show_layer_names=True)
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)
plot_model(encoder, show_shapes=True, show_layer_names=True)
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))
plot_model(decoder, show_shapes=True, show_layer_names=True)
# note: recent Keras releases give Adadelta a very small default learning rate,
# so training converges slowly here; optimizer='adam' is a faster choice
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
(60000, 784)
(10000, 784)
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
Epoch 1/50
235/235 [==============================] - 73s 309ms/step - loss: 0.4377 - val_loss: 0.4386
Epoch 2/50
235/235 [==============================] - 70s 300ms/step - loss: 0.4329 - val_loss: 0.4334
...
Epoch 33/50
235/235 [==============================] - 70s 299ms/step - loss: 0.2453 - val_loss: 0.2449
Epoch 34/50
202/235 [========================>.....] - ETA: 9s - loss: 0.2442
# encode and decode some digits
# note that we take them from the *test* set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
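As a quick sanity check (a sketch, not part of the original post), the mean per-pixel reconstruction error can be computed directly:
# mean squared error between originals and reconstructions (lower is better)
mse = np.mean((x_test - decoded_imgs) ** 2)
print('mean reconstruction MSE:', mse)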
import matplotlib.pyplot as plt
n = 10 # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
Sparse autoencoder: an L1 activity regularizer on the code layer pushes most activations toward zero, so each input is encoded by only a few active units.
from keras import regularizers
encoding_dim = 32
input_img = Input(shape=(784,))
# add a Dense layer with a L1 activity regularizer
encoded = Dense(encoding_dim, activation='relu',
                activity_regularizer=regularizers.l1(10e-5))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(input_img, decoded)
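A minimal sketch of how the penalty's effect could be checked (assuming the flattened x_train / x_test arrays from above are still in scope; 'adam' is used here only to speed up the illustration):
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256, shuffle=True,
                validation_data=(x_test, x_test))
sparse_encoder = Model(input_img, encoded)
encoded_imgs = sparse_encoder.predict(x_test)
# the L1 penalty drives many code units to zero, so the mean activation
# should come out noticeably lower than for the unregularized model
print('mean code activation:', encoded_imgs.mean())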
Convolutional autoencoder: since the inputs are images, convolutional layers generally perform much better than fully connected ones.
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)  # no padding='same': 16x16 -> 14x14, so the final upsampling restores 28x28
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
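To verify the (4, 4, 8) bottleneck and the 28x28x1 output shape, it can help to print the layer-by-layer summary (an optional check, not in the original listing):
autoencoder.summary()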
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))
Epoch 1/10
469/469 [==============================] - 76s 161ms/step - loss: 0.6997 - val_loss: 0.6969
Epoch 2/10
469/469 [==============================] - 73s 157ms/step - loss: 0.6938 - val_loss: 0.6903
...
Epoch 9/10
469/469 [==============================] - 74s 157ms/step - loss: 0.4526 - val_loss: 0.4519
Epoch 10/10
469/469 [==============================] - 78s 166ms/step - loss: 0.4444 - val_loss: 0.4433
Application to image denoising: corrupt the digits with Gaussian noise and train the autoencoder to map the noisy inputs back to the clean originals.
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
n = 10
plt.figure(figsize=(20, 2))
for i in range(n):
    ax = plt.subplot(1, n, i + 1)  # subplot indices are 1-based
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (7, 7, 32)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
from keras.callbacks import TensorBoard

autoencoder.fit(x_train_noisy, x_train,
                epochs=100,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test_noisy, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/tb', histogram_freq=0, write_graph=False)])
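After training, the denoised reconstructions can be inspected with the same plotting pattern used earlier (a sketch, not part of the original listing):
denoised_imgs = autoencoder.predict(x_test_noisy)
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # noisy input on top
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # denoised reconstruction below
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(denoised_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()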