Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Sebastian Raschka

CPython 3.7.1
IPython 7.2.0

torch 1.0.0

• Runs on CPU or GPU (if available)

Model Zoo -- Cyclical Learning Rate in PyTorch¶

This notebook will go over the following topics in the order listed below:

1. Briefly explain the concept behind the cyclical learning rate
2. Use the "LR range test" to choose a good base and max learning rate for the cyclical leraning rate
3. Train a simple convolutional neural net on CIFAR-10 using the cyclical learning rate

Cyclical Learning Rate Concept¶

In his paper [1], Leslie N. Smith introduced the concept of cyclical learning rates, that is, learning rates that periodically alternative between a user-specified minimum and maximum learning rate.

Varying the learning rate between between specified bounds, as implemented by Smith, is cheaper to compute than the nowadays popular approach using adaptive learning rates. Note that adaptive learning rate can also be combined with the concept of cyclical learning rates.

The idea behind cyclical learning rates is that while increasing the learning rate can be harmful short term it can be beneficial in the long run. Concretely, the three methods introduced by Smith (and implemented in this notebook) are

• triangular: The base approach, varying between a lower and an upper bound, as illustrated in the figure below
• triangular2: Same as triangular, but learning rate difference is cut in half at the end of each cycle. This means the learning rate difference drops after each cycle -- exp_range: The learning rate varies between the minimum and maximum boundaries and each boundary value declines by an exponential factor of $gamma^{iteration}$

{insert figure}

References¶

Following the description in the paper, the different cyclical learning rates are very simple to implement, as shown below:

In [2]:
import numpy as np

def cyclical_learning_rate(batch_step,
step_size,
base_lr=0.001,
max_lr=0.006,
mode='triangular',
gamma=0.999995):

cycle = np.floor(1 + batch_step / (2. * step_size))
x = np.abs(batch_step / float(step_size) - 2 * cycle + 1)

lr_delta = (max_lr - base_lr) * np.maximum(0, (1 - x))

if mode == 'triangular':
pass
elif mode == 'triangular2':
lr_delta = lr_delta * 1 / (2. ** (cycle - 1))
elif mode == 'exp_range':
lr_delta = lr_delta * (gamma**(batch_step))
else:
raise ValueError('mode must be "triangular", "triangular2", or "exp_range"')

lr = base_lr + lr_delta

return lr


To ensure that the learning rate works as intended, let us plot the learning rate variation for a dry run. Note that batch_step is a variable that tracks the total number of times a model has been updated. For instance, if we run the training loop over 5 epochs (5 passes over the training set), where each epoch is split into 100 batches, then we have a batch_step count of 5 * 100 = 500 at the end of the training.

In [3]:
num_epochs = 50
num_train = 50000
batch_size = 100
iter_per_ep = num_train // batch_size


Triangular

In [4]:
%matplotlib inline
import matplotlib.pyplot as plt

batch_step = -1
collect_lr = []
for e in range(num_epochs):
for i in range(iter_per_ep):
batch_step += 1
cur_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=iter_per_ep*5)

collect_lr.append(cur_lr)

plt.scatter(range(len(collect_lr)), collect_lr)
plt.ylim([0.0, 0.01])
plt.xlim([0, num_epochs*iter_per_ep + 5000])
plt.show()


As we can see above, with a batchsize of 100 and for a training set of 50,000 training example, we have 50,000=500 iterations per epoch. With a cycle length of 5*iterations_per_epoch=25,000, the learning rate reaches the base_lr every 5 epochs, which is equal to 25,000 batch updates. The stepsize is defined as cycle/2, i.e., 25,000/2 = 12,500 batch updates.

Triangular2

The triangular2 learning rate is similar to the triangular learning rate but cuts the max. learning rate in halve after each cycle.

In [5]:
collect_lr = []
batch_step = -1
for e in range(num_epochs):
for i in range(iter_per_ep):
batch_step += 1
cur_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=iter_per_ep*4,
mode='triangular2')

collect_lr.append(cur_lr)

plt.scatter(range(len(collect_lr)), collect_lr)
plt.ylim([0.0, 0.01])
plt.xlim([0, num_epochs*iter_per_ep + 5000])
plt.show()


Exp_range

The exp_range option adds an additional hyperparameter, gamma to decay the learning rate exponentially.

In [6]:
collect_lr = []
batch_step = -1
for e in range(num_epochs):
for i in range(iter_per_ep):
batch_step += 1
cur_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=iter_per_ep*4,
mode='exp_range',
gamma=0.99998)

collect_lr.append(cur_lr)

plt.scatter(range(len(collect_lr)), collect_lr)
plt.ylim([0.0, 0.01])
plt.xlim([0, num_epochs*iter_per_ep + 5000])
plt.show()


Torch Imports¶

In [7]:
import time
import torch
import torch.nn.functional as F
from torchvision import datasets
from torchvision import transforms
from torch.utils.data.sampler import SubsetRandomSampler

if torch.cuda.is_available():
torch.backends.cudnn.deterministic = True


Settings and Dataset¶

In [8]:
##########################
### SETTINGS
##########################

# Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hyperparameters
random_seed = 1
batch_size = 128

# Architecture
num_classes = 10

##########################
### CIFAR-10 DATASET
##########################

# Note transforms.ToTensor() scales input images
# to 0-1 range

## Create a validation dataset
np.random.seed(random_seed)
idx = np.arange(50000) # the size of CIFAR10-train
np.random.shuffle(idx)
val_idx, train_idx = idx[:1000], idx[1000:]
train_sampler = SubsetRandomSampler(train_idx)
val_sampler = SubsetRandomSampler(val_idx)

train_dataset = datasets.CIFAR10(root='data',
train=True,
transform=transforms.ToTensor(),

test_dataset = datasets.CIFAR10(root='data',
train=False,
transform=transforms.ToTensor())

batch_size=batch_size,
# shuffle=True, # Subsetsampler already shuffles
sampler=train_sampler)

batch_size=batch_size,
# shuffle=True,
sampler=val_sampler)

batch_size=batch_size,
shuffle=False)

# Checking the dataset
print('Image batch dimensions:', images.shape)
print('Image label dimensions:', labels.shape)
break

cnt = 0
cnt += images.shape[0]
print('Number of training examples:', cnt)

cnt = 0
cnt += images.shape[0]
print('Number of validation instances:', cnt)

cnt = 0
cnt += images.shape[0]
print('Number of test instances:', cnt)

Files already downloaded and verified
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
Number of training examples: 49000
Number of validation instances: 1000
Number of test instances: 10000


Model¶

Note that this is a very simple convolutional network in this notebook, which is not geared to reach best performance on CIFAR-10 but rather to test the implementation of the cyclical learning rate concept.

In [9]:
##########################
### MODEL
##########################

class ConvNet(torch.nn.Module):

def __init__(self, num_classes):
super(ConvNet, self).__init__()

# (w - k + 2*p)/s + 1 = o
# => p = (s(o-1) - w + k)/2

# 32x32x3 => 32x32x6
self.conv_1 = torch.nn.Conv2d(in_channels=3,
out_channels=6,
kernel_size=(3, 3),
stride=(1, 1),
padding=1) # (1(32-1) - 32 + 3) / 2) = 1
# 32x32x4 => 16x16x6
self.pool_1 = torch.nn.MaxPool2d(kernel_size=(2, 2),
stride=(2, 2),
padding=0) # (2(16-1) - 32 + 2) = 0

# 16x16x6 => 16x16x12
self.conv_2 = torch.nn.Conv2d(in_channels=6,
out_channels=12,
kernel_size=(3, 3),
stride=(1, 1),
padding=1) # (1(16-1) - 16 + 3) / 2 = 1
# 16x16x12 => 8x8x12
self.pool_2 = torch.nn.MaxPool2d(kernel_size=(2, 2),
stride=(2, 2),
padding=0) # (2(8-1) - 16 + 2) = 0

# 8x8x12 => 8x8x18
self.conv_3 = torch.nn.Conv2d(in_channels=12,
out_channels=18,
kernel_size=(3, 3),
stride=(1, 1),
padding=1) # (1(8-1) - 8 + 3) / 2 = 1
# 8x8x18 => 4x4x18
self.pool_3 = torch.nn.MaxPool2d(kernel_size=(2, 2),
stride=(2, 2),
padding=0) # (2(4-1) - 8 + 2) = 0

# 4x4x18 => 4x4x24
self.conv_4 = torch.nn.Conv2d(in_channels=18,
out_channels=24,
kernel_size=(3, 3),
stride=(1, 1),
# 4x4x24 => 2x2x24
self.pool_4 = torch.nn.MaxPool2d(kernel_size=(2, 2),
stride=(2, 2),

# 2x2x24 => 2x2x30
self.conv_5 = torch.nn.Conv2d(in_channels=24,
out_channels=30,
kernel_size=(3, 3),
stride=(1, 1),
# 2x2x30 => 1x1x30
self.pool_5 = torch.nn.MaxPool2d(kernel_size=(2, 2),
stride=(2, 2),

self.linear_1 = torch.nn.Linear(1*1*30, num_classes)

def forward(self, x):
out = self.conv_1(x)
out = F.relu(out)
out = self.pool_1(out)

out = self.conv_2(out)
out = F.relu(out)
out = self.pool_2(out)

out = self.conv_3(out)
out = F.relu(out)
out = self.pool_3(out)

out = self.conv_4(out)
out = F.relu(out)
out = self.pool_4(out)

out = self.conv_5(out)
out = F.relu(out)
out = self.pool_5(out)

logits = self.linear_1(out.view(-1, 1*1*30))
probas = F.softmax(logits, dim=1)
return logits, probas


LR Range Test¶

The LR range test is a simple heuristic that is also described in Smith's paper. Essentially, it's a quick-and-dirty approach to find good values for the base_lr and max_lr (hyperparameters of the cyclical learning rate).

It works as follows:

We run the training for 5-10 epochs and increase the learning rate linearly up to an upper bound. We select the cut-off where the (train or validation) accuracy starts improving as the base_lr for the cyclical learning rate. The max_lr for the cyclical learning rate is determined in a similar manner, by choosing the cut-off value where the accuracy improvements stop, decrease, or widely fluctuate.

Note that we can use the cyclical_learning_rate function to compute the learning rates for the increasing interval by setting step_size=num_epochs*iter_per_ep:

In [10]:
num_epochs = 10

batch_step = -1
collect_lr = []
for e in range(num_epochs):
for i in range(iter_per_ep):
batch_step += 1
cur_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=num_epochs*iter_per_ep)

collect_lr.append(cur_lr)

plt.scatter(range(len(collect_lr)), collect_lr)
plt.ylim([0.0, 0.01])
plt.xlim([0, num_epochs*iter_per_ep + 5000])
plt.show()


Utility Functions

In [11]:
def compute_accuracy(model, data_loader):
correct_pred, num_examples = 0, 0
features = features.to(device)
targets = targets.to(device)
logits, probas = model(features)
_, predicted_labels = torch.max(probas, 1)
num_examples += targets.size(0)
correct_pred += (predicted_labels == targets).sum()
return correct_pred.float()/num_examples * 100


Train Model/Run LR Range Test

In [12]:
#################################
### Setting for this run
#################################

num_epochs = 10
base_lr = 0.01
max_lr = 0.2

#################################
### Init Model
#################################

torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)
model = model.to(device)

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

########################################################################
# Collect the data to be evaluated via the LR Range Test
collect = {'lr': [], 'cost': [], 'train_batch_acc': [], 'val_acc': []}
########################################################################

batch_step = -1
cur_lr = base_lr

start_time = time.time()
for epoch in range(num_epochs):
for batch_idx, (features, targets) in enumerate(train_loader):

batch_step += 1
features = features.to(device)
targets = targets.to(device)

### FORWARD AND BACK PROP
logits, probas = model(features)
cost = cost_fn(logits, targets)

cost.backward()

### UPDATE MODEL PARAMETERS
optimizer.step()

#############################################
# Logging
if not batch_step % 200:
print('Total batch # %5d/%d' % (batch_step,
iter_per_ep*num_epochs),
end='')
print('   Curr. Batch Cost: %.5f' % cost)

#############################################
# Collect stats
model = model.eval()
train_acc = compute_accuracy(model, [[features, targets]])
collect['lr'].append(cur_lr)
collect['train_batch_acc'].append(train_acc)
collect['val_acc'].append(val_acc)
collect['cost'].append(cost)
model = model.train()
#############################################
# update learning rate
cur_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=num_epochs*iter_per_ep,
base_lr=base_lr,
max_lr=max_lr)
for g in optimizer.param_groups:
g['lr'] = cur_lr
############################################

print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))

Total batch #     0/3830   Curr. Batch Cost: 2.31266
Total batch #   200/3830   Curr. Batch Cost: 2.30711
Total batch #   400/3830   Curr. Batch Cost: 2.30392
Total batch #   600/3830   Curr. Batch Cost: 2.30356
Total batch #   800/3830   Curr. Batch Cost: 2.30203
Total batch #  1000/3830   Curr. Batch Cost: 2.30223
Total batch #  1200/3830   Curr. Batch Cost: 2.30101
Total batch #  1400/3830   Curr. Batch Cost: 2.30159
Total batch #  1600/3830   Curr. Batch Cost: 2.25974
Total batch #  1800/3830   Curr. Batch Cost: 2.02467
Total batch #  2000/3830   Curr. Batch Cost: 2.01952
Total batch #  2200/3830   Curr. Batch Cost: 1.90831
Total batch #  2400/3830   Curr. Batch Cost: 1.56817
Total batch #  2600/3830   Curr. Batch Cost: 1.71451
Total batch #  2800/3830   Curr. Batch Cost: 2.13523
Total batch #  3000/3830   Curr. Batch Cost: 1.62590
Total batch #  3200/3830   Curr. Batch Cost: 1.42501
Total batch #  3400/3830   Curr. Batch Cost: 1.62436
Total batch #  3600/3830   Curr. Batch Cost: 1.55984
Total batch #  3800/3830   Curr. Batch Cost: 1.48068

In [13]:
plt.plot(collect['lr'], collect['train_batch_acc'], label='train_batch_acc')
plt.plot(collect['lr'], collect['val_acc'], label='val_acc')
plt.xlabel('Learning Rate')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

plt.plot(collect['lr'], collect['cost'])
plt.xlabel('Learning Rate')
plt.ylabel('Current Batch Cost')
plt.show()


Looking at the graphs above, in particular the validation accuracy, it's not immediately obvious to find the 2 points:

1. Where the accuracy starts increasing
2. Where the accuracy starts dropping or ceases to improve

However, point 1) may be at 0.08-0.09, and point 2) may be at 0.175 or even 0.2 (or even beyond that, if we would keep increasing the learning rate).

Also note that this heuristic is less "clean" as starting the epoch from scratch with each incremental learning rate change, which adds addtional noise to the interpretation (including questions like "by how much did the cost drop/accuracy improve just because of going downhill on the cost surface and the gradients becoming smaller?")

Train with Cyclical Learning Rate (triangular)¶

Below, the triangular (default) cyclical learning rate training procedure is run with a base_lr=0.09 and max_lr=0.175. Based on the LR Range Tests graphs above, a max_lr >= 0.2 may even be reasonable. However, in practice (based on my experience and some trial runs with these settings), such large learning rates would increase convergence problems using a vanilla SGD optimizer (as it is done here).

In [14]:
#################################
### Setting for this run
#################################

num_epochs = 150
base_lr = 0.09
max_lr = 0.175

#################################
### Init Model
#################################

torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)
model = model.to(device)

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

########################################################################
# Collect the data to be evaluated via the LR Range Test
collect = {'epoch': [], 'cost': [], 'train_acc': [], 'val_acc': []}
########################################################################

start_time = time.time()
for epoch in range(num_epochs):
epoch_avg_cost = 0.
model = model.train()
for batch_idx, (features, targets) in enumerate(train_loader):

features = features.to(device)
targets = targets.to(device)

### FORWARD AND BACK PROP
logits, probas = model(features)
cost = cost_fn(logits, targets)

cost.backward()

### UPDATE MODEL PARAMETERS
optimizer.step()

epoch_avg_cost += cost

#############################################
# Logging
if not batch_step % 600:
print('Batch %5d/%d' % (batch_step, iter_per_ep*num_epochs),
end='')
print('   Cost: %.5f' % cost)

#############################################
# Collect stats
model = model.eval()
epoch_avg_cost /= batch_idx+1
collect['epoch'].append(epoch+1)
collect['val_acc'].append(val_acc)
collect['train_acc'].append(train_acc)
collect['cost'].append(epoch_avg_cost / iter_per_ep)

################################################
# Logging
print('Epoch %3d' % (epoch+1), end='')
print('  |  Train/Valid Acc: %.2f/%.2f' % (train_acc, val_acc))

#############################################
# update learning rate
base_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=num_epochs*iter_per_ep,
base_lr=base_lr,
max_lr=max_lr)
for g in optimizer.param_groups:
g['lr'] = base_lr
############################################

print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))

Epoch   1  |  Train/Valid Acc: 10.12/11.50
Epoch   2  |  Train/Valid Acc: 11.56/11.80
Epoch   3  |  Train/Valid Acc: 25.08/23.20
Epoch   4  |  Train/Valid Acc: 29.49/30.20
Epoch   5  |  Train/Valid Acc: 39.37/38.70
Epoch   6  |  Train/Valid Acc: 41.30/39.30
Epoch   7  |  Train/Valid Acc: 39.99/36.40
Epoch   8  |  Train/Valid Acc: 43.76/41.30
Epoch   9  |  Train/Valid Acc: 49.54/47.30
Epoch  10  |  Train/Valid Acc: 50.19/48.70
Epoch  11  |  Train/Valid Acc: 52.60/48.90
Epoch  12  |  Train/Valid Acc: 53.14/52.00
Epoch  13  |  Train/Valid Acc: 55.03/52.20
Epoch  14  |  Train/Valid Acc: 52.50/51.50
Epoch  15  |  Train/Valid Acc: 56.78/54.20
Epoch  16  |  Train/Valid Acc: 57.11/54.60
Epoch  17  |  Train/Valid Acc: 53.47/51.40
Epoch  18  |  Train/Valid Acc: 58.84/54.80
Epoch  19  |  Train/Valid Acc: 60.68/57.80
Epoch  20  |  Train/Valid Acc: 59.67/55.20
Epoch  21  |  Train/Valid Acc: 60.42/55.40
Epoch  22  |  Train/Valid Acc: 62.05/57.10
Epoch  23  |  Train/Valid Acc: 58.89/53.80
Epoch  24  |  Train/Valid Acc: 61.83/58.90
Epoch  25  |  Train/Valid Acc: 64.20/57.50
Epoch  26  |  Train/Valid Acc: 62.74/56.80
Epoch  27  |  Train/Valid Acc: 60.05/56.20
Epoch  28  |  Train/Valid Acc: 61.31/56.20
Epoch  29  |  Train/Valid Acc: 65.26/59.30
Epoch  30  |  Train/Valid Acc: 64.57/59.40
Epoch  31  |  Train/Valid Acc: 59.70/55.20
Epoch  32  |  Train/Valid Acc: 60.00/57.10
Epoch  33  |  Train/Valid Acc: 64.72/59.20
Epoch  34  |  Train/Valid Acc: 60.15/54.80
Epoch  35  |  Train/Valid Acc: 60.73/55.60
Epoch  36  |  Train/Valid Acc: 64.78/59.90
Epoch  37  |  Train/Valid Acc: 64.26/60.10
Epoch  38  |  Train/Valid Acc: 65.21/59.50
Epoch  39  |  Train/Valid Acc: 64.77/56.70
Epoch  40  |  Train/Valid Acc: 63.09/55.20
Epoch  41  |  Train/Valid Acc: 66.14/60.90
Epoch  42  |  Train/Valid Acc: 67.17/62.50
Epoch  43  |  Train/Valid Acc: 60.02/54.30
Epoch  44  |  Train/Valid Acc: 64.91/58.80
Epoch  45  |  Train/Valid Acc: 67.85/60.30
Epoch  46  |  Train/Valid Acc: 64.30/58.10
Epoch  47  |  Train/Valid Acc: 64.16/58.30
Epoch  48  |  Train/Valid Acc: 68.18/62.10
Epoch  49  |  Train/Valid Acc: 62.60/58.40
Epoch  50  |  Train/Valid Acc: 66.92/60.50
Epoch  51  |  Train/Valid Acc: 63.54/57.20
Epoch  52  |  Train/Valid Acc: 68.29/61.20
Epoch  53  |  Train/Valid Acc: 67.00/61.30
Epoch  54  |  Train/Valid Acc: 66.01/60.70
Epoch  55  |  Train/Valid Acc: 66.48/61.50
Epoch  56  |  Train/Valid Acc: 61.47/56.10
Epoch  57  |  Train/Valid Acc: 66.07/60.30
Epoch  58  |  Train/Valid Acc: 67.73/59.40
Epoch  59  |  Train/Valid Acc: 62.46/58.90
Epoch  60  |  Train/Valid Acc: 66.94/61.40
Epoch  61  |  Train/Valid Acc: 69.32/64.10
Epoch  62  |  Train/Valid Acc: 57.43/54.50
Epoch  63  |  Train/Valid Acc: 67.81/60.60
Epoch  64  |  Train/Valid Acc: 64.76/59.30
Epoch  65  |  Train/Valid Acc: 69.20/62.60
Epoch  66  |  Train/Valid Acc: 66.62/61.40
Epoch  67  |  Train/Valid Acc: 64.40/60.00
Epoch  68  |  Train/Valid Acc: 66.40/62.20
Epoch  69  |  Train/Valid Acc: 68.49/62.80
Epoch  70  |  Train/Valid Acc: 66.81/61.30
Epoch  71  |  Train/Valid Acc: 67.85/62.10
Epoch  72  |  Train/Valid Acc: 68.71/62.10
Epoch  73  |  Train/Valid Acc: 66.94/61.90
Epoch  74  |  Train/Valid Acc: 69.00/62.30
Epoch  75  |  Train/Valid Acc: 65.22/61.90
Epoch  76  |  Train/Valid Acc: 65.86/60.70
Epoch  77  |  Train/Valid Acc: 70.45/62.80
Epoch  78  |  Train/Valid Acc: 63.32/57.10
Epoch  79  |  Train/Valid Acc: 68.13/59.80
Epoch  80  |  Train/Valid Acc: 69.84/64.40
Epoch  81  |  Train/Valid Acc: 69.26/63.70
Epoch  82  |  Train/Valid Acc: 66.01/61.60
Epoch  83  |  Train/Valid Acc: 70.93/65.30
Epoch  84  |  Train/Valid Acc: 69.66/62.10
Epoch  85  |  Train/Valid Acc: 65.53/61.50
Epoch  86  |  Train/Valid Acc: 67.92/60.20
Epoch  87  |  Train/Valid Acc: 67.67/63.10
Epoch  88  |  Train/Valid Acc: 64.33/59.40
Epoch  89  |  Train/Valid Acc: 66.37/58.60
Epoch  90  |  Train/Valid Acc: 63.32/56.20
Epoch  91  |  Train/Valid Acc: 67.35/61.30
Epoch  92  |  Train/Valid Acc: 69.12/62.00
Epoch  93  |  Train/Valid Acc: 69.93/62.90
Epoch  94  |  Train/Valid Acc: 66.52/60.70
Epoch  95  |  Train/Valid Acc: 69.41/61.80
Epoch  96  |  Train/Valid Acc: 67.85/62.50
Epoch  97  |  Train/Valid Acc: 70.32/63.60
Epoch  98  |  Train/Valid Acc: 69.32/62.90
Epoch  99  |  Train/Valid Acc: 68.90/60.30
Epoch 100  |  Train/Valid Acc: 69.61/61.80
Epoch 101  |  Train/Valid Acc: 67.63/62.10
Epoch 102  |  Train/Valid Acc: 68.18/61.60
Epoch 103  |  Train/Valid Acc: 71.05/62.40
Epoch 104  |  Train/Valid Acc: 71.27/63.30
Epoch 105  |  Train/Valid Acc: 67.66/62.30
Epoch 106  |  Train/Valid Acc: 70.37/61.60
Epoch 107  |  Train/Valid Acc: 65.84/63.10
Epoch 108  |  Train/Valid Acc: 72.44/63.80
Epoch 109  |  Train/Valid Acc: 71.44/62.50
Epoch 110  |  Train/Valid Acc: 69.01/61.90
Epoch 111  |  Train/Valid Acc: 69.38/60.80
Epoch 112  |  Train/Valid Acc: 70.99/65.00
Epoch 113  |  Train/Valid Acc: 67.42/59.40
Epoch 114  |  Train/Valid Acc: 68.88/61.40
Epoch 115  |  Train/Valid Acc: 70.59/61.90
Epoch 116  |  Train/Valid Acc: 64.71/58.50
Epoch 117  |  Train/Valid Acc: 67.19/61.20
Epoch 118  |  Train/Valid Acc: 68.88/61.70
Epoch 119  |  Train/Valid Acc: 69.34/62.70
Epoch 120  |  Train/Valid Acc: 66.37/62.10
Epoch 121  |  Train/Valid Acc: 66.52/59.90
Epoch 122  |  Train/Valid Acc: 69.42/61.30
Epoch 123  |  Train/Valid Acc: 51.75/47.70
Epoch 124  |  Train/Valid Acc: 70.67/62.90
Epoch 125  |  Train/Valid Acc: 71.86/63.10
Epoch 126  |  Train/Valid Acc: 71.21/63.20
Epoch 127  |  Train/Valid Acc: 72.21/63.60
Epoch 128  |  Train/Valid Acc: 69.04/62.30
Epoch 129  |  Train/Valid Acc: 67.66/59.60
Epoch 130  |  Train/Valid Acc: 69.09/61.50
Epoch 131  |  Train/Valid Acc: 64.01/57.00
Epoch 132  |  Train/Valid Acc: 69.79/61.50
Epoch 133  |  Train/Valid Acc: 66.73/60.80
Epoch 134  |  Train/Valid Acc: 65.47/57.60
Epoch 135  |  Train/Valid Acc: 68.09/59.90
Epoch 136  |  Train/Valid Acc: 64.38/58.50
Epoch 137  |  Train/Valid Acc: 70.52/61.70
Epoch 138  |  Train/Valid Acc: 68.28/61.60
Epoch 139  |  Train/Valid Acc: 67.66/60.60
Epoch 140  |  Train/Valid Acc: 70.20/62.20
Epoch 141  |  Train/Valid Acc: 71.66/63.10
Epoch 142  |  Train/Valid Acc: 64.78/57.40
Epoch 143  |  Train/Valid Acc: 63.49/56.70
Epoch 144  |  Train/Valid Acc: 72.68/63.30
Epoch 145  |  Train/Valid Acc: 70.93/62.50
Epoch 146  |  Train/Valid Acc: 70.51/61.90
Epoch 147  |  Train/Valid Acc: 72.00/61.60
Epoch 148  |  Train/Valid Acc: 69.59/60.70
Epoch 149  |  Train/Valid Acc: 71.24/60.70
Epoch 150  |  Train/Valid Acc: 69.91/59.60

In [15]:
plt.plot(collect['epoch'], collect['train_acc'], label='train_acc')
plt.plot(collect['epoch'], collect['val_acc'], label='val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

plt.plot(collect['epoch'], collect['cost'])
plt.xlabel('Epoch')
plt.ylabel('Avg. Cost Per Epoch')
plt.show()

In [16]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 61.45%


Train with Cyclical Learning Rate (triangular2)¶

In [17]:
#################################
### Setting for this run
#################################

num_epochs = 150
base_lr = 0.09
max_lr = 0.175

#################################
### Init Model
#################################

torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)
model = model.to(device)

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

########################################################################
# Collect the data to be evaluated via the LR Range Test
collect = {'epoch': [], 'cost': [], 'train_acc': [], 'val_acc': []}
########################################################################

start_time = time.time()
for epoch in range(num_epochs):
epoch_avg_cost = 0.
model = model.train()
for batch_idx, (features, targets) in enumerate(train_loader):

features = features.to(device)
targets = targets.to(device)

### FORWARD AND BACK PROP
logits, probas = model(features)
cost = cost_fn(logits, targets)

cost.backward()

### UPDATE MODEL PARAMETERS
optimizer.step()

epoch_avg_cost += cost

#############################################
# Logging
if not batch_step % 600:
print('Batch %5d/%d' % (batch_step, iter_per_ep*num_epochs),
end='')
print('   Cost: %.5f' % cost)

#############################################
# Collect stats
model = model.eval()
epoch_avg_cost /= batch_idx+1
collect['epoch'].append(epoch+1)
collect['val_acc'].append(val_acc)
collect['train_acc'].append(train_acc)
collect['cost'].append(epoch_avg_cost / iter_per_ep)

################################################
# Logging
print('Epoch %3d' % (epoch+1), end='')
print('  |  Train/Valid Acc: %.2f/%.2f' % (train_acc, val_acc))

#############################################
# update learning rate
base_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=num_epochs*iter_per_ep,
base_lr=base_lr,
max_lr=max_lr,
mode='triangular2')
for g in optimizer.param_groups:
g['lr'] = base_lr
############################################

print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))

plt.plot(collect['epoch'], collect['train_acc'], label='train_acc')
plt.plot(collect['epoch'], collect['val_acc'], label='val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

plt.plot(collect['epoch'], collect['cost'])
plt.xlabel('Epoch')
plt.ylabel('Avg. Cost Per Epoch')
plt.show()

Epoch   1  |  Train/Valid Acc: 10.10/11.50
Epoch   2  |  Train/Valid Acc: 11.65/11.70
Epoch   3  |  Train/Valid Acc: 25.36/23.20
Epoch   4  |  Train/Valid Acc: 23.71/25.20
Epoch   5  |  Train/Valid Acc: 39.76/39.30
Epoch   6  |  Train/Valid Acc: 41.19/38.80
Epoch   7  |  Train/Valid Acc: 40.50/37.70
Epoch   8  |  Train/Valid Acc: 44.18/41.50
Epoch   9  |  Train/Valid Acc: 49.93/47.00
Epoch  10  |  Train/Valid Acc: 51.51/49.00
Epoch  11  |  Train/Valid Acc: 52.48/49.50
Epoch  12  |  Train/Valid Acc: 51.53/50.80
Epoch  13  |  Train/Valid Acc: 54.98/52.80
Epoch  14  |  Train/Valid Acc: 47.98/46.50
Epoch  15  |  Train/Valid Acc: 57.34/54.80
Epoch  16  |  Train/Valid Acc: 57.44/54.20
Epoch  17  |  Train/Valid Acc: 55.66/53.80
Epoch  18  |  Train/Valid Acc: 59.47/55.10
Epoch  19  |  Train/Valid Acc: 58.14/55.20
Epoch  20  |  Train/Valid Acc: 58.70/55.30
Epoch  21  |  Train/Valid Acc: 58.28/54.00
Epoch  22  |  Train/Valid Acc: 61.68/58.00
Epoch  23  |  Train/Valid Acc: 59.49/56.80
Epoch  24  |  Train/Valid Acc: 61.95/56.70
Epoch  25  |  Train/Valid Acc: 58.73/54.00
Epoch  26  |  Train/Valid Acc: 58.40/53.50
Epoch  27  |  Train/Valid Acc: 59.80/56.90
Epoch  28  |  Train/Valid Acc: 61.06/56.60
Epoch  29  |  Train/Valid Acc: 59.48/55.10
Epoch  30  |  Train/Valid Acc: 63.96/59.20
Epoch  31  |  Train/Valid Acc: 61.43/56.90
Epoch  32  |  Train/Valid Acc: 64.08/59.30
Epoch  33  |  Train/Valid Acc: 64.98/59.70
Epoch  34  |  Train/Valid Acc: 56.01/51.10
Epoch  35  |  Train/Valid Acc: 64.41/59.80
Epoch  36  |  Train/Valid Acc: 64.44/60.50
Epoch  37  |  Train/Valid Acc: 64.06/59.20
Epoch  38  |  Train/Valid Acc: 63.14/58.60
Epoch  39  |  Train/Valid Acc: 63.49/61.10
Epoch  40  |  Train/Valid Acc: 65.99/60.90
Epoch  41  |  Train/Valid Acc: 64.38/57.70
Epoch  42  |  Train/Valid Acc: 63.68/59.10
Epoch  43  |  Train/Valid Acc: 63.98/58.10
Epoch  44  |  Train/Valid Acc: 65.07/59.50
Epoch  45  |  Train/Valid Acc: 64.54/60.70
Epoch  46  |  Train/Valid Acc: 66.89/62.10
Epoch  47  |  Train/Valid Acc: 63.51/59.10
Epoch  48  |  Train/Valid Acc: 66.54/61.20
Epoch  49  |  Train/Valid Acc: 66.38/60.70
Epoch  50  |  Train/Valid Acc: 66.54/60.20
Epoch  51  |  Train/Valid Acc: 65.31/58.40
Epoch  52  |  Train/Valid Acc: 64.59/60.60
Epoch  53  |  Train/Valid Acc: 66.46/60.70
Epoch  54  |  Train/Valid Acc: 61.76/57.50
Epoch  55  |  Train/Valid Acc: 67.43/62.70
Epoch  56  |  Train/Valid Acc: 66.12/61.10
Epoch  57  |  Train/Valid Acc: 66.77/61.10
Epoch  58  |  Train/Valid Acc: 67.22/60.30
Epoch  59  |  Train/Valid Acc: 66.94/62.30
Epoch  60  |  Train/Valid Acc: 68.32/61.60
Epoch  61  |  Train/Valid Acc: 67.54/62.10
Epoch  62  |  Train/Valid Acc: 65.06/61.10
Epoch  63  |  Train/Valid Acc: 63.50/59.20
Epoch  64  |  Train/Valid Acc: 66.59/60.80
Epoch  65  |  Train/Valid Acc: 69.35/60.90
Epoch  66  |  Train/Valid Acc: 67.44/62.80
Epoch  67  |  Train/Valid Acc: 67.28/62.50
Epoch  68  |  Train/Valid Acc: 68.13/62.00
Epoch  69  |  Train/Valid Acc: 65.02/60.00
Epoch  70  |  Train/Valid Acc: 68.35/63.00
Epoch  71  |  Train/Valid Acc: 61.92/57.70
Epoch  72  |  Train/Valid Acc: 68.02/62.10
Epoch  73  |  Train/Valid Acc: 67.62/61.50
Epoch  74  |  Train/Valid Acc: 68.39/61.50
Epoch  75  |  Train/Valid Acc: 66.11/60.70
Epoch  76  |  Train/Valid Acc: 60.31/56.90
Epoch  77  |  Train/Valid Acc: 67.06/61.60
Epoch  78  |  Train/Valid Acc: 66.69/61.00
Epoch  79  |  Train/Valid Acc: 68.88/62.70
Epoch  80  |  Train/Valid Acc: 53.01/48.80
Epoch  81  |  Train/Valid Acc: 70.58/63.00
Epoch  82  |  Train/Valid Acc: 66.57/60.00
Epoch  83  |  Train/Valid Acc: 62.87/57.30
Epoch  84  |  Train/Valid Acc: 69.49/61.50
Epoch  85  |  Train/Valid Acc: 66.03/60.40
Epoch  86  |  Train/Valid Acc: 68.34/63.10
Epoch  87  |  Train/Valid Acc: 69.02/60.90
Epoch  88  |  Train/Valid Acc: 65.63/60.30
Epoch  89  |  Train/Valid Acc: 62.16/56.80
Epoch  90  |  Train/Valid Acc: 58.92/56.50
Epoch  91  |  Train/Valid Acc: 70.52/63.90
Epoch  92  |  Train/Valid Acc: 69.29/62.90
Epoch  93  |  Train/Valid Acc: 69.67/62.70
Epoch  94  |  Train/Valid Acc: 69.38/62.00
Epoch  95  |  Train/Valid Acc: 68.55/62.00
Epoch  96  |  Train/Valid Acc: 69.87/63.00
Epoch  97  |  Train/Valid Acc: 67.04/60.20
Epoch  98  |  Train/Valid Acc: 64.95/60.00
Epoch  99  |  Train/Valid Acc: 67.18/61.30
Epoch 100  |  Train/Valid Acc: 69.53/60.40
Epoch 101  |  Train/Valid Acc: 68.25/62.00
Epoch 102  |  Train/Valid Acc: 66.14/60.40
Epoch 103  |  Train/Valid Acc: 70.77/63.10
Epoch 104  |  Train/Valid Acc: 66.72/58.20
Epoch 105  |  Train/Valid Acc: 67.28/61.10
Epoch 106  |  Train/Valid Acc: 69.26/61.80
Epoch 107  |  Train/Valid Acc: 70.49/63.00
Epoch 108  |  Train/Valid Acc: 68.44/61.50
Epoch 109  |  Train/Valid Acc: 69.62/62.20
Epoch 110  |  Train/Valid Acc: 65.51/57.40
Epoch 111  |  Train/Valid Acc: 67.62/60.60
Epoch 112  |  Train/Valid Acc: 68.98/62.60
Epoch 113  |  Train/Valid Acc: 67.35/61.70
Epoch 114  |  Train/Valid Acc: 63.91/57.20
Epoch 115  |  Train/Valid Acc: 69.48/59.50
Epoch 116  |  Train/Valid Acc: 67.54/60.90
Epoch 117  |  Train/Valid Acc: 64.29/58.20
Epoch 118  |  Train/Valid Acc: 68.95/61.50
Epoch 119  |  Train/Valid Acc: 69.82/60.40
Epoch 120  |  Train/Valid Acc: 68.28/60.70
Epoch 121  |  Train/Valid Acc: 67.40/58.80
Epoch 122  |  Train/Valid Acc: 68.32/61.40
Epoch 123  |  Train/Valid Acc: 71.35/61.70
Epoch 124  |  Train/Valid Acc: 69.96/60.80
Epoch 125  |  Train/Valid Acc: 69.93/61.90
Epoch 126  |  Train/Valid Acc: 70.48/61.20
Epoch 127  |  Train/Valid Acc: 65.93/58.40
Epoch 128  |  Train/Valid Acc: 66.86/61.10
Epoch 129  |  Train/Valid Acc: 69.40/60.50
Epoch 130  |  Train/Valid Acc: 71.33/61.00
Epoch 131  |  Train/Valid Acc: 70.79/61.50
Epoch 132  |  Train/Valid Acc: 67.92/60.80
Epoch 133  |  Train/Valid Acc: 68.64/61.50
Epoch 134  |  Train/Valid Acc: 65.79/59.10
Epoch 135  |  Train/Valid Acc: 69.58/62.90
Epoch 136  |  Train/Valid Acc: 69.36/62.00
Epoch 137  |  Train/Valid Acc: 65.36/61.50
Epoch 138  |  Train/Valid Acc: 67.90/60.50
Epoch 139  |  Train/Valid Acc: 66.31/58.10
Epoch 140  |  Train/Valid Acc: 71.86/63.60
Epoch 141  |  Train/Valid Acc: 63.20/58.50
Epoch 142  |  Train/Valid Acc: 68.61/59.60
Epoch 143  |  Train/Valid Acc: 68.63/60.70
Epoch 144  |  Train/Valid Acc: 69.86/61.70
Epoch 145  |  Train/Valid Acc: 65.65/60.60
Epoch 146  |  Train/Valid Acc: 69.74/61.10
Epoch 147  |  Train/Valid Acc: 68.24/59.10
Epoch 148  |  Train/Valid Acc: 66.41/58.70
Epoch 149  |  Train/Valid Acc: 63.01/57.50
Epoch 150  |  Train/Valid Acc: 70.22/62.70

In [18]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 61.69%


Train with Cyclical Learning Rate (exp_range)¶

In [19]:
#################################
### Setting for this run
#################################

num_epochs = 150
base_lr = 0.09
max_lr = 0.175

#################################
### Init Model
#################################

torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)
model = model.to(device)

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

########################################################################
# Collect the data to be evaluated via the LR Range Test
collect = {'epoch': [], 'cost': [], 'train_acc': [], 'val_acc': []}
########################################################################

start_time = time.time()
for epoch in range(num_epochs):
epoch_avg_cost = 0.
model = model.train()
for batch_idx, (features, targets) in enumerate(train_loader):

features = features.to(device)
targets = targets.to(device)

### FORWARD AND BACK PROP
logits, probas = model(features)
cost = cost_fn(logits, targets)

cost.backward()

### UPDATE MODEL PARAMETERS
optimizer.step()

epoch_avg_cost += cost

#############################################
# Logging
if not batch_step % 600:
print('Batch %5d/%d' % (batch_step, iter_per_ep*num_epochs),
end='')
print('   Cost: %.5f' % cost)

#############################################
# Collect stats
model = model.eval()
epoch_avg_cost /= batch_idx+1
collect['epoch'].append(epoch+1)
collect['val_acc'].append(val_acc)
collect['train_acc'].append(train_acc)
collect['cost'].append(epoch_avg_cost / iter_per_ep)

################################################
# Logging
print('Epoch %3d' % (epoch+1), end='')
print('  |  Train/Valid Acc: %.2f/%.2f' % (train_acc, val_acc))

#############################################
# update learning rate
base_lr = cyclical_learning_rate(batch_step=batch_step,
step_size=num_epochs*iter_per_ep,
base_lr=base_lr,
max_lr=max_lr,
mode='exp_range')
for g in optimizer.param_groups:
g['lr'] = base_lr
############################################

print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))

plt.plot(collect['epoch'], collect['train_acc'], label='train_acc')
plt.plot(collect['epoch'], collect['val_acc'], label='val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

plt.plot(collect['epoch'], collect['cost'])
plt.xlabel('Epoch')
plt.ylabel('Avg. Cost Per Epoch')
plt.show()

Epoch   1  |  Train/Valid Acc: 10.17/11.60
Epoch   2  |  Train/Valid Acc: 11.49/11.70
Epoch   3  |  Train/Valid Acc: 25.78/23.30
Epoch   4  |  Train/Valid Acc: 26.36/27.90
Epoch   5  |  Train/Valid Acc: 39.37/39.00
Epoch   6  |  Train/Valid Acc: 42.89/40.80
Epoch   7  |  Train/Valid Acc: 39.37/36.00
Epoch   8  |  Train/Valid Acc: 45.47/44.20
Epoch   9  |  Train/Valid Acc: 50.02/48.40
Epoch  10  |  Train/Valid Acc: 50.26/47.70
Epoch  11  |  Train/Valid Acc: 51.07/50.40
Epoch  12  |  Train/Valid Acc: 52.54/51.00
Epoch  13  |  Train/Valid Acc: 56.30/54.70
Epoch  14  |  Train/Valid Acc: 53.49/51.60
Epoch  15  |  Train/Valid Acc: 57.27/53.60
Epoch  16  |  Train/Valid Acc: 54.77/52.10
Epoch  17  |  Train/Valid Acc: 54.57/50.60
Epoch  18  |  Train/Valid Acc: 56.87/52.60
Epoch  19  |  Train/Valid Acc: 60.48/58.10
Epoch  20  |  Train/Valid Acc: 57.57/55.80
Epoch  21  |  Train/Valid Acc: 59.53/55.70
Epoch  22  |  Train/Valid Acc: 61.76/58.70
Epoch  23  |  Train/Valid Acc: 56.61/54.20
Epoch  24  |  Train/Valid Acc: 60.27/58.40
Epoch  25  |  Train/Valid Acc: 62.83/58.20
Epoch  26  |  Train/Valid Acc: 63.13/58.80
Epoch  27  |  Train/Valid Acc: 51.37/47.80
Epoch  28  |  Train/Valid Acc: 63.18/59.70
Epoch  29  |  Train/Valid Acc: 63.26/58.50
Epoch  30  |  Train/Valid Acc: 59.02/57.40
Epoch  31  |  Train/Valid Acc: 62.49/58.00
Epoch  32  |  Train/Valid Acc: 64.12/57.00
Epoch  33  |  Train/Valid Acc: 64.96/59.50
Epoch  34  |  Train/Valid Acc: 59.77/55.10
Epoch  35  |  Train/Valid Acc: 64.57/59.50
Epoch  36  |  Train/Valid Acc: 61.37/58.30
Epoch  37  |  Train/Valid Acc: 65.78/61.20
Epoch  38  |  Train/Valid Acc: 64.57/59.20
Epoch  39  |  Train/Valid Acc: 65.71/60.10
Epoch  40  |  Train/Valid Acc: 64.29/60.10
Epoch  41  |  Train/Valid Acc: 63.83/58.10
Epoch  42  |  Train/Valid Acc: 67.49/59.90
Epoch  43  |  Train/Valid Acc: 58.37/55.30
Epoch  44  |  Train/Valid Acc: 66.92/62.40
Epoch  45  |  Train/Valid Acc: 66.80/62.10
Epoch  46  |  Train/Valid Acc: 64.27/56.40
Epoch  47  |  Train/Valid Acc: 61.46/57.00
Epoch  48  |  Train/Valid Acc: 67.49/60.00
Epoch  49  |  Train/Valid Acc: 66.78/60.60
Epoch  50  |  Train/Valid Acc: 67.60/61.40
Epoch  51  |  Train/Valid Acc: 61.56/56.70
Epoch  52  |  Train/Valid Acc: 68.42/60.00
Epoch  53  |  Train/Valid Acc: 65.98/60.40
Epoch  54  |  Train/Valid Acc: 64.53/59.60
Epoch  55  |  Train/Valid Acc: 67.92/61.50
Epoch  56  |  Train/Valid Acc: 66.49/59.70
Epoch  57  |  Train/Valid Acc: 60.37/54.90
Epoch  58  |  Train/Valid Acc: 66.04/60.20
Epoch  59  |  Train/Valid Acc: 64.52/58.00
Epoch  60  |  Train/Valid Acc: 69.13/62.10
Epoch  61  |  Train/Valid Acc: 65.04/59.70
Epoch  62  |  Train/Valid Acc: 68.39/60.10
Epoch  63  |  Train/Valid Acc: 64.84/58.80
Epoch  64  |  Train/Valid Acc: 68.32/60.50
Epoch  65  |  Train/Valid Acc: 68.29/62.70
Epoch  66  |  Train/Valid Acc: 67.53/60.50
Epoch  67  |  Train/Valid Acc: 68.81/63.40
Epoch  68  |  Train/Valid Acc: 69.23/61.50
Epoch  69  |  Train/Valid Acc: 66.75/61.90
Epoch  70  |  Train/Valid Acc: 64.38/57.70
Epoch  71  |  Train/Valid Acc: 68.63/62.30
Epoch  72  |  Train/Valid Acc: 68.43/62.80
Epoch  73  |  Train/Valid Acc: 70.29/63.00
Epoch  74  |  Train/Valid Acc: 67.50/60.50
Epoch  75  |  Train/Valid Acc: 67.02/61.50
Epoch  76  |  Train/Valid Acc: 65.49/58.10
Epoch  77  |  Train/Valid Acc: 70.99/62.70
Epoch  78  |  Train/Valid Acc: 67.99/61.40
Epoch  79  |  Train/Valid Acc: 70.69/63.90
Epoch  80  |  Train/Valid Acc: 67.63/62.20
Epoch  81  |  Train/Valid Acc: 71.34/62.70
Epoch  82  |  Train/Valid Acc: 68.85/63.10
Epoch  83  |  Train/Valid Acc: 69.36/60.50
Epoch  84  |  Train/Valid Acc: 69.17/61.40
Epoch  85  |  Train/Valid Acc: 68.96/60.90
Epoch  86  |  Train/Valid Acc: 67.52/61.60
Epoch  87  |  Train/Valid Acc: 67.50/60.30
Epoch  88  |  Train/Valid Acc: 64.41/59.60
Epoch  89  |  Train/Valid Acc: 67.42/60.40
Epoch  90  |  Train/Valid Acc: 68.84/63.70
Epoch  91  |  Train/Valid Acc: 69.34/62.00
Epoch  92  |  Train/Valid Acc: 70.38/63.10
Epoch  93  |  Train/Valid Acc: 70.51/63.40
Epoch  94  |  Train/Valid Acc: 67.36/59.90
Epoch  95  |  Train/Valid Acc: 70.43/61.50
Epoch  96  |  Train/Valid Acc: 71.22/62.80
Epoch  97  |  Train/Valid Acc: 66.62/60.40
Epoch  98  |  Train/Valid Acc: 67.72/60.20
Epoch  99  |  Train/Valid Acc: 69.91/62.30
Epoch 100  |  Train/Valid Acc: 67.40/60.50
Epoch 101  |  Train/Valid Acc: 68.86/61.90
Epoch 102  |  Train/Valid Acc: 66.22/61.00
Epoch 103  |  Train/Valid Acc: 63.31/56.20
Epoch 104  |  Train/Valid Acc: 66.99/60.30
Epoch 105  |  Train/Valid Acc: 68.20/63.50
Epoch 106  |  Train/Valid Acc: 62.79/58.40
Epoch 107  |  Train/Valid Acc: 70.71/61.60
Epoch 108  |  Train/Valid Acc: 71.60/61.20
Epoch 109  |  Train/Valid Acc: 71.00/64.50
Epoch 110  |  Train/Valid Acc: 67.55/61.00
Epoch 111  |  Train/Valid Acc: 68.52/61.40
Epoch 112  |  Train/Valid Acc: 65.78/58.70
Epoch 113  |  Train/Valid Acc: 65.15/57.90
Epoch 114  |  Train/Valid Acc: 70.42/61.70
Epoch 115  |  Train/Valid Acc: 70.57/61.50
Epoch 116  |  Train/Valid Acc: 71.08/62.00
Epoch 117  |  Train/Valid Acc: 69.09/62.20
Epoch 118  |  Train/Valid Acc: 70.03/63.00
Epoch 119  |  Train/Valid Acc: 69.64/62.30
Epoch 120  |  Train/Valid Acc: 70.66/64.50
Epoch 121  |  Train/Valid Acc: 70.62/63.70
Epoch 122  |  Train/Valid Acc: 69.05/61.90
Epoch 123  |  Train/Valid Acc: 70.63/60.50
Epoch 124  |  Train/Valid Acc: 70.81/61.90
Epoch 125  |  Train/Valid Acc: 67.99/59.90
Epoch 126  |  Train/Valid Acc: 68.54/61.10
Epoch 127  |  Train/Valid Acc: 70.59/62.40
Epoch 128  |  Train/Valid Acc: 70.12/61.40
Epoch 129  |  Train/Valid Acc: 68.87/59.50
Epoch 130  |  Train/Valid Acc: 66.19/60.00
Epoch 131  |  Train/Valid Acc: 70.92/62.20
Epoch 132  |  Train/Valid Acc: 68.28/59.90
Epoch 133  |  Train/Valid Acc: 67.72/62.00
Epoch 134  |  Train/Valid Acc: 73.66/64.90
Epoch 135  |  Train/Valid Acc: 70.24/59.40
Epoch 136  |  Train/Valid Acc: 70.25/60.40
Epoch 137  |  Train/Valid Acc: 70.21/62.20
Epoch 138  |  Train/Valid Acc: 69.07/60.10
Epoch 139  |  Train/Valid Acc: 68.56/60.40
Epoch 140  |  Train/Valid Acc: 67.09/61.40
Epoch 141  |  Train/Valid Acc: 70.68/60.10
Epoch 142  |  Train/Valid Acc: 65.23/59.60
Epoch 143  |  Train/Valid Acc: 67.44/60.00
Epoch 144  |  Train/Valid Acc: 71.34/64.20
Epoch 145  |  Train/Valid Acc: 69.61/61.50
Epoch 146  |  Train/Valid Acc: 72.49/62.40
Epoch 147  |  Train/Valid Acc: 68.43/59.50
Epoch 148  |  Train/Valid Acc: 54.68/47.80
Epoch 149  |  Train/Valid Acc: 67.08/61.50
Epoch 150  |  Train/Valid Acc: 68.17/59.50

In [20]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 59.81%

In [21]:
%watermark -iv

torch       1.0.0
matplotlib  3.0.2
torchvision 0.2.1
numpy       1.15.4