Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka
CPython 3.7.3
IPython 7.6.1
torch 1.2.0
This notebook implements a very basic graph neural network (GNN) using a spectral graph convolution.
Here, each 28x28 MNIST digit image represents a graph, where each pixel (i.e., cell in the grid) is a node. The feature of a node is simply its pixel intensity, scaled to the range [0, 1].
The adjacency matrix over the pixels is determined by their spatial neighborhood: using a Gaussian kernel, pixels are connected with an edge weight based on their Euclidean distance in the grid.
In the related notebook, ./gnn-basic-1.ipynb, we used this adjacency matrix $A$ to compute the output of a layer as
$$X^{(l+1)}=A X^{(l)} W^{(l)}.$$
Here, $A$ is the $N \times N$ adjacency matrix, and $X^{(l)}$ is the $N \times C$ feature matrix, where $N$ is the total number of nodes ($28 \times 28 = 784$ pixels in MNIST) and $C$ is the number of features per node. $W^{(l)}$ is the weight matrix of shape $C \times P$, where $P$ is the number of output features per node -- e.g., the number of classes if we have only a single layer.
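As a shape-level sketch of this propagation rule (NumPy only; `A`, `X`, and `W` are random placeholders, not trained values), the layer amounts to two matrix products:

```python
import numpy as np

N, C, P = 784, 1, 10  # nodes (28*28 pixels), input features, output features
rng = np.random.default_rng(0)

A = rng.random((N, N))   # placeholder adjacency matrix (N x N)
X = rng.random((N, C))   # node features (N x C): one intensity per pixel
W = rng.random((C, P))   # placeholder layer weights (C x P)

# X_next = A X W: aggregate over neighbors, then mix features
X_next = A @ X @ W
assert X_next.shape == (N, P)
```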
In this notebook, we modify this approach to use a spectral graph convolution instead, i.e.,
$$X^{(l+1)}=V\left(V^{T} X^{(l)} \odot V^{T} W_{\text {spectral }}^{(l)}\right),$$
where $V$ contains the eigenvectors of the graph Laplacian $L$, which we can compute from the adjacency matrix $A$, $\odot$ denotes element-wise multiplication, and $W_{\text {spectral }}$ represents the trainable weights (filters).
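To make the operations concrete, here is a NumPy-only sketch on a small random graph (all matrices are illustrative placeholders): transform features and filters into the eigenbasis of the Laplacian, multiply element-wise, and transform back.

```python
import numpy as np

N, F = 16, 2  # nodes, filters
rng = np.random.default_rng(0)

# symmetric placeholder adjacency and its normalized Laplacian
A = rng.random((N, N))
A = (A + A.T) / 2
D_hat = np.diag((A.sum(0) + 1e-5) ** -0.5)
L = np.eye(N) - D_hat @ A @ D_hat

# eigenvectors of L form the graph Fourier basis
_, V = np.linalg.eigh(L)

X = rng.random((N, 1))           # node features
W_spectral = rng.random((N, F))  # filters in the spatial domain

X_hat = V.T @ X                  # features in the spectral domain, (N, 1)
W_hat = V.T @ W_spectral         # filters in the spectral domain, (N, F)
Y = V @ (X_hat * W_hat)          # back to the spatial domain, (N, F)
assert Y.shape == (N, F)
```

Because $L$ is symmetric, `eigh` returns an orthonormal $V$, so $V V^T = I$ and the transform is invertible.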
import time
import numpy as np
from scipy.spatial.distance import cdist
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.utils.data.dataset import Subset
if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True
%matplotlib inline
import matplotlib.pyplot as plt
##########################
### SETTINGS
##########################
# Device
DEVICE = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")
# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.05
NUM_EPOCHS = 50
BATCH_SIZE = 128
IMG_SIZE = 28
# Architecture
NUM_CLASSES = 10
train_indices = torch.arange(0, 59000)
valid_indices = torch.arange(59000, 60000)
custom_transform = transforms.Compose([transforms.ToTensor()])
train_and_valid = datasets.MNIST(root='data',
                                 train=True,
                                 transform=custom_transform,
                                 download=True)
test_dataset = datasets.MNIST(root='data',
                              train=False,
                              transform=custom_transform,
                              download=True)
train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)
train_loader = DataLoader(dataset=train_dataset,
                          batch_size=BATCH_SIZE,
                          num_workers=4,
                          shuffle=True)
valid_loader = DataLoader(dataset=valid_dataset,
                          batch_size=BATCH_SIZE,
                          num_workers=4,
                          shuffle=False)
test_loader = DataLoader(dataset=test_dataset,
                         batch_size=BATCH_SIZE,
                         num_workers=4,
                         shuffle=False)
# Checking the dataset
for images, labels in train_loader:
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Image batch dimensions: torch.Size([128, 1, 28, 28])
Image label dimensions: torch.Size([128])
def precompute_adjacency_matrix(img_size):
    col, row = np.meshgrid(np.arange(img_size), np.arange(img_size))

    # N = img_size^2
    # construct 2D coordinate array (shape N x 2) and normalize
    # in range [0, 1]
    coord = np.stack((col, row), axis=2).reshape(-1, 2) / img_size

    # compute pairwise distance matrix (N x N)
    dist = cdist(coord, coord, metric='euclidean')

    # Apply Gaussian filter
    sigma = 0.05 * np.pi
    A = np.exp(- dist / sigma ** 2)
    A[A < 0.01] = 0
    A = torch.from_numpy(A).float()
    return A

    """
    # Alternative (unused here):
    # Normalization as per (Kipf & Welling, ICLR 2017)
    D = A.sum(1)  # node degrees (N,)
    D_hat = (D + 1e-5) ** (-0.5)
    A_hat = D_hat.view(-1, 1) * A * D_hat.view(1, -1)  # N,N
    return A_hat
    """
def get_graph_laplacian(A):
    # From https://towardsdatascience.com/spectral-graph-convolution-
    # explained-and-implemented-step-by-step-2e495b57f801
    #
    # Computing the graph Laplacian
    # A is an adjacency matrix of some graph G
    N = A.shape[0]  # number of nodes in a graph
    D = np.sum(A, 0)  # node degrees
    D_hat = np.diag((D + 1e-5)**(-0.5))  # normalized node degrees
    L = np.identity(N) - np.dot(D_hat, A).dot(D_hat)  # Laplacian
    return torch.from_numpy(L).float()
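A small sanity check of this function's construction (a toy path graph stands in for the pixel grid, hand-built here so the snippet is self-contained): the eigenvalues of the symmetric normalized Laplacian always lie in the range [0, 2].

```python
import numpy as np

# path graph 0-1-2 as a tiny stand-in for the pixel grid
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D_hat = np.diag((A.sum(0) + 1e-5) ** (-0.5))
L = np.identity(3) - np.dot(D_hat, A).dot(D_hat)

eigvals = np.linalg.eigvalsh(L)
assert eigvals.min() >= -1e-6        # smallest eigenvalue is ~0
assert eigvals.max() <= 2.0 + 1e-6   # largest eigenvalue is at most 2
```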
A = precompute_adjacency_matrix(28)
plt.imshow(A, vmin=0., vmax=1.)
plt.colorbar()
plt.show()
L = get_graph_laplacian(A.numpy())
plt.imshow(L, vmin=0., vmax=1.)
plt.colorbar()
plt.show()
##########################
### MODEL
##########################
from scipy.sparse.linalg import eigsh
class GraphNet(nn.Module):
    def __init__(self, img_size=28, num_filters=2, num_classes=10):
        super(GraphNet, self).__init__()

        n_rows = img_size**2
        self.fc = nn.Linear(n_rows*num_filters, num_classes, bias=False)

        A = precompute_adjacency_matrix(img_size)
        L = get_graph_laplacian(A.numpy())

        # eigen-decomposition (i.e., find Λ, V); keep only 20 eigenvectors
        Λ, V = eigsh(L.numpy(), k=20, which='SM')
        V = torch.from_numpy(V)

        # Weight matrix
        W_spectral = nn.Parameter(torch.ones((img_size**2, num_filters))).float()
        torch.nn.init.kaiming_uniform_(W_spectral)

        # note: W_spectral is registered as a buffer, so it is excluded from
        # model.parameters() and is not updated by the optimizer; only the
        # fully connected layer is trained
        self.register_buffer('A', A)
        self.register_buffer('L', L)
        self.register_buffer('V', V)
        self.register_buffer('W_spectral', W_spectral)

    def forward(self, x):
        B = x.size(0)  # Batch size

        ### Reshape eigenvectors
        # from [H*W, 20] to [B, H*W, 20]
        V_tensor = self.V.unsqueeze(0)
        V_tensor = V_tensor.expand(B, -1, -1)
        # from [H*W, 20] to [B, 20, H*W]
        V_tensor_T = self.V.T.unsqueeze(0)
        V_tensor_T = V_tensor_T.expand(B, -1, -1)

        ### Reshape inputs
        # [B, C, H, W] => [B, H*W, 1]
        x_reshape = x.view(B, -1, 1)

        ### Reshape spectral weights
        # from [H*W, F] to [B, H*W, F]
        W_spectral_tensor = self.W_spectral.unsqueeze(0)
        W_spectral_tensor = W_spectral_tensor.expand(B, -1, -1)

        ### Spectral convolution on graphs
        # [B, 20, H*W] . [B, H*W, 1] ==> [B, 20, 1]
        X_hat = V_tensor_T.bmm(x_reshape)  # 20×1 node features in the "spectral" domain
        W_hat = V_tensor_T.bmm(W_spectral_tensor)  # 20×F filters in the "spectral" domain
        Y = V_tensor.bmm(X_hat * W_hat)  # N×F result of the convolution

        ### Fully connected
        logits = self.fc(Y.reshape(B, -1))
        probas = F.softmax(logits, dim=1)
        return logits, probas
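The `unsqueeze`/`expand` pattern in `forward` broadcasts the shared eigenvector and weight matrices across the batch without copying memory. A minimal standalone sketch of the same batched spectral convolution (random tensors; small sizes chosen purely for illustration):

```python
import torch

B, N, K, F = 4, 9, 5, 2  # batch, nodes, eigenvectors kept, filters
V = torch.randn(N, K)    # placeholder eigenvector matrix
x = torch.randn(B, N, 1) # flattened image batch
W = torch.randn(N, F)    # placeholder spectral weights

V_b = V.unsqueeze(0).expand(B, -1, -1)     # [B, N, K], no data copy
V_bT = V.T.unsqueeze(0).expand(B, -1, -1)  # [B, K, N]
W_b = W.unsqueeze(0).expand(B, -1, -1)     # [B, N, F]

X_hat = V_bT.bmm(x)          # [B, K, 1] features in the spectral domain
W_hat = V_bT.bmm(W_b)        # [B, K, F] filters in the spectral domain
Y = V_b.bmm(X_hat * W_hat)   # [B, N, F] back in the spatial domain
assert Y.shape == (B, N, F)
```

Note that `expand` on a 2D tensor cannot add a leading batch dimension by itself; the `unsqueeze(0)` first is what makes the `-1` sizes refer to existing dimensions.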
torch.manual_seed(RANDOM_SEED)
model = GraphNet(img_size=IMG_SIZE, num_classes=NUM_CLASSES)
model = model.to(DEVICE)
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
def compute_acc(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for features, targets in data_loader:
        features = features.to(device)
        targets = targets.to(device)
        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
start_time = time.time()
cost_list = []
train_acc_list, valid_acc_list = [], []
for epoch in range(NUM_EPOCHS):
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.to(DEVICE)
        targets = targets.to(DEVICE)

        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        cost.backward()

        ### UPDATE MODEL PARAMETERS
        optimizer.step()

        #################################################
        ### CODE ONLY FOR LOGGING BEYOND THIS POINT
        #################################################
        cost_list.append(cost.item())
        if not batch_idx % 150:
            print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                  f'Batch {batch_idx:03d}/{len(train_loader):03d} |'
                  f' Cost: {cost:.4f}')

    model.eval()
    with torch.set_grad_enabled(False):  # save memory during inference
        train_acc = compute_acc(model, train_loader, device=DEVICE)
        valid_acc = compute_acc(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d}\n'
              f'Train ACC: {train_acc:.2f} | Validation ACC: {valid_acc:.2f}')
        train_acc_list.append(train_acc)
        valid_acc_list.append(valid_acc)

    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')

elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/050 | Batch 000/461 | Cost: 2.3133 | Batch 150/461 | Cost: 1.1899 | Batch 300/461 | Cost: 1.0481 | Batch 450/461 | Cost: 0.9287 | Train ACC: 73.79 | Validation ACC: 78.10 | Time elapsed: 0.07 min
Epoch: 002/050 | Batch 000/461 | Cost: 0.8224 | Batch 150/461 | Cost: 0.9684 | Batch 300/461 | Cost: 0.6952 | Batch 450/461 | Cost: 0.8158 | Train ACC: 77.48 | Validation ACC: 82.20 | Time elapsed: 0.14 min
Epoch: 003/050 | Batch 000/461 | Cost: 0.8203 | Batch 150/461 | Cost: 0.8409 | Batch 300/461 | Cost: 0.8602 | Batch 450/461 | Cost: 0.7012 | Train ACC: 78.55 | Validation ACC: 83.40 | Time elapsed: 0.21 min
Epoch: 004/050 | Batch 000/461 | Cost: 0.7919 | Batch 150/461 | Cost: 0.9010 | Batch 300/461 | Cost: 0.6895 | Batch 450/461 | Cost: 0.6981 | Train ACC: 79.30 | Validation ACC: 84.10 | Time elapsed: 0.28 min
Epoch: 005/050 | Batch 000/461 | Cost: 0.6080 | Batch 150/461 | Cost: 0.6627 | Batch 300/461 | Cost: 0.7620 | Batch 450/461 | Cost: 0.8047 | Train ACC: 79.66 | Validation ACC: 84.50 | Time elapsed: 0.35 min
Epoch: 006/050 | Batch 000/461 | Cost: 0.5992 | Batch 150/461 | Cost: 0.5546 | Batch 300/461 | Cost: 0.6459 | Batch 450/461 | Cost: 0.5968 | Train ACC: 79.91 | Validation ACC: 85.10 | Time elapsed: 0.42 min
Epoch: 007/050 | Batch 000/461 | Cost: 0.7909 | Batch 150/461 | Cost: 0.6488 | Batch 300/461 | Cost: 0.7580 | Batch 450/461 | Cost: 0.5646 | Train ACC: 80.50 | Validation ACC: 85.00 | Time elapsed: 0.48 min
Epoch: 008/050 | Batch 000/461 | Cost: 0.6147 | Batch 150/461 | Cost: 0.6998 | Batch 300/461 | Cost: 0.5563 | Batch 450/461 | Cost: 0.5611 | Train ACC: 80.73 | Validation ACC: 85.60 | Time elapsed: 0.56 min
Epoch: 009/050 | Batch 000/461 | Cost: 0.5629 | Batch 150/461 | Cost: 0.6245 | Batch 300/461 | Cost: 0.7393 | Batch 450/461 | Cost: 0.6670 | Train ACC: 81.09 | Validation ACC: 85.70 | Time elapsed: 0.62 min
Epoch: 010/050 | Batch 000/461 | Cost: 0.6582 | Batch 150/461 | Cost: 0.7550 | Batch 300/461 | Cost: 0.7028 | Batch 450/461 | Cost: 0.6558 | Train ACC: 81.00 | Validation ACC: 85.70 | Time elapsed: 0.69 min
Epoch: 011/050 | Batch 000/461 | Cost: 0.5472 | Batch 150/461 | Cost: 0.6051 | Batch 300/461 | Cost: 0.5875 | Batch 450/461 | Cost: 0.4688 | Train ACC: 81.50 | Validation ACC: 85.90 | Time elapsed: 0.76 min
Epoch: 012/050 | Batch 000/461 | Cost: 0.5227 | Batch 150/461 | Cost: 0.6252 | Batch 300/461 | Cost: 0.6359 | Batch 450/461 | Cost: 0.8590 | Train ACC: 81.61 | Validation ACC: 86.50 | Time elapsed: 0.83 min
Epoch: 013/050 | Batch 000/461 | Cost: 0.4933 | Batch 150/461 | Cost: 0.5844 | Batch 300/461 | Cost: 0.4684 | Batch 450/461 | Cost: 0.5275 | Train ACC: 81.79 | Validation ACC: 86.50 | Time elapsed: 0.90 min
Epoch: 014/050 | Batch 000/461 | Cost: 0.6382 | Batch 150/461 | Cost: 0.7612 | Batch 300/461 | Cost: 0.5378 | Batch 450/461 | Cost: 0.5651 | Train ACC: 81.94 | Validation ACC: 86.50 | Time elapsed: 0.97 min
Epoch: 015/050 | Batch 000/461 | Cost: 0.5122 | Batch 150/461 | Cost: 0.6347 | Batch 300/461 | Cost: 0.6239 | Batch 450/461 | Cost: 0.6026 | Train ACC: 82.01 | Validation ACC: 87.00 | Time elapsed: 1.03 min
Epoch: 016/050 | Batch 000/461 | Cost: 0.6380 | Batch 150/461 | Cost: 0.5865 | Batch 300/461 | Cost: 0.3510 | Batch 450/461 | Cost: 0.5859 | Train ACC: 82.06 | Validation ACC: 86.50 | Time elapsed: 1.10 min
Epoch: 017/050 | Batch 000/461 | Cost: 0.6827 | Batch 150/461 | Cost: 0.6415 | Batch 300/461 | Cost: 0.7186 | Batch 450/461 | Cost: 0.6067 | Train ACC: 82.41 | Validation ACC: 87.70 | Time elapsed: 1.17 min
Epoch: 018/050 | Batch 000/461 | Cost: 0.7209 | Batch 150/461 | Cost: 0.6981 | Batch 300/461 | Cost: 0.6810 | Batch 450/461 | Cost: 0.6180 | Train ACC: 82.55 | Validation ACC: 87.50 | Time elapsed: 1.24 min
Epoch: 019/050 | Batch 000/461 | Cost: 0.7285 | Batch 150/461 | Cost: 0.7734 | Batch 300/461 | Cost: 0.7189 | Batch 450/461 | Cost: 0.5652 | Train ACC: 82.46 | Validation ACC: 87.30 | Time elapsed: 1.31 min
Epoch: 020/050 | Batch 000/461 | Cost: 0.7076 | Batch 150/461 | Cost: 0.4096 | Batch 300/461 | Cost: 0.7485 | Batch 450/461 | Cost: 0.7334 | Train ACC: 82.48 | Validation ACC: 87.30 | Time elapsed: 1.38 min
Epoch: 021/050 | Batch 000/461 | Cost: 0.4686 | Batch 150/461 | Cost: 0.6241 | Batch 300/461 | Cost: 0.5736 | Batch 450/461 | Cost: 0.4948 | Train ACC: 82.67 | Validation ACC: 88.00 | Time elapsed: 1.45 min
Epoch: 022/050 | Batch 000/461 | Cost: 0.4657 | Batch 150/461 | Cost: 0.6718 | Batch 300/461 | Cost: 0.6647 | Batch 450/461 | Cost: 0.4913 | Train ACC: 82.87 | Validation ACC: 87.90 | Time elapsed: 1.52 min
Epoch: 023/050 | Batch 000/461 | Cost: 0.5567 | Batch 150/461 | Cost: 0.4976 | Batch 300/461 | Cost: 0.5911 | Batch 450/461 | Cost: 0.4014 | Train ACC: 82.91 | Validation ACC: 87.80 | Time elapsed: 1.59 min
Epoch: 024/050 | Batch 000/461 | Cost: 0.5728 | Batch 150/461 | Cost: 0.6313 | Batch 300/461 | Cost: 0.5825 | Batch 450/461 | Cost: 0.4720 | Train ACC: 83.00 | Validation ACC: 87.90 | Time elapsed: 1.66 min
Epoch: 025/050 | Batch 000/461 | Cost: 0.5128 | Batch 150/461 | Cost: 0.4793 | Batch 300/461 | Cost: 0.7191 | Batch 450/461 | Cost: 0.5402 | Train ACC: 83.12 | Validation ACC: 88.30 | Time elapsed: 1.72 min
Epoch: 026/050 | Batch 000/461 | Cost: 0.4961 | Batch 150/461 | Cost: 0.4546 | Batch 300/461 | Cost: 0.5333 | Batch 450/461 | Cost: 0.5073 | Train ACC: 82.98 | Validation ACC: 87.90 | Time elapsed: 1.79 min
Epoch: 027/050 | Batch 000/461 | Cost: 0.7034 | Batch 150/461 | Cost: 0.5373 | Batch 300/461 | Cost: 0.5158 | Batch 450/461 | Cost: 0.5705 | Train ACC: 83.15 | Validation ACC: 88.00 | Time elapsed: 1.86 min
Epoch: 028/050 | Batch 000/461 | Cost: 0.4614 | Batch 150/461 | Cost: 0.4124 | Batch 300/461 | Cost: 0.7368 | Batch 450/461 | Cost: 0.5744 | Train ACC: 82.85 | Validation ACC: 87.60 | Time elapsed: 1.93 min
Epoch: 029/050 | Batch 000/461 | Cost: 0.5026 | Batch 150/461 | Cost: 0.6048 | Batch 300/461 | Cost: 0.6400 | Batch 450/461 | Cost: 0.4906 | Train ACC: 83.26 | Validation ACC: 88.10 | Time elapsed: 2.00 min
Epoch: 030/050 | Batch 000/461 | Cost: 0.6298 | Batch 150/461 | Cost: 0.5472 | Batch 300/461 | Cost: 0.5469 | Batch 450/461 | Cost: 0.4819 | Train ACC: 83.30 | Validation ACC: 88.70 | Time elapsed: 2.07 min
Epoch: 031/050 | Batch 000/461 | Cost: 0.6101 | Batch 150/461 | Cost: 0.5150 | Batch 300/461 | Cost: 0.5505 | Batch 450/461 | Cost: 0.5634 | Train ACC: 83.28 | Validation ACC: 88.60 | Time elapsed: 2.13 min
Epoch: 032/050 | Batch 000/461 | Cost: 0.5655 | Batch 150/461 | Cost: 0.6567 | Batch 300/461 | Cost: 0.5758 | Batch 450/461 | Cost: 0.5306 | Train ACC: 83.31 | Validation ACC: 88.20 | Time elapsed: 2.20 min
Epoch: 033/050 | Batch 000/461 | Cost: 0.6677 | Batch 150/461 | Cost: 0.7450 | Batch 300/461 | Cost: 0.5538 | Batch 450/461 | Cost: 0.5642 | Train ACC: 83.33 | Validation ACC: 88.40 | Time elapsed: 2.27 min
Epoch: 034/050 | Batch 000/461 | Cost: 0.6287 | Batch 150/461 | Cost: 0.4752 | Batch 300/461 | Cost: 0.5957 | Batch 450/461 | Cost: 0.4531 | Train ACC: 83.50 | Validation ACC: 88.70 | Time elapsed: 2.34 min
Epoch: 035/050 | Batch 000/461 | Cost: 0.5368 | Batch 150/461 | Cost: 0.5658 | Batch 300/461 | Cost: 0.6598 | Batch 450/461 | Cost: 0.5858 | Train ACC: 83.59 | Validation ACC: 88.50 | Time elapsed: 2.41 min
Epoch: 036/050 | Batch 000/461 | Cost: 0.5557 | Batch 150/461 | Cost: 0.4680 | Batch 300/461 | Cost: 0.4905 | Batch 450/461 | Cost: 0.9074 | Train ACC: 83.67 | Validation ACC: 88.50 | Time elapsed: 2.48 min
Epoch: 037/050 | Batch 000/461 | Cost: 0.6120 | Batch 150/461 | Cost: 0.4668 | Batch 300/461 | Cost: 0.5836 | Batch 450/461 | Cost: 0.4536 | Train ACC: 83.35 | Validation ACC: 88.80 | Time elapsed: 2.55 min
Epoch: 038/050 | Batch 000/461 | Cost: 0.5380 | Batch 150/461 | Cost: 0.4491 | Batch 300/461 | Cost: 0.4500 | Batch 450/461 | Cost: 0.6041 | Train ACC: 83.69 | Validation ACC: 88.80 | Time elapsed: 2.61 min
Epoch: 039/050 | Batch 000/461 | Cost: 0.4863 | Batch 150/461 | Cost: 0.5673 | Batch 300/461 | Cost: 0.4037 | Batch 450/461 | Cost: 0.6392 | Train ACC: 83.71 | Validation ACC: 88.70 | Time elapsed: 2.68 min
Epoch: 040/050 | Batch 000/461 | Cost: 0.6707 | Batch 150/461 | Cost: 0.5601 | Batch 300/461 | Cost: 0.5265 | Batch 450/461 | Cost: 0.4867 | Train ACC: 83.76 | Validation ACC: 88.90 | Time elapsed: 2.75 min
Epoch: 041/050 | Batch 000/461 | Cost: 0.5379 | Batch 150/461 | Cost: 0.4588 | Batch 300/461 | Cost: 0.5684 | Batch 450/461 | Cost: 0.5547 | Train ACC: 83.75 | Validation ACC: 88.60 | Time elapsed: 2.82 min
Epoch: 042/050 | Batch 000/461 | Cost: 0.5714 | Batch 150/461 | Cost: 0.3863 | Batch 300/461 | Cost: 0.5142 | Batch 450/461 | Cost: 0.6219 | Train ACC: 83.79 | Validation ACC: 89.20 | Time elapsed: 2.89 min
Epoch: 043/050 | Batch 000/461 | Cost: 0.5385 | Batch 150/461 | Cost: 0.4801 | Batch 300/461 | Cost: 0.6064 | Batch 450/461 | Cost: 0.4959 | Train ACC: 83.89 | Validation ACC: 88.80 | Time elapsed: 2.96 min
Epoch: 044/050 | Batch 000/461 | Cost: 0.6742 | Batch 150/461 | Cost: 0.5746 | Batch 300/461 | Cost: 0.6846 | Batch 450/461 | Cost: 0.6283 | Train ACC: 83.91 | Validation ACC: 89.00 | Time elapsed: 3.03 min
Epoch: 045/050 | Batch 000/461 | Cost: 0.5646 | Batch 150/461 | Cost: 0.3776 | Batch 300/461 | Cost: 0.5457 | Batch 450/461 | Cost: 0.4897 | Train ACC: 83.87 | Validation ACC: 89.10 | Time elapsed: 3.10 min
Epoch: 046/050 | Batch 000/461 | Cost: 0.5300 | Batch 150/461 | Cost: 0.6787 | Batch 300/461 | Cost: 0.4310 | Batch 450/461 | Cost: 0.5758 | Train ACC: 84.01 | Validation ACC: 89.10 | Time elapsed: 3.17 min
Epoch: 047/050 | Batch 000/461 | Cost: 0.6111 | Batch 150/461 | Cost: 0.5679 | Batch 300/461 | Cost: 0.6306 | Batch 450/461 | Cost: 0.7292 | Train ACC: 84.03 | Validation ACC: 89.20 | Time elapsed: 3.24 min
Epoch: 048/050 | Batch 000/461 | Cost: 0.5925 | Batch 150/461 | Cost: 0.6623 | Batch 300/461 | Cost: 0.4188 | Batch 450/461 | Cost: 0.3433 | Train ACC: 83.89 | Validation ACC: 89.10 | Time elapsed: 3.31 min
Epoch: 049/050 | Batch 000/461 | Cost: 0.4881 | Batch 150/461 | Cost: 0.5040 | Batch 300/461 | Cost: 0.5655 | Batch 450/461 | Cost: 0.5264 | Train ACC: 83.83 | Validation ACC: 88.60 | Time elapsed: 3.38 min
Epoch: 050/050 | Batch 000/461 | Cost: 0.5284 | Batch 150/461 | Cost: 0.6253 | Batch 300/461 | Cost: 0.3891 | Batch 450/461 | Cost: 0.4316 | Train ACC: 83.90 | Validation ACC: 88.70 | Time elapsed: 3.45 min
Total Training Time: 3.45 min
plt.plot(cost_list, label='Minibatch cost')
plt.plot(np.convolve(cost_list,
                     np.ones(200,)/200, mode='valid'),
         label='Running average')
plt.ylabel('Cross Entropy')
plt.xlabel('Iteration')
plt.legend()
plt.show()
plt.plot(np.arange(1, NUM_EPOCHS+1), train_acc_list, label='Training')
plt.plot(np.arange(1, NUM_EPOCHS+1), valid_acc_list, label='Validation')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
with torch.set_grad_enabled(False):
    test_acc = compute_acc(model=model,
                           data_loader=test_loader,
                           device=DEVICE)
    valid_acc = compute_acc(model=model,
                            data_loader=valid_loader,
                            device=DEVICE)
print(f'Validation ACC: {valid_acc:.2f}%')
print(f'Test ACC: {test_acc:.2f}%')
Validation ACC: 88.70%
Test ACC: 84.55%
%watermark -iv
numpy 1.16.4
torch 1.2.0
matplotlib 3.1.0
torchvision 0.4.0a0+6b959ee