Residual Networks (ResNet)

In :
import d2l
from mxnet import gluon, np, npx
from mxnet.gluon import nn
npx.set_np()

Residual block

In :
class Residual(nn.Block):
    def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):
        super(Residual, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1,
                               strides=strides)
        self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            # 1x1 convolution to match the shortcut's shape to the output
            self.conv3 = nn.Conv2D(num_channels, kernel_size=1,
                                   strides=strides)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm()
        self.bn2 = nn.BatchNorm()

    def forward(self, X):
        Y = npx.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        return npx.relu(Y + X)
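The key idea is the addition `Y + X` before the final activation: the convolutional branch only needs to learn the *residual* relative to the input. A minimal NumPy sketch of that connection (no convolutions or batch norm, `f` standing in for the learned branch):

```python
import numpy as np

def residual_forward(X, f):
    # a residual block computes relu(f(X) + X), where f is the
    # learned branch (here: any shape-preserving function)
    return np.maximum(f(X) + X, 0)

X = np.array([[1.0, -2.0], [3.0, 4.0]])
# if the branch f outputs zeros, the block reduces to relu(X),
# i.e. it can represent a (rectified) identity mapping cheaply
out = residual_forward(X, lambda x: np.zeros_like(x))
```

This is why deep ResNets are easy to optimize: driving the branch's weights toward zero recovers an identity-like mapping instead of degrading the signal.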

First, look at a situation where the input and output are of the same shape:

In :
blk = Residual(3)
blk.initialize()
X = np.random.uniform(size=(4, 3, 6, 6))
blk(X).shape
Out:
(4, 3, 6, 6)

Halve the output height and width while increasing the number of output channels:

In :
blk = Residual(6, use_1x1conv=True, strides=2)
blk.initialize()
blk(X).shape
Out:
(4, 6, 3, 3)
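The halving from 6×6 to 3×3 follows from the standard convolution output-size formula, which a short sketch makes explicit (pure Python, no framework needed):

```python
def conv_out_size(n, kernel, padding, stride):
    # standard convolution output-size formula:
    # floor((n + 2*padding - kernel) / stride) + 1
    return (n + 2 * padding - kernel) // stride + 1

# the 3x3 conv (padding 1, stride 2) applied to the 6x6 input above:
print(conv_out_size(6, 3, 1, 2))  # 3
# the 1x1 shortcut conv (padding 0, stride 2) produces the same size,
# so the two paths can still be added elementwise:
print(conv_out_size(6, 1, 0, 2))  # 3
```

Both paths must agree in all dimensions for `Y + X` to work, which is exactly why `use_1x1conv=True` is required whenever `strides > 1` or the channel count changes.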

The ResNet block

In :
def resnet_block(num_channels, num_residuals, first_block=False):
    blk = nn.Sequential()
    for i in range(num_residuals):
        if i == 0 and not first_block:
            # first residual of each later stage halves height/width
            # and changes the channel count, so it needs the 1x1 shortcut
            blk.add(Residual(num_channels, use_1x1conv=True, strides=2))
        else:
            blk.add(Residual(num_channels))
    return blk
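Stacking such stages halves the spatial resolution while doubling the channel count at each step. A hedged sketch of that shape bookkeeping in plain Python, assuming the standard ResNet-18 channel progression `[64, 128, 256, 512]` and an ImageNet-style input whose stem produces 56×56 feature maps (both are assumptions, not taken from the code above):

```python
def stage_shapes(h, channels):
    # each stage after the first halves height/width via its
    # stride-2 first block; ceil-division matches a stride-2 shortcut
    shapes = []
    for i, c in enumerate(channels):
        if i > 0:
            h = (h - 1) // 2 + 1
        shapes.append((c, h, h))
    return shapes

# hypothetical 56x56 stem output, standard channel doubling:
print(stage_shapes(56, [64, 128, 256, 512]))
```

The per-stage trade of resolution for channels keeps the computational cost of each stage roughly balanced, a design choice ResNet inherits from VGG.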

The model

In :
net = nn.Sequential()
net.add(nn.Conv2D(64, kernel_size=7, strides=2, padding=3),
        nn.BatchNorm(), nn.Activation('relu'),
        nn.MaxPool2D(pool_size=3, strides=2, padding=1))