Deep Learning for Coders with fastai & PyTorch - Image Classification. In this notebook I followed both Jeremy Howard's lesson on fast.ai and the Weights & Biases reading group videos. Lots of notes are added, the order of some cells is changed, and some cells are added to make the topic more understandable for me (check the manual calculation of log_softmax + nll_loss). Click the "Open in Colab" button at the right side to view this as a notebook.
I'm a Doctor Who fan and this is my Cyberman coffee cup; as I remember, I got it from the Manchester Science Museum.
#!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()
%config Completer.use_jedi = False
from fastbook import *
from fastai.vision.all import *
path = untar_data(URLs.PETS)
Note: With untar_data we download the data. This data originally comes from the Oxford University Visual Geometry Group, and our dataset is here:
path
Path('/home/niyazi/.fastai/data/oxford-iiit-pet')
Note: This is the local download path for my computer.
Path.BASE_PATH = path
Tip: This is a trick to get the relative path; check the path output above and below.
path
Path('.')
Now the path looks different.
path.ls()
(#2) [Path('annotations'),Path('images')]
Note: #2 is the number of items in the list. annotations contains the target variables of this dataset, but we won't use them this time; instead we create our own labels.
(path/"images").ls()
(#7393) [Path('images/staffordshire_bull_terrier_90.jpg'),Path('images/Russian_Blue_70.jpg'),Path('images/japanese_chin_69.jpg'),Path('images/Maine_Coon_266.jpg'),Path('images/japanese_chin_200.jpg'),Path('images/Siamese_57.jpg'),Path('images/Persian_175.jpg'),Path('images/havanese_81.jpg'),Path('images/Birman_72.jpg'),Path('images/leonberger_55.jpg')...]
fname = (path/"images").ls()[0]
fname
Path('images/staffordshire_bull_terrier_90.jpg')
Note: The first image in the images folder listing.
re.findall(r'(.+)_\d+.jpg$', fname.name)
['staffordshire_bull_terrier']
Note: Since we don't use the annotations in this dataset, we need a way to get the breed from the filename. This is the regex findall method; check the geeksforgeeks.org tutorial here.
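To make the pattern concrete, here is a quick sketch of what it captures (filenames taken from the listing above; plain re, no fastai needed):
import re
# (.+)   -> the breed name (greedy, so it keeps internal underscores)
# _\d+   -> the image index after the last underscore
# .jpg$  -> the extension, anchored to the end of the name
for name in ['staffordshire_bull_terrier_90.jpg', 'Maine_Coon_266.jpg']:
    print(re.findall(r'(.+)_\d+.jpg$', name))
# ['staffordshire_bull_terrier']
# ['Maine_Coon']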
pets = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
item_tfms=Resize(460),
batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py:1023: UserWarning: torch.solve is deprecated in favor of torch.linalg.solveand will be removed in a future PyTorch release. torch.linalg.solve has its arguments reversed and does not return the LU factorization. To get the LU factorization see torch.lu, which can be used with torch.lu_solve or torch.lu_unpack. X = torch.solve(B, A).solution should be replaced with X = torch.linalg.solve(A, B) (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/BatchLinearAlgebra.cpp:760.) ret = func(*args, **kwargs)
Note: Now we find all names with RegexLabeller. The item_tfms and batch_tfms arguments may look a bit meaningless; check below to find out why.
As a summary, fastai gives us a chance to augment our images in a smarter way (presizing), which preserves much more detail and information for training. First we resize the images to a large size on the CPU with item_tfms, then push them to the GPU and apply the augmentations there in batches with batch_tfms.
#id interpolations
#caption A comparison of fastai's data augmentation strategy (left) and the traditional approach (right).
dblock1 = DataBlock(blocks=(ImageBlock(), CategoryBlock()),
                    get_y=parent_label,
                    item_tfms=Resize(460))
# Place an image at 'images/chapter-05/grizzly.jpg' relative to this notebook before running this
dls1 = dblock1.dataloaders([(Path.cwd()/'images'/'chapter-05'/'grizzly.jpg')]*100, bs=8)
dls1.train.get_idxs = lambda: Inf.ones
x,y = dls1.valid.one_batch()
_,axs = subplots(1, 2)

# Traditional approach (right): apply each transform separately, interpolating at every step
x1 = TensorImage(x.clone())
x1 = x1.affine_coord(sz=224)
x1 = x1.rotate(draw=30, p=1.)
x1 = x1.zoom(draw=1.2, p=1.)
x1 = x1.warp(draw_x=-0.2, draw_y=0.2, p=1.)

# fastai's presizing (left): compose all the transforms and interpolate only once
tfms = setup_aug_tfms([Rotate(draw=30, p=1, size=224), Zoom(draw=1.2, p=1., size=224),
                       Warp(draw_x=-0.2, draw_y=0.2, p=1., size=224)])
x = Pipeline(tfms)(x)
#x.affine_coord(coord_tfm=coord_tfm, sz=size, mode=mode, pad_mode=pad_mode)
TensorImage(x[0]).show(ctx=axs[0])
TensorImage(x1[0]).show(ctx=axs[1]);
dls.show_batch(nrows=3, ncols=3)
pets1 = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
pets1.summary(path/"images")
Setting-up type transforms pipelines Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images Found 7390 items 2 datasets of sizes 5912,1478 Setting up Pipeline: PILBase.create Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} Building one sample Pipeline: PILBase.create starting from /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg applying PILBase.create gives PILImage mode=RGB size=500x333 Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} starting from /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg applying partial gives British_Shorthair applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives TensorCategory(4) Final sample: (PILImage mode=RGB size=500x333, TensorCategory(4)) Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images Found 7390 items 2 datasets of sizes 5912,1478 Setting up Pipeline: PILBase.create Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} Setting up after_item: Pipeline: ToTensor Setting up before_batch: Pipeline: Setting up after_batch: Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} Building one batch Applying item_tfms to the first sample: Pipeline: ToTensor starting from (PILImage mode=RGB size=500x333, TensorCategory(4)) applying ToTensor gives (TensorImage of size 3x333x500, TensorCategory(4)) Adding the next 3 samples No before_batch transform to apply Collating items in a batch Error! It's not possible to collate your items in a batch Could not collate the 0-th members of your tuples because got the following shapes torch.Size([3, 333, 500]),torch.Size([3, 500, 396]),torch.Size([3, 375, 500]),torch.Size([3, 500, 281])
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-15-ead0dd2a047d> in <module> 3 splitter=RandomSplitter(seed=42), 4 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name')) ----> 5 pets1.summary(path/"images") ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs) 188 why = _find_fail_collate(s) 189 print("Make sure all parts of your samples are tensors of the same size" if why is None else why) --> 190 raise e 191 192 if len([f for f in dls.train.after_batch.fs if f.name != 'noop'])!=0: ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs) 182 print("\nCollating items in a batch") 183 try: --> 184 b = dls.train.create_batch(s) 185 b = retain_types(b, s[0] if is_listy(s) else s) 186 except Exception as e: ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in create_batch(self, b) 141 elif s is None: return next(self.it) 142 else: raise IndexError("Cannot index an iterable dataset numerically - must use `None`.") --> 143 def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b) 144 def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b) 145 def to(self, device): self.device = device ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t) 48 b = t[0] 49 return (default_collate(t) if isinstance(b, _collate_types) ---> 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) 52 ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in <listcomp>(.0) 48 b = t[0] 49 return (default_collate(t) if isinstance(b, _collate_types) ---> 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) 52 ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t) 47 "A replacement for PyTorch `default_collate` which maintains types and handles `Sequence`s" 48 b = t[0] ---> 49 return (default_collate(t) if isinstance(b, _collate_types) 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch) 54 storage = elem.storage()._new_shared(numel) 55 out = elem.new(storage) ---> 56 return torch.stack(batch, 0, out=out) 57 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \ 58 and elem_type.__name__ != 'string_': ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/torch_core.py in __torch_function__(self, func, types, args, kwargs) 338 convert=False 339 if _torch_handled(args, self._opt, func): convert,types = type(self),(torch.Tensor,) --> 340 res = super().__torch_function__(func, types, args=args, kwargs=kwargs) 341 if convert: res = convert(res) 342 if isinstance(res, TensorBase): res.set_meta(self, as_copy=True) ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py in __torch_function__(cls, func, types, args, kwargs) 1021 1022 with _C.DisableTorchFunction(): -> 1023 ret = func(*args, **kwargs) 1024 return _convert(ret, cls) 1025 RuntimeError: stack expects each tensor to be equal size, but got [3, 333, 500] at entry 0 and [3, 500, 396] at entry 1
Note: It is always good to get a quick summary with pets1.summary(path/"images"). Check the summary above; it has lots of details. It is natural to get an error in this example, because we are trying to put different-sized images into the same batch without resizing them first (this DataBlock has no item_tfms).
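A minimal sketch of the fix (reusing the same path as above): adding an item transform that makes every image the same size lets the batch collate.
pets2 = DataBlock(blocks=(ImageBlock, CategoryBlock),
                  get_items=get_image_files,
                  splitter=RandomSplitter(seed=42),
                  get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                  item_tfms=Resize(460))  # every item becomes 460x460, so collation succeeds
pets2.summary(path/"images")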
For every project, just start with a baseline. The baseline is a good point to think about the project/domain/problem; from there you can start improving and running experiments on the architecture, hyperparameters, etc.
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/c10/core/TensorImpl.h:1156.) return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.513288 | 0.355303 | 0.110284 | 00:22 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.518711 | 0.313168 | 0.106225 | 00:27 |
1 | 0.325613 | 0.261644 | 0.089310 | 00:27 |
Note: A basic run like this is a helpful baseline to begin with.
learn.loss_func
FlattenedLoss of CrossEntropyLoss()
learn.lr
0.001
Tip: It is very easy to see the default arguments for the learner. Above are the loss function loss_func and the learning rate lr.
first(dls.train)
(TensorImage([[[[ 7.7591e-02, -1.3409e-01, 1.4352e-01, ..., -8.8188e-01, -8.0163e-01, -1.4735e-01], [ 1.9115e-03, 4.8835e-01, 4.3845e-01, ..., -1.3028e+00, -1.4314e+00, -1.2478e+00], [-1.2349e-01, 7.3246e-02, -9.2777e-02, ..., -7.9699e-01, -1.1984e+00, -9.0709e-02], ..., [-1.4486e+00, -9.5970e-01, 8.6840e-02, ..., -1.1097e+00, -3.3829e-01, 8.2527e-02], [-1.4246e+00, -8.2784e-01, 8.7511e-02, ..., -9.5360e-01, -1.0563e-01, -5.1489e-01], [-1.3575e+00, -7.6923e-01, 1.0015e-01, ..., -1.0628e+00, 4.3092e-02, -6.2399e-01]], [[ 2.5566e-01, 7.5052e-02, 2.0962e-01, ..., -9.7342e-01, -8.9785e-01, -1.5707e-01], [ 8.3578e-02, 6.1146e-01, 5.1947e-01, ..., -1.3980e+00, -1.5514e+00, -1.3726e+00], [-1.2059e-02, 1.2505e-01, -2.9267e-03, ..., -9.0869e-01, -1.3052e+00, -2.3089e-01], ..., [-1.4979e+00, -1.1395e+00, -2.8139e-01, ..., -1.3591e+00, -4.8733e-01, -2.1415e-01], [-1.4548e+00, -9.8541e-01, -2.7210e-01, ..., -1.1278e+00, -3.0796e-01, -8.4852e-01], [-1.3689e+00, -9.2548e-01, -2.6808e-01, ..., -1.2366e+00, -6.3006e-02, -1.0183e+00]], [[-1.1168e+00, -1.2721e+00, -1.0968e+00, ..., -1.1363e+00, -9.8121e-01, -3.4084e-01], [-1.0031e+00, -6.8494e-01, -8.5066e-01, ..., -1.5088e+00, -1.6080e+00, -1.4639e+00], [-1.1476e+00, -1.0927e+00, -1.3264e+00, ..., -1.0406e+00, -1.3088e+00, -3.4494e-01], ..., [-1.4021e+00, -9.7390e-01, -4.7906e-01, ..., -1.4878e+00, -5.0896e-01, -3.1871e-01], [-1.3213e+00, -8.4023e-01, -5.3294e-01, ..., -1.3262e+00, -5.3787e-01, -1.0765e+00], [-1.1781e+00, -8.0876e-01, -5.8936e-01, ..., -1.3399e+00, -4.2362e-01, -1.1124e+00]]], [[[ 1.9623e+00, 2.0361e+00, 1.9064e+00, ..., 2.2392e+00, 2.2249e+00, 2.2211e+00], [ 2.0734e+00, 2.0294e+00, 2.1349e+00, ..., 2.2461e+00, 2.2249e+00, 2.2376e+00], [ 2.0202e+00, 1.9569e+00, 1.8405e+00, ..., 2.2373e+00, 2.2353e+00, 2.2223e+00], ..., [ 3.5436e-01, 2.5449e-01, 5.5067e-01, ..., 1.0332e+00, 1.0161e+00, 9.8812e-01], [ 3.5005e-01, 1.6332e-01, 3.8754e-01, ..., 9.7724e-01, 9.6458e-01, 1.0630e+00], [ 3.4791e-01, 8.2361e-02, 2.2118e-01, ..., 8.6120e-01, 1.0850e+00, 1.1228e+00]], [[ 2.1661e+00, 2.2408e+00, 2.0968e+00, ..., 1.6825e+00, 1.6601e+00, 1.6328e+00], [ 2.2785e+00, 2.2232e+00, 2.3262e+00, ..., 1.6905e+00, 1.6537e+00, 1.6441e+00], [ 2.2230e+00, 2.1458e+00, 2.0269e+00, ..., 1.6834e+00, 1.6742e+00, 1.6310e+00], ..., [ 8.0271e-01, 7.3445e-01, 1.0265e+00, ..., 8.7042e-01, 8.4557e-01, 8.8788e-01], [ 8.1180e-01, 6.5414e-01, 8.8815e-01, ..., 8.1193e-01, 7.7473e-01, 9.1055e-01], [ 8.1687e-01, 5.4252e-01, 6.8486e-01, ..., 6.5186e-01, 9.1647e-01, 9.5377e-01]], [[ 2.3636e+00, 2.4168e+00, 2.2347e+00, ..., 1.7282e+00, 1.6718e+00, 1.6554e+00], [ 2.4760e+00, 2.3875e+00, 2.4613e+00, ..., 1.7363e+00, 1.6699e+00, 1.6726e+00], [ 2.4182e+00, 2.2941e+00, 2.1473e+00, ..., 1.7294e+00, 1.6948e+00, 1.6592e+00], ..., [ 1.4156e+00, 1.3690e+00, 1.6562e+00, ..., 1.2044e+00, 1.2088e+00, 1.2487e+00], [ 1.4260e+00, 1.3050e+00, 1.5455e+00, ..., 1.0673e+00, 1.0365e+00, 1.1483e+00], [ 1.4369e+00, 1.2059e+00, 1.3302e+00, ..., 7.4460e-01, 9.8735e-01, 9.8728e-01]]], [[[ 7.9667e-01, 6.5725e-01, 6.7499e-01, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], [ 1.6647e+00, 1.8548e+00, 4.2411e-01, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], [ 2.0417e+00, 2.1499e+00, 1.9243e+00, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], ..., [-7.8885e-02, -8.4444e-02, -1.8854e-01, ..., -3.3191e-02, 1.6326e-01, -2.5189e-02], [-3.9591e-02, -3.7761e-02, -3.5708e-02, ..., 4.1777e-01, 3.0722e-01, -8.4517e-02], [-5.5125e-01, -3.7390e-01, -3.7190e-01, ..., 1.6706e-01, -3.8756e-02, -3.0213e-01]], [[ 3.3600e-01, 1.2690e-01, 8.4595e-02, 
..., 2.4286e+00, 2.4286e+00, 2.4286e+00], [ 1.3936e+00, 1.5396e+00, -7.8121e-02, ..., 2.4286e+00, 2.4286e+00, 2.4286e+00], [ 1.7230e+00, 1.8204e+00, 1.5281e+00, ..., 2.4286e+00, 2.4286e+00, 2.4286e+00], ..., [-2.6621e-01, -3.4865e-01, -5.4389e-01, ..., 1.5566e-02, 3.6483e-01, 3.7018e-01], [-2.3416e-01, -2.9848e-01, -3.8383e-01, ..., 4.3211e-01, 5.4771e-01, 3.7147e-01], [-7.7599e-01, -6.7812e-01, -7.3404e-01, ..., 2.9308e-01, 2.0118e-01, 3.7493e-02]], [[-6.9486e-02, -3.3152e-01, -5.6258e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], [ 9.0693e-01, 9.7337e-01, -5.6124e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], [ 1.2463e+00, 1.1590e+00, 8.0907e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], ..., [-3.1419e-01, -2.4941e-01, -4.5623e-01, ..., -6.5955e-01, -6.0038e-01, -8.8913e-01], [-2.6903e-01, -2.5050e-01, -3.9344e-01, ..., -3.7691e-01, -6.0662e-01, -9.9883e-01], [-6.3179e-01, -4.3123e-01, -5.1774e-01, ..., -7.1518e-01, -8.3215e-01, -9.5885e-01]]], ..., [[[ 2.6701e-03, 4.8764e-02, 1.3802e-01, ..., -3.5556e-01, -2.1186e-01, -6.3790e-02], [ 2.7203e-01, 2.9067e-01, 3.0956e-01, ..., -9.4003e-02, -5.8179e-02, -7.6002e-02], [ 3.5114e-01, 3.3277e-01, 3.2004e-01, ..., 2.0249e-02, -2.6842e-02, -4.4070e-02], ..., [ 1.9681e+00, 2.0169e+00, 2.0680e+00, ..., -2.0286e-01, 1.0193e-01, 3.1608e-01], [ 1.9411e+00, 2.0085e+00, 2.1026e+00, ..., -1.3970e-01, 1.2286e-01, 3.5735e-01], [ 1.8141e+00, 1.8327e+00, 1.9489e+00, ..., -1.0404e-01, 1.8111e-01, 3.2454e-01]], [[ 2.0398e-01, 2.9756e-01, 3.7903e-01, ..., -1.5909e-02, 4.7189e-02, 1.5181e-01], [ 5.4995e-01, 5.8114e-01, 6.0668e-01, ..., 2.2921e-02, 2.9592e-02, 1.2454e-01], [ 6.2037e-01, 6.1136e-01, 6.1487e-01, ..., 1.3991e-01, 7.2302e-02, 1.2691e-01], ..., [ 2.1161e+00, 2.1586e+00, 2.1592e+00, ..., 1.0077e-01, 4.3020e-01, 5.8235e-01], [ 2.0806e+00, 2.1535e+00, 2.2194e+00, ..., 1.1844e-01, 4.4620e-01, 5.9031e-01], [ 1.9537e+00, 1.9817e+00, 2.1045e+00, ..., 1.3738e-01, 4.2917e-01, 6.0165e-01]], [[-3.0177e-02, -3.0919e-02, 5.6294e-02, ..., 1.2119e-02, 2.9192e-01, 4.9523e-01], [ 1.4675e-01, 1.8120e-01, 2.0599e-01, ..., 6.5189e-02, 2.1124e-01, 4.7340e-01], [ 2.2902e-01, 2.3191e-01, 2.1012e-01, ..., 1.2057e-01, 1.4622e-01, 3.3338e-01], ..., [ 2.3455e+00, 2.3984e+00, 2.4285e+00, ..., -2.0567e-01, 8.8979e-02, 1.8777e-01], [ 2.3240e+00, 2.3670e+00, 2.4654e+00, ..., -1.8698e-01, 1.2802e-01, 2.0268e-01], [ 2.2811e+00, 2.2660e+00, 2.3926e+00, ..., -1.4246e-01, 1.2407e-01, 2.1404e-01]]], [[[ 2.1948e+00, 2.1682e+00, 2.1729e+00, ..., -8.0144e-02, -1.7157e-01, -2.2714e-01], [ 2.1642e+00, 2.1482e+00, 2.1615e+00, ..., -2.8867e-01, -3.3114e-01, -4.2198e-01], [ 2.1637e+00, 2.1500e+00, 2.1554e+00, ..., -5.3515e-01, -4.0755e-01, -3.9795e-01], ..., [ 1.0268e+00, 1.0389e+00, 1.0086e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00], [ 1.0284e+00, 1.0010e+00, 1.0453e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00], [ 1.0163e+00, 1.0296e+00, 1.0190e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00]], [[ 2.3155e+00, 2.2723e+00, 2.2853e+00, ..., 2.2712e-01, 2.1174e-01, 1.6661e-01], [ 2.2455e+00, 2.2432e+00, 2.2569e+00, ..., 7.0896e-02, 4.8820e-02, -4.3276e-02], [ 2.2668e+00, 2.2566e+00, 2.2656e+00, ..., -1.1109e-01, -5.4757e-02, -8.3705e-02], ..., [ 1.1080e+00, 1.0955e+00, 1.0469e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00], [ 1.1009e+00, 1.0698e+00, 1.0872e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00], [ 1.0910e+00, 1.1021e+00, 1.0618e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00]], [[ 2.5272e+00, 2.4830e+00, 2.4523e+00, ..., 8.5871e-01, 8.1094e-01, 7.8095e-01], [ 2.4560e+00, 2.4281e+00, 2.4305e+00, 
..., 5.5803e-01, 4.9335e-01, 3.9940e-01], [ 2.4629e+00, 2.4525e+00, 2.4602e+00, ..., 1.7717e-01, 2.3084e-01, 2.2868e-01], ..., [ 1.1968e+00, 1.1985e+00, 1.1224e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00], [ 1.2228e+00, 1.1839e+00, 1.1597e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00], [ 1.2212e+00, 1.2520e+00, 1.1560e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00]]], [[[ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2312e+00, 2.2403e+00, 2.2489e+00], [ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2320e+00, 2.2485e+00, 2.2489e+00], [ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2291e+00, 2.2471e+00, 2.2489e+00], ..., [-1.7937e+00, -1.9148e+00, -1.9569e+00, ..., 7.9287e-01, 6.7453e-01, 7.8103e-01], [-1.6935e+00, -1.8518e+00, -1.8703e+00, ..., 4.8874e-01, 2.1611e-01, 1.1217e-01], [-1.6270e+00, -1.8958e+00, -1.8929e+00, ..., 8.0896e-01, 8.9964e-01, 1.0060e+00]], [[ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4109e+00, 2.4200e+00, 2.4286e+00], [ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4113e+00, 2.4282e+00, 2.4286e+00], [ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4083e+00, 2.4268e+00, 2.4286e+00], ..., [-1.4270e+00, -1.6502e+00, -1.6874e+00, ..., 7.1551e-01, 4.9632e-01, 6.5896e-01], [-1.2436e+00, -1.4754e+00, -1.4859e+00, ..., 2.8340e-01, -9.5874e-02, -1.2210e-01], [-1.1076e+00, -1.4404e+00, -1.4846e+00, ..., 6.4517e-01, 7.3422e-01, 8.5824e-01]], [[ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6223e+00, 2.6305e+00, 2.6147e+00], [ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6228e+00, 2.6396e+00, 2.6398e+00], [ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6198e+00, 2.6382e+00, 2.6400e+00], ..., [-1.0548e+00, -1.1392e+00, -1.2386e+00, ..., 6.2691e-01, 3.3431e-01, 4.8703e-01], [-8.5461e-01, -8.5948e-01, -9.4681e-01, ..., 2.0497e-01, -1.5078e-01, -2.6161e-01], [-9.2614e-01, -8.7474e-01, -8.2363e-01, ..., 6.1918e-01, 7.6773e-01, 8.3740e-01]]]], device='cuda:0'), TensorCategory([25, 4, 27, 20, 12, 27, 31, 33, 14, 35, 16, 5, 22, 33, 3, 35, 3, 0, 32, 12, 1, 20, 18, 22, 15, 11, 13, 5, 35, 4, 22, 34, 15, 4, 3, 21, 5, 22, 27, 11, 15, 13, 14, 32, 13, 4, 7, 30, 9, 20, 7, 20, 9, 1, 6, 35, 23, 8, 14, 16, 18, 6, 2, 35], device='cuda:0'))
Note: The cell above and the one below do the same thing: grab one batch from the DataLoaders.
x,y = dls.one_batch()
dls.vocab
['Abyssinian', 'Bengal', 'Birman', 'Bombay', 'British_Shorthair', 'Egyptian_Mau', 'Maine_Coon', 'Persian', 'Ragdoll', 'Russian_Blue', 'Siamese', 'Sphynx', 'american_bulldog', 'american_pit_bull_terrier', 'basset_hound', 'beagle', 'boxer', 'chihuahua', 'english_cocker_spaniel', 'english_setter', 'german_shorthaired', 'great_pyrenees', 'havanese', 'japanese_chin', 'keeshond', 'leonberger', 'miniature_pinscher', 'newfoundland', 'pomeranian', 'pug', 'saint_bernard', 'samoyed', 'scottish_terrier', 'shiba_inu', 'staffordshire_bull_terrier', 'wheaten_terrier', 'yorkshire_terrier']
dls.vocab[0]
'Abyssinian'
Tip: vocab gives us all the labels as text.
y
TensorCategory([13, 35, 8, 36, 3, 10, 10, 14, 22, 1, 5, 5, 5, 0, 4, 7, 11, 33, 18, 25, 20, 3, 33, 0, 25, 15, 27, 9, 17, 25, 19, 26, 9, 0, 35, 5, 6, 1, 31, 14, 7, 9, 8, 27, 2, 7, 21, 13, 26, 17, 25, 30, 31, 5, 19, 17, 4, 12, 29, 8, 21, 33, 18, 9], device='cuda:0')
Note: Targets as category codes.
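To map these codes back to breed names, a quick sketch using the vocab above:
# Decode the first few target codes back to their text labels
[dls.vocab[int(i)] for i in y[:5]]
# ['american_pit_bull_terrier', 'wheaten_terrier', 'Ragdoll', 'yorkshire_terrier', 'Bombay']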
x
TensorImage([[[[-1.3790, -1.3778, -1.3984, ..., -0.1093, 0.1460, -0.0339], [-1.3243, -1.3580, -1.3804, ..., 0.0449, 0.0572, -0.0411], [-1.3337, -1.3652, -1.3996, ..., -0.0918, -0.1107, -0.0857], ..., [-0.4574, -0.3503, -0.3927, ..., -0.6010, -0.7011, -0.7119], [-0.3509, -0.1960, -0.2069, ..., -0.6884, -0.6634, -0.6341], [-0.3221, -0.3299, -0.3177, ..., -0.5625, -0.4453, -0.4082]], [[-1.6744, -1.6758, -1.6812, ..., -0.5800, -0.2989, -0.4613], [-1.5871, -1.6285, -1.6568, ..., -0.3977, -0.3745, -0.4682], [-1.5626, -1.6162, -1.6632, ..., -0.5169, -0.5306, -0.5112], ..., [-1.0813, -0.9612, -0.9992, ..., -1.1308, -1.2921, -1.3514], [-0.9441, -0.7857, -0.7948, ..., -1.2437, -1.2741, -1.3164], [-0.9350, -0.9371, -0.9192, ..., -1.1914, -1.1397, -1.1095]], [[-1.7511, -1.7434, -1.7629, ..., -0.6031, -0.3360, -0.5057], [-1.6791, -1.7414, -1.7652, ..., -0.4428, -0.4310, -0.5192], [-1.6424, -1.7149, -1.7630, ..., -0.5840, -0.6148, -0.5747], ..., [-1.6312, -1.4844, -1.5600, ..., -1.3854, -1.6552, -1.7876], [-1.4652, -1.2946, -1.3208, ..., -1.5283, -1.6561, -1.7149], [-1.4120, -1.4189, -1.4288, ..., -1.5720, -1.5691, -1.5458]]], [[[-1.1709, -1.0320, -0.3882, ..., -2.0315, -2.0706, -2.0406], [-0.7207, -1.2576, -0.8119, ..., -2.0559, -2.0684, -2.0728], [-0.2858, -0.7315, -1.1736, ..., -2.0433, -2.0766, -2.0962], ..., [-0.6050, -0.6222, -0.7002, ..., 0.0488, 0.0771, 0.0815], [-0.6429, -0.6763, -0.7053, ..., 0.1518, -0.0409, 0.1402], [-0.6518, -0.7125, -0.7378, ..., 0.1249, 0.0496, 0.1191]], [[-0.8777, -0.6010, 0.1436, ..., -1.6740, -1.8200, -1.7518], [-0.3933, -0.8562, -0.2790, ..., -1.7451, -1.8429, -1.8417], [ 0.0344, -0.4096, -0.7499, ..., -1.8033, -1.8966, -1.8918], ..., [-0.6012, -0.6491, -0.6763, ..., 0.2168, 0.2401, 0.2356], [-0.6647, -0.7301, -0.6992, ..., 0.3333, 0.1283, 0.3030], [-0.6832, -0.7672, -0.7454, ..., 0.2804, 0.2115, 0.2686]], [[-0.6769, -0.6395, 0.1047, ..., -1.7038, -1.7277, -1.7018], [-0.1480, -0.7290, -0.2868, ..., -1.7481, -1.7221, -1.7096], [ 0.2469, -0.1654, -0.6479, ..., -1.7517, -1.7447, -1.7343], ..., [-0.5935, -0.7393, -0.8030, ..., 0.4358, 0.4557, 0.3989], [-0.6801, -0.8271, -0.8223, ..., 0.5317, 0.3682, 0.4378], [-0.7225, -0.8686, -0.8416, ..., 0.4712, 0.3960, 0.3788]]], [[[ 0.4054, 0.4157, 0.4160, ..., 0.4003, 0.3259, 0.1861], [ 0.4487, 0.4643, 0.4645, ..., 0.3877, 0.2998, 0.1810], [ 0.4725, 0.4946, 0.4947, ..., 0.3752, 0.2639, 0.1651], ..., [-0.5970, -0.4684, -0.6175, ..., 1.5827, 1.6680, 1.6609], [-0.5992, -0.5694, -0.4757, ..., 1.2104, 1.3103, 1.3862], [-0.6427, -0.7463, -0.7177, ..., 0.8878, 0.8792, 1.0099]], [[ 0.9298, 0.9404, 0.9407, ..., 0.9246, 0.8562, 0.7873], [ 0.9741, 0.9901, 0.9902, ..., 0.9117, 0.8413, 0.7847], [ 0.9984, 1.0209, 1.0210, ..., 0.8998, 0.8208, 0.7683], ..., [-0.1210, 0.0165, -0.1429, ..., 1.7798, 1.8600, 1.8469], [-0.1235, -0.0915, 0.0087, ..., 1.4962, 1.5836, 1.6497], [-0.1703, -0.2819, -0.2510, ..., 1.2888, 1.2676, 1.3861]], [[ 1.4446, 1.4550, 1.4553, ..., 1.4395, 1.4119, 1.4488], [ 1.4881, 1.5037, 1.5038, ..., 1.4270, 1.4177, 1.4495], [ 1.5118, 1.5338, 1.5339, ..., 1.4189, 1.4155, 1.4334], ..., [ 0.3907, 0.5309, 0.3682, ..., 2.0347, 2.1100, 2.0935], [ 0.3881, 0.4208, 0.5230, ..., 1.8240, 1.9038, 1.9552], [ 0.3403, 0.2258, 0.2576, ..., 1.6937, 1.6618, 1.7696]]], ..., [[[-0.7000, -0.6986, -0.7306, ..., -1.6313, -1.7078, -1.6480], [-0.6932, -0.6908, -0.7237, ..., -1.5230, -1.6776, -1.6388], [-0.6817, -0.6618, -0.6940, ..., -1.2779, -1.5299, -1.5905], ..., [ 0.7912, 0.9871, 0.9476, ..., 0.7637, 0.8491, 0.8404], [ 0.6435, 0.6536, 0.5389, ..., 0.2268, 
0.2840, 0.7647], [ 0.2871, 0.1747, 0.0099, ..., 0.1704, 0.2587, 0.7739]], [[-0.5360, -0.5255, -0.5556, ..., -1.7426, -1.8394, -1.8137], [-0.5545, -0.5179, -0.5420, ..., -1.6711, -1.7999, -1.8003], [-0.5629, -0.4964, -0.5208, ..., -1.4790, -1.7004, -1.7559], ..., [ 0.8302, 1.0338, 0.9972, ..., 0.5573, 0.6361, 0.5920], [ 0.6521, 0.6515, 0.5315, ..., -0.0175, 0.0396, 0.5232], [ 0.2739, 0.1468, -0.0415, ..., -0.0737, 0.0125, 0.5333]], [[-0.1682, -0.1622, -0.1925, ..., -1.7371, -1.7819, -1.7964], [-0.1791, -0.1565, -0.1929, ..., -1.7217, -1.7671, -1.7829], [-0.1814, -0.1405, -0.1879, ..., -1.6271, -1.7128, -1.7439], ..., [ 0.7712, 1.0543, 1.0645, ..., 0.3954, 0.4138, 0.3209], [ 0.5921, 0.6752, 0.5766, ..., -0.2356, -0.1997, 0.2561], [ 0.2282, 0.1419, -0.0471, ..., -0.3060, -0.2291, 0.2662]]], [[[-1.7039, -1.5743, -0.7309, ..., -1.4899, -1.5223, -1.7014], [-1.6543, -1.3360, -0.7863, ..., -1.4862, -1.4992, -1.6543], [-1.4747, -0.9363, -0.9740, ..., -1.4786, -1.7038, -1.7715], ..., [-1.0359, -0.9016, -0.9339, ..., 1.1125, 1.1213, 0.7437], [-0.9960, -1.1363, -1.0869, ..., 0.8008, 1.0108, 0.9147], [-1.1167, -1.2009, -1.1964, ..., 0.6893, 1.3224, 0.2577]], [[-1.5970, -1.4079, -0.4739, ..., -1.2143, -1.3011, -1.4974], [-1.5914, -1.1409, -0.5364, ..., -1.1936, -1.2596, -1.4214], [-1.4410, -0.7377, -0.7863, ..., -1.1843, -1.4621, -1.5348], ..., [-0.6342, -0.4864, -0.5238, ..., 1.4785, 1.5627, 1.1883], [-0.6105, -0.7573, -0.7006, ..., 1.0617, 1.3020, 1.2896], [-0.7374, -0.8229, -0.8292, ..., 0.9343, 1.6271, 0.5592]], [[-1.7608, -1.7676, -1.6709, ..., -1.7209, -1.7915, -1.7875], [-1.7147, -1.7388, -1.6955, ..., -1.7866, -1.7723, -1.7779], [-1.6931, -1.5459, -1.6110, ..., -1.7689, -1.7772, -1.7837], ..., [-1.5969, -1.4913, -1.5156, ..., -0.6912, -0.3672, -0.7122], [-1.5018, -1.6808, -1.6485, ..., -0.5699, -0.2657, -0.5272], [-1.5775, -1.6876, -1.6868, ..., -0.0425, 0.4928, -0.8241]]], [[[ 1.4877, 1.4668, 1.5048, ..., 2.0390, 2.0327, 2.0266], [ 1.5198, 1.4866, 1.5066, ..., 2.0343, 2.0305, 2.0289], [ 1.5237, 1.4689, 1.5328, ..., 2.0266, 2.0217, 2.0152], ..., [ 1.4687, 1.8343, 1.9416, ..., -1.3055, -1.2876, -1.3210], [ 1.8619, 1.9001, 1.8640, ..., -1.1670, -1.3076, -1.3698], [ 1.8915, 1.8637, 1.8954, ..., -1.1410, -1.3742, -1.3414]], [[ 1.8535, 1.8615, 1.8702, ..., 2.2140, 2.2075, 2.2014], [ 1.8948, 1.9005, 1.8962, ..., 2.2092, 2.2053, 2.2037], [ 1.8926, 1.8579, 1.8944, ..., 2.2014, 2.2056, 2.2128], ..., [ 1.8888, 2.1500, 2.2057, ..., -1.2930, -1.2397, -1.3169], [ 2.2001, 2.2026, 2.1672, ..., -1.2065, -1.2288, -1.3877], [ 2.1885, 2.1623, 2.2023, ..., -1.2037, -1.3027, -1.3650]], [[ 2.0973, 2.0843, 2.1081, ..., 2.4264, 2.4200, 2.4138], [ 2.1341, 2.1351, 2.1528, ..., 2.4216, 2.4177, 2.4161], [ 2.1146, 2.0894, 2.1310, ..., 2.4138, 2.4146, 2.4143], ..., [ 2.3444, 2.4653, 2.4575, ..., -1.1536, -1.0986, -1.1590], [ 2.5056, 2.5002, 2.4774, ..., -1.0442, -1.1040, -1.1783], [ 2.5015, 2.5031, 2.5432, ..., -1.0341, -1.1608, -1.1347]]]], device='cuda:0')
Note: Our stacked image tensor.
preds,_ = learn.get_preds(dl=[(x,y)])
preds[0]
tensor([1.4670e-06, 1.2070e-06, 8.4748e-07, 1.6964e-07, 7.0972e-06, 9.3213e-07, 1.9146e-06, 4.0787e-07, 1.3208e-06, 1.8394e-06, 1.8446e-08, 2.0282e-05, 9.3669e-04, 9.9753e-01, 4.6090e-06, 4.6171e-05, 8.3924e-05, 4.4448e-04, 3.7151e-07, 7.7943e-07, 6.8438e-06, 7.1965e-07, 2.7995e-07, 1.9403e-06, 1.0657e-06, 7.8017e-07, 1.8254e-05, 5.4245e-06, 4.5678e-06, 8.7494e-07, 3.8811e-06, 1.2178e-06, 6.4576e-07, 1.8837e-05, 8.5143e-04, 1.4807e-06, 1.7899e-06])
Note: Predictions for the first item; they add up to one. There are 37 outputs for the 37 image categories, and each value is the predicted probability of that category.
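As a quick check (a sketch on the batch above), the index of the highest probability should match the first target code:
pred_idx = preds[0].argmax().item()  # index of the most probable category
pred_idx, dls.vocab[pred_idx]
# (13, 'american_pit_bull_terrier') -- matches y[0] above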
_
TensorCategory([13, 35, 8, 36, 3, 10, 10, 14, 22, 1, 5, 5, 5, 0, 4, 7, 11, 33, 18, 25, 20, 3, 33, 0, 25, 15, 27, 9, 17, 25, 19, 26, 9, 0, 35, 5, 6, 1, 31, 14, 7, 9, 8, 27, 2, 7, 21, 13, 26, 17, 25, 30, 31, 5, 19, 17, 4, 12, 29, 8, 21, 33, 18, 9])
Note: Category codes
len(preds[0]),preds[0].sum()
(37, tensor(1.0000))
Predictions for the 37 categories, adding up to one.
For classifying into more than two categories, we need to employ a new function. It is not totally different from sigmoid; in fact, it starts with the sigmoid function.
plot_function(torch.sigmoid, min=-4,max=4)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastbook/__init__.py:73: UserWarning: Not providing a value for linspace's steps is deprecated and will throw a runtime error in a future release. This warning will appear only once per process. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/RangeFactories.cpp:25.) x = torch.linspace(min,max)
Note: This is how
torch.sigmoid
squishes values between 0 and 1.
torch.random.manual_seed(42);
acts = torch.randn((6,2))*2
acts
tensor([[ 0.6734, 0.2576], [ 0.4689, 0.4607], [-2.2457, -0.3727], [ 4.4164, -1.2760], [ 0.9233, 0.5347], [ 1.0698, 1.6187]])
Note: These are random numbers representing the binary outputs of a hypothetical network, drawn with a standard deviation of 2. The first column represents 3s and the second 7s. The magnitudes roughly show how confident the model is about its predictions.
acts.sigmoid()
tensor([[0.6623, 0.5641], [0.6151, 0.6132], [0.0957, 0.4079], [0.9881, 0.2182], [0.7157, 0.6306], [0.7446, 0.8346]])
Note: If we apply the sigmoid, the results become like this (above). Obviously they don't add up to one. These are relative confidences per input; for example, the first row says "it's a three", but what is the probability? It is not clear.
(acts[:,0]-acts[:,1]).sigmoid()
tensor([0.6025, 0.5021, 0.1332, 0.9966, 0.5959, 0.3661])
Note: If we take the difference between these relative confidences, the results become like the above. Now we can say that for the first item the model is 0.6025 (60.25%) confident it's a three.
This part is a bit different in the lesson video, so check the video at 1:35:20.
sm_acts = torch.softmax(acts, dim=1)
sm_acts
tensor([[0.6025, 0.3975], [0.5021, 0.4979], [0.1332, 0.8668], [0.9966, 0.0034], [0.5959, 0.4041], [0.3661, 0.6339]])
Note: torch.softmax does that in one step. Now the results for each item add up to one, and the first column is identical to the sigmoid of the differences above.
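To see why, here is a quick sanity check (a sketch using the acts defined above): for two classes, the first softmax column equals the sigmoid of the difference.
# softmax(a)[0] = exp(a0)/(exp(a0)+exp(a1)) = 1/(1+exp(-(a0-a1))) = sigmoid(a0-a1)
torch.allclose(torch.softmax(acts, dim=1)[:,0], (acts[:,0]-acts[:,1]).sigmoid())
# True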
targ = tensor([0,1,0,1,1,0])
These are our softmax activations:
sm_acts
tensor([[0.6025, 0.3975], [0.5021, 0.4979], [0.1332, 0.8668], [0.9966, 0.0034], [0.5959, 0.4041], [0.3661, 0.6339]])
idx = range(6)
sm_acts[idx, targ]
tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])
Note: A nice indexing trick: for each row, pick the column given by the target, i.e. the model's confidence in the correct class.
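The same selection can also be written with torch.gather, which is equivalent to the indexing above (a sketch):
# gather along dim=1, using the targets as per-row column indices
sm_acts.gather(1, targ.unsqueeze(1)).squeeze(1)
# tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])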
Let's see everything in a table:
from IPython.display import HTML
df = pd.DataFrame(sm_acts, columns=["3","7"])
df['targ'] = targ
df['idx'] = idx
df['loss'] = sm_acts[range(6), targ]
t = df.style.hide_index()
#To have html code compatible with our script
html = t._repr_html_().split('</style>')[1]
html = re.sub(r'<table id="([^"]+)"\s*>', r'<table >', html)
display(HTML(html))
3 | 7 | targ | idx | loss |
---|---|---|---|---|
0.602469 | 0.397531 | 0 | 0 | 0.602469 |
0.502065 | 0.497935 | 1 | 1 | 0.497935 |
0.133188 | 0.866811 | 0 | 2 | 0.133188 |
0.996640 | 0.003360 | 1 | 3 | 0.003360 |
0.595949 | 0.404051 | 1 | 4 | 0.404051 |
0.366118 | 0.633882 | 0 | 5 | 0.366118 |
Warning: I think the last column label is wrong here; it should be the confidence (the probability of the correct class) instead of the loss.
-sm_acts[idx, targ]
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])
Warning: There is a caveat here. These are the negatives of our confidence levels, not yet the loss.
The PyTorch way of doing the same:
F.nll_loss(sm_acts, targ, reduction='none')
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])
Note: Anyway, the numbers are still not right; that will be addressed in the Taking the Log section below. The reason is that F.nll_loss (negative log likelihood loss) expects inputs to which the log has already been applied for the calculation (the loss) to come out right.
Note: Directly from the book:
Important: Confusing Name, Beware: The nll in nll_loss stands for "negative log likelihood," but it doesn't actually take the log at all! It assumes you have already taken the log. PyTorch has a function called log_softmax that combines log and softmax in a fast and accurate way. nll_loss is designed to be used after log_softmax.
When we first take the softmax, and then the log likelihood of that, that combination is called cross-entropy loss. In PyTorch, this is available as nn.CrossEntropyLoss (which, in practice, actually does log_softmax and then nll_loss):
PyTorch's cross entropy:
loss_func = nn.CrossEntropyLoss()
loss_func(acts, targ)
tensor(1.8045)
or:
F.cross_entropy(acts, targ)
tensor(1.8045)
Note: This is the mean of all the losses.
And these are all the results without taking the mean:
nn.CrossEntropyLoss(reduction='none')(acts, targ)
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])
Note: The results above are the cross-entropy loss for each image in the list (of course our current numbers are fake).
log_softmax + nll_loss
First, log_softmax:
log_sm_acts = torch.log_softmax(acts, dim=1)
log_sm_acts
tensor([[-5.0672e-01, -9.2248e-01], [-6.8903e-01, -6.9729e-01], [-2.0160e+00, -1.4293e-01], [-3.3658e-03, -5.6958e+00], [-5.1760e-01, -9.0621e-01], [-1.0048e+00, -4.5589e-01]])
Then negative log likelihood:
F.nll_loss(log_sm_acts, targ, reduction='none')
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])
Note: The results are identical.
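As a final sanity check (a sketch using the hypothetical acts and targ from above), all three routes give the same per-item loss:
manual  = -log_sm_acts[range(6), targ]                     # pick out -log p(correct class) by hand
via_nll = F.nll_loss(log_sm_acts, targ, reduction='none')  # the same selection via nll_loss
via_ce  = F.cross_entropy(acts, targ, reduction='none')    # log_softmax + nll_loss fused in one call
torch.allclose(manual, via_nll), torch.allclose(manual, via_ce)
# (True, True)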
#width 600
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=5)
[('american_pit_bull_terrier', 'staffordshire_bull_terrier', 8), ('Ragdoll', 'Birman', 7), ('Egyptian_Mau', 'Bengal', 5)]
This is our baseline; we can start improving from this point.
Fine-tune the model, this time passing a high learning rate (base_lr=0.1):
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1, base_lr=0.1)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 2.588707 | 4.300000 | 0.445873 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 3.385068 | 2.263443 | 0.510825 | 00:26 |
Note: This is where we overshot: the loss just increased over the second epoch. Is there a better way to find a learning rate?
learn = cnn_learner(dls, resnet34, metrics=error_rate)
suggested_lr= learn.lr_find()
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/callback/schedule.py:270: UserWarning: color is redundantly defined by the 'color' keyword argument and the fmt string "ro" (-> color='r'). The keyword argument will take precedence. ax.plot(val, idx, 'ro', label=nm, c=color)
Warning: There is a discrepancy between the lesson and reading group notebooks. In the book we get two values from this function, but in the reading group only one. I think there was an update to this function that is not reflected in the book.
suggested_lr
SuggestedLRs(valley=tensor(0.0008))
print(f"suggested: {suggested_lr.valley:.2e}")
suggested: 8.32e-04
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2, base_lr=8.32e-04)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 2.203637 | 0.456601 | 0.139378 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.631289 | 0.287444 | 0.087280 | 00:26 |
1 | 0.423191 | 0.263927 | 0.085250 | 00:26 |
This time the loss decreases steadily.
fine_tune
When we create a model from a pretrained network, fastai automatically freezes all of the pretrained layers for us. When we call the fine_tune method, fastai does two things: it trains the randomly added layers (the head) for one epoch with all the other layers frozen, then it unfreezes all of the layers and trains them for the number of epochs requested.
Let's do it manually:
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.806578 | 0.363257 | 0.114344 | 00:21 |
1 | 0.697060 | 0.258624 | 0.083221 | 00:22 |
2 | 0.449906 | 0.254586 | 0.087957 | 00:21 |
learn.unfreeze()
Run the lr_find
again, because having more layers to train, and weights that have already been trained for three epochs, means our previously found learning rate isn't appropriate any more:
learn.lr_find()
SuggestedLRs(valley=tensor(0.0001))
Train again with the new lr.
learn.fit_one_cycle(6, lr_max=0.0001)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.369805 | 0.265072 | 0.085250 | 00:26 |
1 | 0.379721 | 0.352767 | 0.112314 | 00:26 |
2 | 0.320787 | 0.257370 | 0.075778 | 00:26 |
3 | 0.198347 | 0.217450 | 0.066306 | 00:27 |
4 | 0.143628 | 0.217090 | 0.066306 | 00:26 |
5 | 0.111457 | 0.216973 | 0.066306 | 00:27 |
So far so good, but there is more we can do.
Basically we use discriminative learning rates for the model: a bigger rate for the later layers and a smaller one for the early layers.
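A rough illustration of what passing a slice means (my own sketch, not fastai's internals): the earliest parameter group gets the low end, the last group gets the high end, and groups in between are spread multiplicatively.
import numpy as np
# Hypothetical: 3 parameter groups between lr=5e-5 and lr=5e-4
np.geomspace(5e-5, 5e-4, 3)
# array([5.00000000e-05, 1.58113883e-04, 5.00000000e-04])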
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)# first lr
learn.unfreeze()
learn.fit_one_cycle(12, lr_max=slice(0.00005,0.0005))#second lr with a range
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.783345 | 0.370482 | 0.119080 | 00:22 |
1 | 0.700986 | 0.293102 | 0.096076 | 00:22 |
2 | 0.448751 | 0.262937 | 0.093369 | 00:22 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.390943 | 0.245929 | 0.079838 | 00:28 |
1 | 0.356807 | 0.281976 | 0.088633 | 00:27 |
2 | 0.344888 | 0.417350 | 0.117727 | 00:27 |
3 | 0.267143 | 0.284152 | 0.081867 | 00:27 |
4 | 0.217775 | 0.330306 | 0.092693 | 00:28 |
5 | 0.172308 | 0.310047 | 0.081191 | 00:27 |
6 | 0.122903 | 0.299161 | 0.079161 | 00:27 |
7 | 0.099924 | 0.262270 | 0.074425 | 00:27 |
8 | 0.059424 | 0.278250 | 0.074425 | 00:27 |
9 | 0.045987 | 0.253283 | 0.067659 | 00:27 |
10 | 0.036630 | 0.251685 | 0.068336 | 00:27 |
11 | 0.034524 | 0.254469 | 0.067659 | 00:27 |
It is better most of the time (sometimes I don't get good results and need to choose the slice values more carefully).
learn.recorder.plot_loss()
Note: Directly from the book:
As you can see, the training loss keeps getting better and better. But notice that eventually the validation loss improvement slows, and sometimes even gets worse! This is the point at which the model is starting to overfit. In particular, the model is becoming overconfident of its predictions. But this does not mean that it is getting less accurate, necessarily. Take a look at the table of training results per epoch, and you will often see that the accuracy continues improving, even as the validation loss gets worse. In the end what matters is your accuracy, or more generally your chosen metrics, not the loss. The loss is just the function we've given the computer to help us to optimize.
Important: I need to think about how the loss can increase while the accuracy still gets better. (One intuition: cross-entropy punishes overconfident wrong answers heavily, so the loss can rise from growing overconfidence even while the count of correct predictions keeps improving.)
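A toy sketch of that effect (made-up numbers): the argmaxes, and therefore the accuracy, stay the same, but the loss grows because the model becomes more confident on the example it gets wrong.
a = tensor([[2., 1.], [1., 2.]]); t = tensor([0, 0])  # second prediction is wrong either way
b = tensor([[2., 1.], [1., 4.]])                      # same argmaxes, but a more confident mistake
F.cross_entropy(a, t), F.cross_entropy(b, t)
# (tensor(0.8133), tensor(1.6809))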
In general, a bigger model has the ability to better capture the real underlying relationships in your data, and also to capture and memorize the specific details of your individual images. However, using a deeper model is going to require more GPU RAM, so you may need to lower the size of your batches to avoid an out-of-memory error. This happens when you try to fit too much inside your GPU and looks like:
Cuda runtime error: out of memory
You may have to restart your notebook when this happens. The way to solve it is to use a smaller batch size, which means passing smaller groups of images at any given time through your model. You can pass the batch size you want to the call creating your DataLoaders
with bs=
.
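For example (a minimal sketch, reusing the pets DataBlock from above):
# If you hit a CUDA out-of-memory error, recreate the DataLoaders with a smaller batch size
dls = pets.dataloaders(path/"images", bs=32)  # the default batch size is 64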
The other downside of deeper architectures is that they take quite a bit longer to train. One technique that can speed things up a lot is mixed-precision training. This refers to using less-precise numbers (half-precision floating point, also called fp16) where possible during training. As we are writing these words in early 2020, nearly all current NVIDIA GPUs support a special feature called tensor cores that can dramatically speed up neural network training, by 2-3x. They also require a lot less GPU memory. To enable this feature in fastai, just add to_fp16()
after your Learner
creation (you also need to import the module).
You can't really know ahead of time what the best architecture for your particular problem is—you need to try training some. So let's try a ResNet-50 now with mixed precision:
from fastai.callback.fp16 import *
learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fine_tune(12, freeze_epochs=3)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.209030 | 0.308840 | 0.097429 | 00:20 |
1 | 0.562807 | 0.326714 | 0.100812 | 00:21 |
2 | 0.396488 | 0.263611 | 0.089310 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.255827 | 0.262954 | 0.080514 | 00:24 |
1 | 0.215601 | 0.256829 | 0.072395 | 00:24 |
2 | 0.238660 | 0.392900 | 0.099459 | 00:23 |
3 | 0.246021 | 0.409503 | 0.107578 | 00:24 |
4 | 0.196632 | 0.448040 | 0.106225 | 00:23 |
5 | 0.137433 | 0.353745 | 0.091340 | 00:23 |
6 | 0.108764 | 0.333932 | 0.085250 | 00:24 |
7 | 0.078872 | 0.295772 | 0.081867 | 00:24 |
8 | 0.055900 | 0.273311 | 0.073072 | 00:24 |
9 | 0.040353 | 0.274645 | 0.070365 | 00:24 |
10 | 0.020883 | 0.260611 | 0.070365 | 00:24 |
11 | 0.021018 | 0.259633 | 0.066982 | 00:24 |
learn.recorder.plot_loss()
As shown above, the training time hasn't changed much.