Deep Learning for Coders with fastai & PyTorch - Image Classification. In this notebook I followed both Jeremy Howard's lesson on fast.ai and the Weights & Biases reading group videos. Lots of notes are added, the order of some cells is changed, and some cells are added to make the topic more understandable for me (check the manual calculation of log_softmax + nll_loss). Click the "Open in Colab" button at the right side to view this as a notebook.
I'm a Doctor Who fan and this is my Cyberman coffee cup; as I remember, I got it from the Manchester Science Museum.
#!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()
%config Completer.use_jedi = False
from fastbook import *
from fastai.vision.all import *
path = untar_data(URLs.PETS)
Note: With untar_data we download the data. This data originally comes from the Oxford University Visual Geometry Group, and our dataset is here:
path
Path('/home/niyazi/.fastai/data/oxford-iiit-pet')
Note: This is the local download path for my computer.
Path.BASE_PATH = path
Tip: This is a trick to get the relative path; check the path output above and below.
path
Path('.')
Now the path looks different.
path.ls()
(#2) [Path('annotations'),Path('images')]
Note: #2 is the number of items in the list. annotations contains the target variables of this dataset, but we won't use them this time; instead we create our own labels.
(path/"images").ls()
(#7393) [Path('images/staffordshire_bull_terrier_90.jpg'),Path('images/Russian_Blue_70.jpg'),Path('images/japanese_chin_69.jpg'),Path('images/Maine_Coon_266.jpg'),Path('images/japanese_chin_200.jpg'),Path('images/Siamese_57.jpg'),Path('images/Persian_175.jpg'),Path('images/havanese_81.jpg'),Path('images/Birman_72.jpg'),Path('images/leonberger_55.jpg')...]
fname = (path/"images").ls()[0]
fname
Path('images/staffordshire_bull_terrier_90.jpg')
Note: The first image in the images folder listing.
re.findall(r'(.+)_\d+.jpg$', fname.name)
['staffordshire_bull_terrier']
Note: Since we don't use the annotations in this dataset, we need a way to get the breed from the filename. This is the regex findall method; check the geeksforgeeks.org tutorial here.
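To make the pattern concrete, here is a quick sketch of what it captures (filenames taken from the listing above; plain re, no fastai needed):
import re
# (.+)   -> the breed name (greedy, so it keeps internal underscores)
# _\d+   -> the image index after the last underscore
# .jpg$  -> the extension, anchored to the end of the name
for name in ['staffordshire_bull_terrier_90.jpg', 'Maine_Coon_266.jpg']:
    print(re.findall(r'(.+)_\d+.jpg$', name))
# ['staffordshire_bull_terrier']
# ['Maine_Coon']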
pets = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
item_tfms=Resize(460),
batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py:1023: UserWarning: torch.solve is deprecated in favor of torch.linalg.solveand will be removed in a future PyTorch release. torch.linalg.solve has its arguments reversed and does not return the LU factorization. To get the LU factorization see torch.lu, which can be used with torch.lu_solve or torch.lu_unpack. X = torch.solve(B, A).solution should be replaced with X = torch.linalg.solve(A, B) (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/BatchLinearAlgebra.cpp:760.) ret = func(*args, **kwargs)
Note: Now we find all names with RegexLabeller. The item_tfms and batch_tfms arguments may look a bit meaningless; check below to find out why.
As a summary, fastai gives us a chance to augment our images in a smarter way (presizing), which preserves much more detail and information for training. First we resize the images to a large size on the CPU with item_tfms, then push them to the GPU and apply the augmentations there in batches with batch_tfms.
#id interpolations
#caption A comparison of fastai's data augmentation strategy (left) and the traditional approach (right).
dblock1 = DataBlock(blocks=(ImageBlock(), CategoryBlock()),
                    get_y=parent_label,
                    item_tfms=Resize(460))
# Place an image at 'images/chapter-05/grizzly.jpg' relative to this notebook before running this
dls1 = dblock1.dataloaders([(Path.cwd()/'images'/'chapter-05'/'grizzly.jpg')]*100, bs=8)
dls1.train.get_idxs = lambda: Inf.ones
x,y = dls1.valid.one_batch()
_,axs = subplots(1, 2)

# Traditional approach (right): apply each transform separately, interpolating at every step
x1 = TensorImage(x.clone())
x1 = x1.affine_coord(sz=224)
x1 = x1.rotate(draw=30, p=1.)
x1 = x1.zoom(draw=1.2, p=1.)
x1 = x1.warp(draw_x=-0.2, draw_y=0.2, p=1.)

# fastai's presizing (left): compose all the transforms and interpolate only once
tfms = setup_aug_tfms([Rotate(draw=30, p=1, size=224), Zoom(draw=1.2, p=1., size=224),
                       Warp(draw_x=-0.2, draw_y=0.2, p=1., size=224)])
x = Pipeline(tfms)(x)
#x.affine_coord(coord_tfm=coord_tfm, sz=size, mode=mode, pad_mode=pad_mode)
TensorImage(x[0]).show(ctx=axs[0])
TensorImage(x1[0]).show(ctx=axs[1]);
dls.show_batch(nrows=3, ncols=3)
pets1 = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
pets1.summary(path/"images")
Setting-up type transforms pipelines Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images Found 7390 items 2 datasets of sizes 5912,1478 Setting up Pipeline: PILBase.create Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} Building one sample Pipeline: PILBase.create starting from /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg applying PILBase.create gives PILImage mode=RGB size=500x333 Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} starting from /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg applying partial gives British_Shorthair applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives TensorCategory(4) Final sample: (PILImage mode=RGB size=500x333, TensorCategory(4)) Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images Found 7390 items 2 datasets of sizes 5912,1478 Setting up Pipeline: PILBase.create Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False} Setting up after_item: Pipeline: ToTensor Setting up before_batch: Pipeline: Setting up after_batch: Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} Building one batch Applying item_tfms to the first sample: Pipeline: ToTensor starting from (PILImage mode=RGB size=500x333, TensorCategory(4)) applying ToTensor gives (TensorImage of size 3x333x500, TensorCategory(4)) Adding the next 3 samples No before_batch transform to apply Collating items in a batch Error! It's not possible to collate your items in a batch Could not collate the 0-th members of your tuples because got the following shapes torch.Size([3, 333, 500]),torch.Size([3, 500, 396]),torch.Size([3, 375, 500]),torch.Size([3, 500, 281])
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-15-ead0dd2a047d> in <module> 3 splitter=RandomSplitter(seed=42), 4 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name')) ----> 5 pets1.summary(path/"images") ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs) 188 why = _find_fail_collate(s) 189 print("Make sure all parts of your samples are tensors of the same size" if why is None else why) --> 190 raise e 191 192 if len([f for f in dls.train.after_batch.fs if f.name != 'noop'])!=0: ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs) 182 print("\nCollating items in a batch") 183 try: --> 184 b = dls.train.create_batch(s) 185 b = retain_types(b, s[0] if is_listy(s) else s) 186 except Exception as e: ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in create_batch(self, b) 141 elif s is None: return next(self.it) 142 else: raise IndexError("Cannot index an iterable dataset numerically - must use `None`.") --> 143 def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b) 144 def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b) 145 def to(self, device): self.device = device ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t) 48 b = t[0] 49 return (default_collate(t) if isinstance(b, _collate_types) ---> 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) 52 ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in <listcomp>(.0) 48 b = t[0] 49 return (default_collate(t) if isinstance(b, _collate_types) ---> 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) 52 ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t) 47 "A replacement for PyTorch `default_collate` which maintains types and handles `Sequence`s" 48 b = t[0] ---> 49 return (default_collate(t) if isinstance(b, _collate_types) 50 else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence) 51 else default_collate(t)) ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch) 54 storage = elem.storage()._new_shared(numel) 55 out = elem.new(storage) ---> 56 return torch.stack(batch, 0, out=out) 57 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \ 58 and elem_type.__name__ != 'string_': ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/torch_core.py in __torch_function__(self, func, types, args, kwargs) 338 convert=False 339 if _torch_handled(args, self._opt, func): convert,types = type(self),(torch.Tensor,) --> 340 res = super().__torch_function__(func, types, args=args, kwargs=kwargs) 341 if convert: res = convert(res) 342 if isinstance(res, TensorBase): res.set_meta(self, as_copy=True) ~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py in __torch_function__(cls, func, types, args, kwargs) 1021 1022 with _C.DisableTorchFunction(): -> 1023 ret = func(*args, **kwargs) 1024 return _convert(ret, cls) 1025 RuntimeError: stack expects each tensor to be equal size, but got [3, 333, 500] at entry 0 and [3, 500, 396] at entry 1
Note: It is always good to get a quick summary with pets1.summary(path/"images"). Check the summary above; it has lots of details. It is natural to get an error in this example, because we are trying to put different-sized images into the same batch without resizing them first (this DataBlock has no item_tfms).
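A minimal sketch of the fix (reusing the same path as above): adding an item transform that makes every image the same size lets the batch collate.
pets2 = DataBlock(blocks=(ImageBlock, CategoryBlock),
                  get_items=get_image_files,
                  splitter=RandomSplitter(seed=42),
                  get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                  item_tfms=Resize(460))  # every item becomes 460x460, so collation succeeds
pets2.summary(path/"images")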
For every project, just start with a baseline. The baseline is a good point to think about the project/domain/problem; from there you can start improving and running experiments on the architecture, hyperparameters, etc.
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/c10/core/TensorImpl.h:1156.) return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.513288 | 0.355303 | 0.110284 | 00:22 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.518711 | 0.313168 | 0.106225 | 00:27 |
1 | 0.325613 | 0.261644 | 0.089310 | 00:27 |
Note: A basic run like this is a helpful baseline to begin with.
learn.loss_func
FlattenedLoss of CrossEntropyLoss()
learn.lr
0.001
Tip: It is very easy to see the default arguments for the learner. Above are the loss function loss_func and the learning rate lr.
first(dls.train)
(TensorImage([[[[ 7.7591e-02, -1.3409e-01, 1.4352e-01, ..., -8.8188e-01, -8.0163e-01, -1.4735e-01], [ 1.9115e-03, 4.8835e-01, 4.3845e-01, ..., -1.3028e+00, -1.4314e+00, -1.2478e+00], [-1.2349e-01, 7.3246e-02, -9.2777e-02, ..., -7.9699e-01, -1.1984e+00, -9.0709e-02], ..., [-1.4486e+00, -9.5970e-01, 8.6840e-02, ..., -1.1097e+00, -3.3829e-01, 8.2527e-02], [-1.4246e+00, -8.2784e-01, 8.7511e-02, ..., -9.5360e-01, -1.0563e-01, -5.1489e-01], [-1.3575e+00, -7.6923e-01, 1.0015e-01, ..., -1.0628e+00, 4.3092e-02, -6.2399e-01]], [[ 2.5566e-01, 7.5052e-02, 2.0962e-01, ..., -9.7342e-01, -8.9785e-01, -1.5707e-01], [ 8.3578e-02, 6.1146e-01, 5.1947e-01, ..., -1.3980e+00, -1.5514e+00, -1.3726e+00], [-1.2059e-02, 1.2505e-01, -2.9267e-03, ..., -9.0869e-01, -1.3052e+00, -2.3089e-01], ..., [-1.4979e+00, -1.1395e+00, -2.8139e-01, ..., -1.3591e+00, -4.8733e-01, -2.1415e-01], [-1.4548e+00, -9.8541e-01, -2.7210e-01, ..., -1.1278e+00, -3.0796e-01, -8.4852e-01], [-1.3689e+00, -9.2548e-01, -2.6808e-01, ..., -1.2366e+00, -6.3006e-02, -1.0183e+00]], [[-1.1168e+00, -1.2721e+00, -1.0968e+00, ..., -1.1363e+00, -9.8121e-01, -3.4084e-01], [-1.0031e+00, -6.8494e-01, -8.5066e-01, ..., -1.5088e+00, -1.6080e+00, -1.4639e+00], [-1.1476e+00, -1.0927e+00, -1.3264e+00, ..., -1.0406e+00, -1.3088e+00, -3.4494e-01], ..., [-1.4021e+00, -9.7390e-01, -4.7906e-01, ..., -1.4878e+00, -5.0896e-01, -3.1871e-01], [-1.3213e+00, -8.4023e-01, -5.3294e-01, ..., -1.3262e+00, -5.3787e-01, -1.0765e+00], [-1.1781e+00, -8.0876e-01, -5.8936e-01, ..., -1.3399e+00, -4.2362e-01, -1.1124e+00]]], [[[ 1.9623e+00, 2.0361e+00, 1.9064e+00, ..., 2.2392e+00, 2.2249e+00, 2.2211e+00], [ 2.0734e+00, 2.0294e+00, 2.1349e+00, ..., 2.2461e+00, 2.2249e+00, 2.2376e+00], [ 2.0202e+00, 1.9569e+00, 1.8405e+00, ..., 2.2373e+00, 2.2353e+00, 2.2223e+00], ..., [ 3.5436e-01, 2.5449e-01, 5.5067e-01, ..., 1.0332e+00, 1.0161e+00, 9.8812e-01], [ 3.5005e-01, 1.6332e-01, 3.8754e-01, ..., 9.7724e-01, 9.6458e-01, 1.0630e+00], [ 3.4791e-01, 8.2361e-02, 2.2118e-01, ..., 8.6120e-01, 1.0850e+00, 1.1228e+00]], [[ 2.1661e+00, 2.2408e+00, 2.0968e+00, ..., 1.6825e+00, 1.6601e+00, 1.6328e+00], [ 2.2785e+00, 2.2232e+00, 2.3262e+00, ..., 1.6905e+00, 1.6537e+00, 1.6441e+00], [ 2.2230e+00, 2.1458e+00, 2.0269e+00, ..., 1.6834e+00, 1.6742e+00, 1.6310e+00], ..., [ 8.0271e-01, 7.3445e-01, 1.0265e+00, ..., 8.7042e-01, 8.4557e-01, 8.8788e-01], [ 8.1180e-01, 6.5414e-01, 8.8815e-01, ..., 8.1193e-01, 7.7473e-01, 9.1055e-01], [ 8.1687e-01, 5.4252e-01, 6.8486e-01, ..., 6.5186e-01, 9.1647e-01, 9.5377e-01]], [[ 2.3636e+00, 2.4168e+00, 2.2347e+00, ..., 1.7282e+00, 1.6718e+00, 1.6554e+00], [ 2.4760e+00, 2.3875e+00, 2.4613e+00, ..., 1.7363e+00, 1.6699e+00, 1.6726e+00], [ 2.4182e+00, 2.2941e+00, 2.1473e+00, ..., 1.7294e+00, 1.6948e+00, 1.6592e+00], ..., [ 1.4156e+00, 1.3690e+00, 1.6562e+00, ..., 1.2044e+00, 1.2088e+00, 1.2487e+00], [ 1.4260e+00, 1.3050e+00, 1.5455e+00, ..., 1.0673e+00, 1.0365e+00, 1.1483e+00], [ 1.4369e+00, 1.2059e+00, 1.3302e+00, ..., 7.4460e-01, 9.8735e-01, 9.8728e-01]]], [[[ 7.9667e-01, 6.5725e-01, 6.7499e-01, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], [ 1.6647e+00, 1.8548e+00, 4.2411e-01, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], [ 2.0417e+00, 2.1499e+00, 1.9243e+00, ..., 2.2489e+00, 2.2489e+00, 2.2489e+00], ..., [-7.8885e-02, -8.4444e-02, -1.8854e-01, ..., -3.3191e-02, 1.6326e-01, -2.5189e-02], [-3.9591e-02, -3.7761e-02, -3.5708e-02, ..., 4.1777e-01, 3.0722e-01, -8.4517e-02], [-5.5125e-01, -3.7390e-01, -3.7190e-01, ..., 1.6706e-01, -3.8756e-02, -3.0213e-01]], [[ 3.3600e-01, 1.2690e-01, 8.4595e-02, 
..., 2.4286e+00, 2.4286e+00, 2.4286e+00], [ 1.3936e+00, 1.5396e+00, -7.8121e-02, ..., 2.4286e+00, 2.4286e+00, 2.4286e+00], [ 1.7230e+00, 1.8204e+00, 1.5281e+00, ..., 2.4286e+00, 2.4286e+00, 2.4286e+00], ..., [-2.6621e-01, -3.4865e-01, -5.4389e-01, ..., 1.5566e-02, 3.6483e-01, 3.7018e-01], [-2.3416e-01, -2.9848e-01, -3.8383e-01, ..., 4.3211e-01, 5.4771e-01, 3.7147e-01], [-7.7599e-01, -6.7812e-01, -7.3404e-01, ..., 2.9308e-01, 2.0118e-01, 3.7493e-02]], [[-6.9486e-02, -3.3152e-01, -5.6258e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], [ 9.0693e-01, 9.7337e-01, -5.6124e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], [ 1.2463e+00, 1.1590e+00, 8.0907e-01, ..., 2.6400e+00, 2.6400e+00, 2.6400e+00], ..., [-3.1419e-01, -2.4941e-01, -4.5623e-01, ..., -6.5955e-01, -6.0038e-01, -8.8913e-01], [-2.6903e-01, -2.5050e-01, -3.9344e-01, ..., -3.7691e-01, -6.0662e-01, -9.9883e-01], [-6.3179e-01, -4.3123e-01, -5.1774e-01, ..., -7.1518e-01, -8.3215e-01, -9.5885e-01]]], ..., [[[ 2.6701e-03, 4.8764e-02, 1.3802e-01, ..., -3.5556e-01, -2.1186e-01, -6.3790e-02], [ 2.7203e-01, 2.9067e-01, 3.0956e-01, ..., -9.4003e-02, -5.8179e-02, -7.6002e-02], [ 3.5114e-01, 3.3277e-01, 3.2004e-01, ..., 2.0249e-02, -2.6842e-02, -4.4070e-02], ..., [ 1.9681e+00, 2.0169e+00, 2.0680e+00, ..., -2.0286e-01, 1.0193e-01, 3.1608e-01], [ 1.9411e+00, 2.0085e+00, 2.1026e+00, ..., -1.3970e-01, 1.2286e-01, 3.5735e-01], [ 1.8141e+00, 1.8327e+00, 1.9489e+00, ..., -1.0404e-01, 1.8111e-01, 3.2454e-01]], [[ 2.0398e-01, 2.9756e-01, 3.7903e-01, ..., -1.5909e-02, 4.7189e-02, 1.5181e-01], [ 5.4995e-01, 5.8114e-01, 6.0668e-01, ..., 2.2921e-02, 2.9592e-02, 1.2454e-01], [ 6.2037e-01, 6.1136e-01, 6.1487e-01, ..., 1.3991e-01, 7.2302e-02, 1.2691e-01], ..., [ 2.1161e+00, 2.1586e+00, 2.1592e+00, ..., 1.0077e-01, 4.3020e-01, 5.8235e-01], [ 2.0806e+00, 2.1535e+00, 2.2194e+00, ..., 1.1844e-01, 4.4620e-01, 5.9031e-01], [ 1.9537e+00, 1.9817e+00, 2.1045e+00, ..., 1.3738e-01, 4.2917e-01, 6.0165e-01]], [[-3.0177e-02, -3.0919e-02, 5.6294e-02, ..., 1.2119e-02, 2.9192e-01, 4.9523e-01], [ 1.4675e-01, 1.8120e-01, 2.0599e-01, ..., 6.5189e-02, 2.1124e-01, 4.7340e-01], [ 2.2902e-01, 2.3191e-01, 2.1012e-01, ..., 1.2057e-01, 1.4622e-01, 3.3338e-01], ..., [ 2.3455e+00, 2.3984e+00, 2.4285e+00, ..., -2.0567e-01, 8.8979e-02, 1.8777e-01], [ 2.3240e+00, 2.3670e+00, 2.4654e+00, ..., -1.8698e-01, 1.2802e-01, 2.0268e-01], [ 2.2811e+00, 2.2660e+00, 2.3926e+00, ..., -1.4246e-01, 1.2407e-01, 2.1404e-01]]], [[[ 2.1948e+00, 2.1682e+00, 2.1729e+00, ..., -8.0144e-02, -1.7157e-01, -2.2714e-01], [ 2.1642e+00, 2.1482e+00, 2.1615e+00, ..., -2.8867e-01, -3.3114e-01, -4.2198e-01], [ 2.1637e+00, 2.1500e+00, 2.1554e+00, ..., -5.3515e-01, -4.0755e-01, -3.9795e-01], ..., [ 1.0268e+00, 1.0389e+00, 1.0086e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00], [ 1.0284e+00, 1.0010e+00, 1.0453e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00], [ 1.0163e+00, 1.0296e+00, 1.0190e+00, ..., 2.1637e+00, 2.1637e+00, 2.1637e+00]], [[ 2.3155e+00, 2.2723e+00, 2.2853e+00, ..., 2.2712e-01, 2.1174e-01, 1.6661e-01], [ 2.2455e+00, 2.2432e+00, 2.2569e+00, ..., 7.0896e-02, 4.8820e-02, -4.3276e-02], [ 2.2668e+00, 2.2566e+00, 2.2656e+00, ..., -1.1109e-01, -5.4757e-02, -8.3705e-02], ..., [ 1.1080e+00, 1.0955e+00, 1.0469e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00], [ 1.1009e+00, 1.0698e+00, 1.0872e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00], [ 1.0910e+00, 1.1021e+00, 1.0618e+00, ..., 2.2754e+00, 2.2754e+00, 2.2754e+00]], [[ 2.5272e+00, 2.4830e+00, 2.4523e+00, ..., 8.5871e-01, 8.1094e-01, 7.8095e-01], [ 2.4560e+00, 2.4281e+00, 2.4305e+00, 
..., 5.5803e-01, 4.9335e-01, 3.9940e-01], [ 2.4629e+00, 2.4525e+00, 2.4602e+00, ..., 1.7717e-01, 2.3084e-01, 2.2868e-01], ..., [ 1.1968e+00, 1.1985e+00, 1.1224e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00], [ 1.2228e+00, 1.1839e+00, 1.1597e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00], [ 1.2212e+00, 1.2520e+00, 1.1560e+00, ..., 2.4712e+00, 2.4712e+00, 2.4712e+00]]], [[[ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2312e+00, 2.2403e+00, 2.2489e+00], [ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2320e+00, 2.2485e+00, 2.2489e+00], [ 2.2489e+00, 2.2489e+00, 2.2489e+00, ..., 2.2291e+00, 2.2471e+00, 2.2489e+00], ..., [-1.7937e+00, -1.9148e+00, -1.9569e+00, ..., 7.9287e-01, 6.7453e-01, 7.8103e-01], [-1.6935e+00, -1.8518e+00, -1.8703e+00, ..., 4.8874e-01, 2.1611e-01, 1.1217e-01], [-1.6270e+00, -1.8958e+00, -1.8929e+00, ..., 8.0896e-01, 8.9964e-01, 1.0060e+00]], [[ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4109e+00, 2.4200e+00, 2.4286e+00], [ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4113e+00, 2.4282e+00, 2.4286e+00], [ 2.4286e+00, 2.4286e+00, 2.4286e+00, ..., 2.4083e+00, 2.4268e+00, 2.4286e+00], ..., [-1.4270e+00, -1.6502e+00, -1.6874e+00, ..., 7.1551e-01, 4.9632e-01, 6.5896e-01], [-1.2436e+00, -1.4754e+00, -1.4859e+00, ..., 2.8340e-01, -9.5874e-02, -1.2210e-01], [-1.1076e+00, -1.4404e+00, -1.4846e+00, ..., 6.4517e-01, 7.3422e-01, 8.5824e-01]], [[ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6223e+00, 2.6305e+00, 2.6147e+00], [ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6228e+00, 2.6396e+00, 2.6398e+00], [ 2.6400e+00, 2.6400e+00, 2.6400e+00, ..., 2.6198e+00, 2.6382e+00, 2.6400e+00], ..., [-1.0548e+00, -1.1392e+00, -1.2386e+00, ..., 6.2691e-01, 3.3431e-01, 4.8703e-01], [-8.5461e-01, -8.5948e-01, -9.4681e-01, ..., 2.0497e-01, -1.5078e-01, -2.6161e-01], [-9.2614e-01, -8.7474e-01, -8.2363e-01, ..., 6.1918e-01, 7.6773e-01, 8.3740e-01]]]], device='cuda:0'), TensorCategory([25, 4, 27, 20, 12, 27, 31, 33, 14, 35, 16, 5, 22, 33, 3, 35, 3, 0, 32, 12, 1, 20, 18, 22, 15, 11, 13, 5, 35, 4, 22, 34, 15, 4, 3, 21, 5, 22, 27, 11, 15, 13, 14, 32, 13, 4, 7, 30, 9, 20, 7, 20, 9, 1, 6, 35, 23, 8, 14, 16, 18, 6, 2, 35], device='cuda:0'))
Note: The cell above and the one below do the same thing: grab one batch from the DataLoaders.
x,y = dls.one_batch()
dls.vocab
['Abyssinian', 'Bengal', 'Birman', 'Bombay', 'British_Shorthair', 'Egyptian_Mau', 'Maine_Coon', 'Persian', 'Ragdoll', 'Russian_Blue', 'Siamese', 'Sphynx', 'american_bulldog', 'american_pit_bull_terrier', 'basset_hound', 'beagle', 'boxer', 'chihuahua', 'english_cocker_spaniel', 'english_setter', 'german_shorthaired', 'great_pyrenees', 'havanese', 'japanese_chin', 'keeshond', 'leonberger', 'miniature_pinscher', 'newfoundland', 'pomeranian', 'pug', 'saint_bernard', 'samoyed', 'scottish_terrier', 'shiba_inu', 'staffordshire_bull_terrier', 'wheaten_terrier', 'yorkshire_terrier']
dls.vocab[0]
'Abyssinian'
Tip: vocab gives us all the labels as text.
y
TensorCategory([13, 35, 8, 36, 3, 10, 10, 14, 22, 1, 5, 5, 5, 0, 4, 7, 11, 33, 18, 25, 20, 3, 33, 0, 25, 15, 27, 9, 17, 25, 19, 26, 9, 0, 35, 5, 6, 1, 31, 14, 7, 9, 8, 27, 2, 7, 21, 13, 26, 17, 25, 30, 31, 5, 19, 17, 4, 12, 29, 8, 21, 33, 18, 9], device='cuda:0')
Note: Targets as category codes.
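To map these codes back to breed names, a quick sketch using the vocab above:
# Decode the first few target codes back to their text labels
[dls.vocab[int(i)] for i in y[:5]]
# ['american_pit_bull_terrier', 'wheaten_terrier', 'Ragdoll', 'yorkshire_terrier', 'Bombay']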
x
TensorImage([[[[-1.3790, -1.3778, -1.3984, ..., -0.1093, 0.1460, -0.0339], [-1.3243, -1.3580, -1.3804, ..., 0.0449, 0.0572, -0.0411], [-1.3337, -1.3652, -1.3996, ..., -0.0918, -0.1107, -0.0857], ..., [-0.4574, -0.3503, -0.3927, ..., -0.6010, -0.7011, -0.7119], [-0.3509, -0.1960, -0.2069, ..., -0.6884, -0.6634, -0.6341], [-0.3221, -0.3299, -0.3177, ..., -0.5625, -0.4453, -0.4082]], [[-1.6744, -1.6758, -1.6812, ..., -0.5800, -0.2989, -0.4613], [-1.5871, -1.6285, -1.6568, ..., -0.3977, -0.3745, -0.4682], [-1.5626, -1.6162, -1.6632, ..., -0.5169, -0.5306, -0.5112], ..., [-1.0813, -0.9612, -0.9992, ..., -1.1308, -1.2921, -1.3514], [-0.9441, -0.7857, -0.7948, ..., -1.2437, -1.2741, -1.3164], [-0.9350, -0.9371, -0.9192, ..., -1.1914, -1.1397, -1.1095]], [[-1.7511, -1.7434, -1.7629, ..., -0.6031, -0.3360, -0.5057], [-1.6791, -1.7414, -1.7652, ..., -0.4428, -0.4310, -0.5192], [-1.6424, -1.7149, -1.7630, ..., -0.5840, -0.6148, -0.5747], ..., [-1.6312, -1.4844, -1.5600, ..., -1.3854, -1.6552, -1.7876], [-1.4652, -1.2946, -1.3208, ..., -1.5283, -1.6561, -1.7149], [-1.4120, -1.4189, -1.4288, ..., -1.5720, -1.5691, -1.5458]]], [[[-1.1709, -1.0320, -0.3882, ..., -2.0315, -2.0706, -2.0406], [-0.7207, -1.2576, -0.8119, ..., -2.0559, -2.0684, -2.0728], [-0.2858, -0.7315, -1.1736, ..., -2.0433, -2.0766, -2.0962], ..., [-0.6050, -0.6222, -0.7002, ..., 0.0488, 0.0771, 0.0815], [-0.6429, -0.6763, -0.7053, ..., 0.1518, -0.0409, 0.1402], [-0.6518, -0.7125, -0.7378, ..., 0.1249, 0.0496, 0.1191]], [[-0.8777, -0.6010, 0.1436, ..., -1.6740, -1.8200, -1.7518], [-0.3933, -0.8562, -0.2790, ..., -1.7451, -1.8429, -1.8417], [ 0.0344, -0.4096, -0.7499, ..., -1.8033, -1.8966, -1.8918], ..., [-0.6012, -0.6491, -0.6763, ..., 0.2168, 0.2401, 0.2356], [-0.6647, -0.7301, -0.6992, ..., 0.3333, 0.1283, 0.3030], [-0.6832, -0.7672, -0.7454, ..., 0.2804, 0.2115, 0.2686]], [[-0.6769, -0.6395, 0.1047, ..., -1.7038, -1.7277, -1.7018], [-0.1480, -0.7290, -0.2868, ..., -1.7481, -1.7221, -1.7096], [ 0.2469, -0.1654, -0.6479, ..., -1.7517, -1.7447, -1.7343], ..., [-0.5935, -0.7393, -0.8030, ..., 0.4358, 0.4557, 0.3989], [-0.6801, -0.8271, -0.8223, ..., 0.5317, 0.3682, 0.4378], [-0.7225, -0.8686, -0.8416, ..., 0.4712, 0.3960, 0.3788]]], [[[ 0.4054, 0.4157, 0.4160, ..., 0.4003, 0.3259, 0.1861], [ 0.4487, 0.4643, 0.4645, ..., 0.3877, 0.2998, 0.1810], [ 0.4725, 0.4946, 0.4947, ..., 0.3752, 0.2639, 0.1651], ..., [-0.5970, -0.4684, -0.6175, ..., 1.5827, 1.6680, 1.6609], [-0.5992, -0.5694, -0.4757, ..., 1.2104, 1.3103, 1.3862], [-0.6427, -0.7463, -0.7177, ..., 0.8878, 0.8792, 1.0099]], [[ 0.9298, 0.9404, 0.9407, ..., 0.9246, 0.8562, 0.7873], [ 0.9741, 0.9901, 0.9902, ..., 0.9117, 0.8413, 0.7847], [ 0.9984, 1.0209, 1.0210, ..., 0.8998, 0.8208, 0.7683], ..., [-0.1210, 0.0165, -0.1429, ..., 1.7798, 1.8600, 1.8469], [-0.1235, -0.0915, 0.0087, ..., 1.4962, 1.5836, 1.6497], [-0.1703, -0.2819, -0.2510, ..., 1.2888, 1.2676, 1.3861]], [[ 1.4446, 1.4550, 1.4553, ..., 1.4395, 1.4119, 1.4488], [ 1.4881, 1.5037, 1.5038, ..., 1.4270, 1.4177, 1.4495], [ 1.5118, 1.5338, 1.5339, ..., 1.4189, 1.4155, 1.4334], ..., [ 0.3907, 0.5309, 0.3682, ..., 2.0347, 2.1100, 2.0935], [ 0.3881, 0.4208, 0.5230, ..., 1.8240, 1.9038, 1.9552], [ 0.3403, 0.2258, 0.2576, ..., 1.6937, 1.6618, 1.7696]]], ..., [[[-0.7000, -0.6986, -0.7306, ..., -1.6313, -1.7078, -1.6480], [-0.6932, -0.6908, -0.7237, ..., -1.5230, -1.6776, -1.6388], [-0.6817, -0.6618, -0.6940, ..., -1.2779, -1.5299, -1.5905], ..., [ 0.7912, 0.9871, 0.9476, ..., 0.7637, 0.8491, 0.8404], [ 0.6435, 0.6536, 0.5389, ..., 0.2268, 
0.2840, 0.7647], [ 0.2871, 0.1747, 0.0099, ..., 0.1704, 0.2587, 0.7739]], [[-0.5360, -0.5255, -0.5556, ..., -1.7426, -1.8394, -1.8137], [-0.5545, -0.5179, -0.5420, ..., -1.6711, -1.7999, -1.8003], [-0.5629, -0.4964, -0.5208, ..., -1.4790, -1.7004, -1.7559], ..., [ 0.8302, 1.0338, 0.9972, ..., 0.5573, 0.6361, 0.5920], [ 0.6521, 0.6515, 0.5315, ..., -0.0175, 0.0396, 0.5232], [ 0.2739, 0.1468, -0.0415, ..., -0.0737, 0.0125, 0.5333]], [[-0.1682, -0.1622, -0.1925, ..., -1.7371, -1.7819, -1.7964], [-0.1791, -0.1565, -0.1929, ..., -1.7217, -1.7671, -1.7829], [-0.1814, -0.1405, -0.1879, ..., -1.6271, -1.7128, -1.7439], ..., [ 0.7712, 1.0543, 1.0645, ..., 0.3954, 0.4138, 0.3209], [ 0.5921, 0.6752, 0.5766, ..., -0.2356, -0.1997, 0.2561], [ 0.2282, 0.1419, -0.0471, ..., -0.3060, -0.2291, 0.2662]]], [[[-1.7039, -1.5743, -0.7309, ..., -1.4899, -1.5223, -1.7014], [-1.6543, -1.3360, -0.7863, ..., -1.4862, -1.4992, -1.6543], [-1.4747, -0.9363, -0.9740, ..., -1.4786, -1.7038, -1.7715], ..., [-1.0359, -0.9016, -0.9339, ..., 1.1125, 1.1213, 0.7437], [-0.9960, -1.1363, -1.0869, ..., 0.8008, 1.0108, 0.9147], [-1.1167, -1.2009, -1.1964, ..., 0.6893, 1.3224, 0.2577]], [[-1.5970, -1.4079, -0.4739, ..., -1.2143, -1.3011, -1.4974], [-1.5914, -1.1409, -0.5364, ..., -1.1936, -1.2596, -1.4214], [-1.4410, -0.7377, -0.7863, ..., -1.1843, -1.4621, -1.5348], ..., [-0.6342, -0.4864, -0.5238, ..., 1.4785, 1.5627, 1.1883], [-0.6105, -0.7573, -0.7006, ..., 1.0617, 1.3020, 1.2896], [-0.7374, -0.8229, -0.8292, ..., 0.9343, 1.6271, 0.5592]], [[-1.7608, -1.7676, -1.6709, ..., -1.7209, -1.7915, -1.7875], [-1.7147, -1.7388, -1.6955, ..., -1.7866, -1.7723, -1.7779], [-1.6931, -1.5459, -1.6110, ..., -1.7689, -1.7772, -1.7837], ..., [-1.5969, -1.4913, -1.5156, ..., -0.6912, -0.3672, -0.7122], [-1.5018, -1.6808, -1.6485, ..., -0.5699, -0.2657, -0.5272], [-1.5775, -1.6876, -1.6868, ..., -0.0425, 0.4928, -0.8241]]], [[[ 1.4877, 1.4668, 1.5048, ..., 2.0390, 2.0327, 2.0266], [ 1.5198, 1.4866, 1.5066, ..., 2.0343, 2.0305, 2.0289], [ 1.5237, 1.4689, 1.5328, ..., 2.0266, 2.0217, 2.0152], ..., [ 1.4687, 1.8343, 1.9416, ..., -1.3055, -1.2876, -1.3210], [ 1.8619, 1.9001, 1.8640, ..., -1.1670, -1.3076, -1.3698], [ 1.8915, 1.8637, 1.8954, ..., -1.1410, -1.3742, -1.3414]], [[ 1.8535, 1.8615, 1.8702, ..., 2.2140, 2.2075, 2.2014], [ 1.8948, 1.9005, 1.8962, ..., 2.2092, 2.2053, 2.2037], [ 1.8926, 1.8579, 1.8944, ..., 2.2014, 2.2056, 2.2128], ..., [ 1.8888, 2.1500, 2.2057, ..., -1.2930, -1.2397, -1.3169], [ 2.2001, 2.2026, 2.1672, ..., -1.2065, -1.2288, -1.3877], [ 2.1885, 2.1623, 2.2023, ..., -1.2037, -1.3027, -1.3650]], [[ 2.0973, 2.0843, 2.1081, ..., 2.4264, 2.4200, 2.4138], [ 2.1341, 2.1351, 2.1528, ..., 2.4216, 2.4177, 2.4161], [ 2.1146, 2.0894, 2.1310, ..., 2.4138, 2.4146, 2.4143], ..., [ 2.3444, 2.4653, 2.4575, ..., -1.1536, -1.0986, -1.1590], [ 2.5056, 2.5002, 2.4774, ..., -1.0442, -1.1040, -1.1783], [ 2.5015, 2.5031, 2.5432, ..., -1.0341, -1.1608, -1.1347]]]], device='cuda:0')
Note: Our stacked image tensor.
preds,_ = learn.get_preds(dl=[(x,y)])
preds[0]
tensor([1.4670e-06, 1.2070e-06, 8.4748e-07, 1.6964e-07, 7.0972e-06, 9.3213e-07, 1.9146e-06, 4.0787e-07, 1.3208e-06, 1.8394e-06, 1.8446e-08, 2.0282e-05, 9.3669e-04, 9.9753e-01, 4.6090e-06, 4.6171e-05, 8.3924e-05, 4.4448e-04, 3.7151e-07, 7.7943e-07, 6.8438e-06, 7.1965e-07, 2.7995e-07, 1.9403e-06, 1.0657e-06, 7.8017e-07, 1.8254e-05, 5.4245e-06, 4.5678e-06, 8.7494e-07, 3.8811e-06, 1.2178e-06, 6.4576e-07, 1.8837e-05, 8.5143e-04, 1.4807e-06, 1.7899e-06])
Note: Predictions for the first item; they add up to one. There are 37 outputs for the 37 image categories, and each value is the predicted probability of that category.
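As a quick check (a sketch on the batch above), the index of the highest probability should match the first target code:
pred_idx = preds[0].argmax().item()  # index of the most probable category
pred_idx, dls.vocab[pred_idx]
# (13, 'american_pit_bull_terrier') -- matches y[0] above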
_
TensorCategory([13, 35, 8, 36, 3, 10, 10, 14, 22, 1, 5, 5, 5, 0, 4, 7, 11, 33, 18, 25, 20, 3, 33, 0, 25, 15, 27, 9, 17, 25, 19, 26, 9, 0, 35, 5, 6, 1, 31, 14, 7, 9, 8, 27, 2, 7, 21, 13, 26, 17, 25, 30, 31, 5, 19, 17, 4, 12, 29, 8, 21, 33, 18, 9])
Note: Category codes
len(preds[0]),preds[0].sum()
(37, tensor(1.0000))
Predictions for the 37 categories, adding up to one.
For classifying into more than two categories, we need to employ a new function. It is not totally different from sigmoid; in fact, it starts with the sigmoid function.
plot_function(torch.sigmoid, min=-4,max=4)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastbook/__init__.py:73: UserWarning: Not providing a value for linspace's steps is deprecated and will throw a runtime error in a future release. This warning will appear only once per process. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/RangeFactories.cpp:25.) x = torch.linspace(min,max)
Note: This is how
torch.sigmoid
squishes values between 0 and 1.
torch.random.manual_seed(42);
acts = torch.randn((6,2))*2
acts
tensor([[ 0.6734, 0.2576], [ 0.4689, 0.4607], [-2.2457, -0.3727], [ 4.4164, -1.2760], [ 0.9233, 0.5347], [ 1.0698, 1.6187]])
Note: These are random numbers representing the binary outputs of a hypothetical network, drawn with a standard deviation of 2. The first column represents 3s and the second 7s. The magnitudes roughly show how confident the model is about its predictions.
acts.sigmoid()
tensor([[0.6623, 0.5641], [0.6151, 0.6132], [0.0957, 0.4079], [0.9881, 0.2182], [0.7157, 0.6306], [0.7446, 0.8346]])
Note: If we apply the sigmoid, the results become like this (above). Obviously they don't add up to one. These are relative confidences per input; for example, the first row says "it's a three", but what is the probability? It is not clear.
(acts[:,0]-acts[:,1]).sigmoid()
tensor([0.6025, 0.5021, 0.1332, 0.9966, 0.5959, 0.3661])
Note: If we take the difference between these relative confidences, the results become like the above. Now we can say that for the first item the model is 0.6025 (60.25%) confident it's a three.
This part is a bit different in the lesson video, so check the video at 1:35:20.
sm_acts = torch.softmax(acts, dim=1)
sm_acts
tensor([[0.6025, 0.3975], [0.5021, 0.4979], [0.1332, 0.8668], [0.9966, 0.0034], [0.5959, 0.4041], [0.3661, 0.6339]])
Note: torch.softmax does that in one step. Now the results for each item add up to one, and the first column is identical to the sigmoid of the differences above.
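To see why, here is a quick sanity check (a sketch using the acts defined above): for two classes, the first softmax column equals the sigmoid of the difference.
# softmax(a)[0] = exp(a0)/(exp(a0)+exp(a1)) = 1/(1+exp(-(a0-a1))) = sigmoid(a0-a1)
torch.allclose(torch.softmax(acts, dim=1)[:,0], (acts[:,0]-acts[:,1]).sigmoid())
# True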
targ = tensor([0,1,0,1,1,0])
These are our softmax activations:
sm_acts
tensor([[0.6025, 0.3975], [0.5021, 0.4979], [0.1332, 0.8668], [0.9966, 0.0034], [0.5959, 0.4041], [0.3661, 0.6339]])
idx = range(6)
sm_acts[idx, targ]
tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])
Note: A nice indexing trick: for each row, pick the column given by the target, i.e. the model's confidence in the correct class.
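The same selection can also be written with torch.gather, which is equivalent to the indexing above (a sketch):
# gather along dim=1, using the targets as per-row column indices
sm_acts.gather(1, targ.unsqueeze(1)).squeeze(1)
# tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])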
Let's see everything in a table:
from IPython.display import HTML
df = pd.DataFrame(sm_acts, columns=["3","7"])
df['targ'] = targ
df['idx'] = idx
df['loss'] = sm_acts[range(6), targ]
t = df.style.hide_index()
#To have html code compatible with our script
html = t._repr_html_().split('</style>')[1]
html = re.sub(r'<table id="([^"]+)"\s*>', r'<table >', html)
display(HTML(html))
3 | 7 | targ | idx | loss |
---|---|---|---|---|
0.602469 | 0.397531 | 0 | 0 | 0.602469 |
0.502065 | 0.497935 | 1 | 1 | 0.497935 |
0.133188 | 0.866811 | 0 | 2 | 0.133188 |
0.996640 | 0.003360 | 1 | 3 | 0.003360 |
0.595949 | 0.404051 | 1 | 4 | 0.404051 |
0.366118 | 0.633882 | 0 | 5 | 0.366118 |
Warning: I think the last column label is wrong here; it should be the confidence (the probability of the correct class) instead of the loss.
-sm_acts[idx, targ]
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])
Warning: There is a caveat here. These are the negatives of our confidence levels, not yet the loss.
The PyTorch way of doing the same:
F.nll_loss(sm_acts, targ, reduction='none')
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])
Note: Anyway, the numbers are still not right; that will be addressed in the Taking the Log section below. The reason is that F.nll_loss (negative log likelihood loss) expects inputs to which the log has already been applied for the calculation (the loss) to come out right.
Note: Directly from the book:
Important: Confusing Name, Beware: The nll in nll_loss stands for "negative log likelihood," but it doesn't actually take the log at all! It assumes you have already taken the log. PyTorch has a function called log_softmax that combines log and softmax in a fast and accurate way. nll_loss is designed to be used after log_softmax.
When we first take the softmax, and then the log likelihood of that, that combination is called cross-entropy loss. In PyTorch, this is available as nn.CrossEntropyLoss (which, in practice, actually does log_softmax and then nll_loss):
PyTorch's cross entropy:
loss_func = nn.CrossEntropyLoss()
loss_func(acts, targ)
tensor(1.8045)
or:
F.cross_entropy(acts, targ)
tensor(1.8045)
Note: This is the mean of all the losses.
And these are all the results without taking the mean:
nn.CrossEntropyLoss(reduction='none')(acts, targ)
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])
Note: The results above are the cross-entropy loss for each image in the list (of course our current numbers are fake).
log_softmax + nll_loss
First, log_softmax:
log_sm_acts = torch.log_softmax(acts, dim=1)
log_sm_acts
tensor([[-5.0672e-01, -9.2248e-01], [-6.8903e-01, -6.9729e-01], [-2.0160e+00, -1.4293e-01], [-3.3658e-03, -5.6958e+00], [-5.1760e-01, -9.0621e-01], [-1.0048e+00, -4.5589e-01]])
Then negative log likelihood:
F.nll_loss(log_sm_acts, targ, reduction='none')
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])
Note: The results are identical.
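As a final sanity check (a sketch using the hypothetical acts and targ from above), all three routes give the same per-item loss:
manual  = -log_sm_acts[range(6), targ]                     # pick out -log p(correct class) by hand
via_nll = F.nll_loss(log_sm_acts, targ, reduction='none')  # the same selection via nll_loss
via_ce  = F.cross_entropy(acts, targ, reduction='none')    # log_softmax + nll_loss fused in one call
torch.allclose(manual, via_nll), torch.allclose(manual, via_ce)
# (True, True)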
#width 600
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=5)
[('american_pit_bull_terrier', 'staffordshire_bull_terrier', 8), ('Ragdoll', 'Birman', 7), ('Egyptian_Mau', 'Bengal', 5)]
This is our baseline; we can start improving from this point.
Fine-tune the model, this time passing a high learning rate (base_lr=0.1):
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1, base_lr=0.1)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 2.588707 | 4.300000 | 0.445873 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 3.385068 | 2.263443 | 0.510825 | 00:26 |
Note: This is where we overshot: the loss just increased over the second epoch. Is there a better way to find a learning rate?
learn = cnn_learner(dls, resnet34, metrics=error_rate)
suggested_lr= learn.lr_find()
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/callback/schedule.py:270: UserWarning: color is redundantly defined by the 'color' keyword argument and the fmt string "ro" (-> color='r'). The keyword argument will take precedence. ax.plot(val, idx, 'ro', label=nm, c=color)
Warning: There is a discrepancy between the lesson and reading group notebooks. In the book we get two values from this function, but in the reading group only one. I think there was an update to this function that is not reflected in the book.
suggested_lr
SuggestedLRs(valley=tensor(0.0008))
print(f"suggested: {suggested_lr.valley:.2e}")
suggested: 8.32e-04
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2, base_lr=8.32e-04)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 2.203637 | 0.456601 | 0.139378 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.631289 | 0.287444 | 0.087280 | 00:26 |
1 | 0.423191 | 0.263927 | 0.085250 | 00:26 |
This time the loss decreases steadily.
fine_tune
When we create a model from a pretrained network, fastai automatically freezes all of the pretrained layers for us. When we call the fine_tune method, fastai does two things: it trains the randomly added layers (the head) for one epoch with all the other layers frozen, then it unfreezes all of the layers and trains them for the number of epochs requested.
Let's do it manually:
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.806578 | 0.363257 | 0.114344 | 00:21 |
1 | 0.697060 | 0.258624 | 0.083221 | 00:22 |
2 | 0.449906 | 0.254586 | 0.087957 | 00:21 |
learn.unfreeze()
Run the lr_find
again, because having more layers to train, and weights that have already been trained for three epochs, means our previously found learning rate isn't appropriate any more:
learn.lr_find()
SuggestedLRs(valley=tensor(0.0001))
Train again with the new lr.
learn.fit_one_cycle(6, lr_max=0.0001)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.369805 | 0.265072 | 0.085250 | 00:26 |
1 | 0.379721 | 0.352767 | 0.112314 | 00:26 |
2 | 0.320787 | 0.257370 | 0.075778 | 00:26 |
3 | 0.198347 | 0.217450 | 0.066306 | 00:27 |
4 | 0.143628 | 0.217090 | 0.066306 | 00:26 |
5 | 0.111457 | 0.216973 | 0.066306 | 00:27 |
So far so good, but there is more we can do.
Basically we use discriminative learning rates for the model: a bigger rate for the later layers and a smaller one for the early layers.
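A rough illustration of what passing a slice means (my own sketch, not fastai's internals): the earliest parameter group gets the low end, the last group gets the high end, and groups in between are spread multiplicatively.
import numpy as np
# Hypothetical: 3 parameter groups between lr=5e-5 and lr=5e-4
np.geomspace(5e-5, 5e-4, 3)
# array([5.00000000e-05, 1.58113883e-04, 5.00000000e-04])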
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)# first lr
learn.unfreeze()
learn.fit_one_cycle(12, lr_max=slice(0.00005,0.0005))#second lr with a range
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.783345 | 0.370482 | 0.119080 | 00:22 |
1 | 0.700986 | 0.293102 | 0.096076 | 00:22 |
2 | 0.448751 | 0.262937 | 0.093369 | 00:22 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.390943 | 0.245929 | 0.079838 | 00:28 |
1 | 0.356807 | 0.281976 | 0.088633 | 00:27 |
2 | 0.344888 | 0.417350 | 0.117727 | 00:27 |
3 | 0.267143 | 0.284152 | 0.081867 | 00:27 |
4 | 0.217775 | 0.330306 | 0.092693 | 00:28 |
5 | 0.172308 | 0.310047 | 0.081191 | 00:27 |
6 | 0.122903 | 0.299161 | 0.079161 | 00:27 |
7 | 0.099924 | 0.262270 | 0.074425 | 00:27 |
8 | 0.059424 | 0.278250 | 0.074425 | 00:27 |
9 | 0.045987 | 0.253283 | 0.067659 | 00:27 |
10 | 0.036630 | 0.251685 | 0.068336 | 00:27 |
11 | 0.034524 | 0.254469 | 0.067659 | 00:27 |
It is better most of the time (sometimes I don't get good results and need to choose the slice values more carefully).
learn.recorder.plot_loss()
Note: Directly from the book:
As you can see, the training loss keeps getting better and better. But notice that eventually the validation loss improvement slows, and sometimes even gets worse! This is the point at which the model is starting to overfit. In particular, the model is becoming overconfident of its predictions. But this does not mean that it is getting less accurate, necessarily. Take a look at the table of training results per epoch, and you will often see that the accuracy continues improving, even as the validation loss gets worse. In the end what matters is your accuracy, or more generally your chosen metrics, not the loss. The loss is just the function we've given the computer to help us to optimize.
Important: I need to think about how the loss can increase while the accuracy still gets better. (One intuition: cross-entropy punishes overconfident wrong answers heavily, so the loss can rise from growing overconfidence even while the count of correct predictions keeps improving.)
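A toy sketch of that effect (made-up numbers): the argmaxes, and therefore the accuracy, stay the same, but the loss grows because the model becomes more confident on the example it gets wrong.
a = tensor([[2., 1.], [1., 2.]]); t = tensor([0, 0])  # second prediction is wrong either way
b = tensor([[2., 1.], [1., 4.]])                      # same argmaxes, but a more confident mistake
F.cross_entropy(a, t), F.cross_entropy(b, t)
# (tensor(0.8133), tensor(1.6809))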
In general, a bigger model has the ability to better capture the real underlying relationships in your data, and also to capture and memorize the specific details of your individual images. However, using a deeper model is going to require more GPU RAM, so you may need to lower the size of your batches to avoid an out-of-memory error. This happens when you try to fit too much inside your GPU and looks like:
Cuda runtime error: out of memory
You may have to restart your notebook when this happens. The way to solve it is to use a smaller batch size, which means passing smaller groups of images at any given time through your model. You can pass the batch size you want to the call creating your DataLoaders
with bs=
.
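For example (a minimal sketch, reusing the pets DataBlock from above):
# If you hit a CUDA out-of-memory error, recreate the DataLoaders with a smaller batch size
dls = pets.dataloaders(path/"images", bs=32)  # the default batch size is 64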
The other downside of deeper architectures is that they take quite a bit longer to train. One technique that can speed things up a lot is mixed-precision training. This refers to using less-precise numbers (half-precision floating point, also called fp16) where possible during training. As we are writing these words in early 2020, nearly all current NVIDIA GPUs support a special feature called tensor cores that can dramatically speed up neural network training, by 2-3x. They also require a lot less GPU memory. To enable this feature in fastai, just add to_fp16()
after your Learner
creation (you also need to import the module).
You can't really know ahead of time what the best architecture for your particular problem is—you need to try training some. So let's try a ResNet-50 now with mixed precision:
from fastai.callback.fp16 import *
learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fine_tune(12, freeze_epochs=3)
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 1.209030 | 0.308840 | 0.097429 | 00:20 |
1 | 0.562807 | 0.326714 | 0.100812 | 00:21 |
2 | 0.396488 | 0.263611 | 0.089310 | 00:21 |
epoch | train_loss | valid_loss | error_rate | time |
---|---|---|---|---|
0 | 0.255827 | 0.262954 | 0.080514 | 00:24 |
1 | 0.215601 | 0.256829 | 0.072395 | 00:24 |
2 | 0.238660 | 0.392900 | 0.099459 | 00:23 |
3 | 0.246021 | 0.409503 | 0.107578 | 00:24 |
4 | 0.196632 | 0.448040 | 0.106225 | 00:23 |
5 | 0.137433 | 0.353745 | 0.091340 | 00:23 |
6 | 0.108764 | 0.333932 | 0.085250 | 00:24 |
7 | 0.078872 | 0.295772 | 0.081867 | 00:24 |
8 | 0.055900 | 0.273311 | 0.073072 | 00:24 |
9 | 0.040353 | 0.274645 | 0.070365 | 00:24 |
10 | 0.020883 | 0.260611 | 0.070365 | 00:24 |
11 | 0.021018 | 0.259633 | 0.066982 | 00:24 |
learn.recorder.plot_loss()
As shown above, the training time hasn't changed much.