I'm a Doctor Who fan, and this is my Cyberman coffee cup; as far as I remember, I got it from the Manchester Science Museum.


import fastbook
fastbook.setup_book()
%config Completer.use_jedi = False
from fastbook import *

[[chapter_pet_breeds]]

PLAYING WITH THE DATASET

from fastai.vision.all import *
path = untar_data(URLs.PETS)

Note: With untar_data we download the data. The data originally comes from the Oxford University Visual Geometry Group, and our local dataset path is shown below:
path
Path('/home/niyazi/.fastai/data/oxford-iiit-pet')

Note: This is the local download path for my computer.
Path.BASE_PATH = path

Tip: This is a trick to get relative paths; compare the output above and below.
path
Path('.')

Now the path looks different.

path.ls()
(#2) [Path('annotations'),Path('images')]

Note: #2 is the number of items in the list. annotations holds the target variables of this dataset, but we don't use them this time; instead we create our own labels.
(path/"images").ls()
(#7393) [Path('images/staffordshire_bull_terrier_90.jpg'),Path('images/Russian_Blue_70.jpg'),Path('images/japanese_chin_69.jpg'),Path('images/Maine_Coon_266.jpg'),Path('images/japanese_chin_200.jpg'),Path('images/Siamese_57.jpg'),Path('images/Persian_175.jpg'),Path('images/havanese_81.jpg'),Path('images/Birman_72.jpg'),Path('images/leonberger_55.jpg')...]
fname = (path/"images").ls()[0]
fname
Path('images/staffordshire_bull_terrier_90.jpg')

Note: The first image in the path list.
re.findall(r'(.+)_\d+.jpg$', fname.name)
['staffordshire_bull_terrier']

Note: Since we don't use the annotations in the dataset, we need a way to get the breeds from the filenames. This is the regex findall method; check the geeksforgeeks.org tutorial on it.
pets = DataBlock(blocks = (ImageBlock, CategoryBlock),
                 get_items=get_image_files, 
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py:1023: UserWarning: torch.solve is deprecated in favor of torch.linalg.solveand will be removed in a future PyTorch release.
torch.linalg.solve has its arguments reversed and does not return the LU factorization.
To get the LU factorization see torch.lu, which can be used with torch.lu_solve or torch.lu_unpack.
X = torch.solve(B, A).solution
should be replaced with
X = torch.linalg.solve(A, B) (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/BatchLinearAlgebra.cpp:760.)
  ret = func(*args, **kwargs)

Note: Now we extract all the names with RegexLabeller. The item_tfms and batch_tfms arguments may look a bit meaningless; check below to find out why.

PRESIZING

As a summary, fastai gives us a chance to augment our images in a smarter way (presizing), so that they provide much more detail and information for training. First, we resize the images to a large size with item_tfms, then push them to the GPU and apply the augmentations there as a batch.

Check the original book for the whole idea.

#caption A comparison of fastai's data augmentation strategy (left) and the traditional approach (right).
dblock1 = DataBlock(blocks=(ImageBlock(), CategoryBlock()),
                   get_y=parent_label,
                   item_tfms=Resize(460))
# Place an image in the 'images/grizzly.jpg' subfolder where this notebook is located before running this
dls1 = dblock1.dataloaders([(Path.cwd()/'images'/'chapter-05'/'grizzly.jpg')]*100, bs=8)
dls1.train.get_idxs = lambda: Inf.ones
x,y = dls1.valid.one_batch()
_,axs = subplots(1, 2)

x1 = TensorImage(x.clone())
x1 = x1.affine_coord(sz=224)
x1 = x1.rotate(draw=30, p=1.)
x1 = x1.zoom(draw=1.2, p=1.)
x1 = x1.warp(draw_x=-0.2, draw_y=0.2, p=1.)

tfms = setup_aug_tfms([Rotate(draw=30, p=1, size=224), Zoom(draw=1.2, p=1., size=224),
                       Warp(draw_x=-0.2, draw_y=0.2, p=1., size=224)])
x = Pipeline(tfms)(x)
#x.affine_coord(coord_tfm=coord_tfm, sz=size, mode=mode, pad_mode=pad_mode)
TensorImage(x[0]).show(ctx=axs[0])
TensorImage(x1[0]).show(ctx=axs[1]);
dls.show_batch(nrows=3, ncols=3)
pets1 = DataBlock(blocks = (ImageBlock, CategoryBlock),
                 get_items=get_image_files, 
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
pets1.summary(path/"images")
Setting-up type transforms pipelines
Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images
Found 7390 items
2 datasets of sizes 5912,1478
Setting up Pipeline: PILBase.create
Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}

Building one sample
  Pipeline: PILBase.create
    starting from
      /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg
    applying PILBase.create gives
      PILImage mode=RGB size=500x333
  Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
    starting from
      /home/niyazi/.fastai/data/oxford-iiit-pet/images/British_Shorthair_110.jpg
    applying partial gives
      British_Shorthair
    applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives
      TensorCategory(4)

Final sample: (PILImage mode=RGB size=500x333, TensorCategory(4))


Collecting items from /home/niyazi/.fastai/data/oxford-iiit-pet/images
Found 7390 items
2 datasets of sizes 5912,1478
Setting up Pipeline: PILBase.create
Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
Setting up after_item: Pipeline: ToTensor
Setting up before_batch: Pipeline: 
Setting up after_batch: Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1}

Building one batch
Applying item_tfms to the first sample:
  Pipeline: ToTensor
    starting from
      (PILImage mode=RGB size=500x333, TensorCategory(4))
    applying ToTensor gives
      (TensorImage of size 3x333x500, TensorCategory(4))

Adding the next 3 samples

No before_batch transform to apply

Collating items in a batch
Error! It's not possible to collate your items in a batch
Could not collate the 0-th members of your tuples because got the following shapes
torch.Size([3, 333, 500]),torch.Size([3, 500, 396]),torch.Size([3, 375, 500]),torch.Size([3, 500, 281])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-ead0dd2a047d> in <module>
      3                  splitter=RandomSplitter(seed=42),
      4                  get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
----> 5 pets1.summary(path/"images")

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs)
    188         why = _find_fail_collate(s)
    189         print("Make sure all parts of your samples are tensors of the same size" if why is None else why)
--> 190         raise e
    191 
    192     if len([f for f in dls.train.after_batch.fs if f.name != 'noop'])!=0:

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/block.py in summary(self, source, bs, show_batch, **kwargs)
    182     print("\nCollating items in a batch")
    183     try:
--> 184         b = dls.train.create_batch(s)
    185         b = retain_types(b, s[0] if is_listy(s) else s)
    186     except Exception as e:

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in create_batch(self, b)
    141         elif s is None:  return next(self.it)
    142         else: raise IndexError("Cannot index an iterable dataset numerically - must use `None`.")
--> 143     def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b)
    144     def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)
    145     def to(self, device): self.device = device

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t)
     48     b = t[0]
     49     return (default_collate(t) if isinstance(b, _collate_types)
---> 50             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
     51             else default_collate(t))
     52 

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in <listcomp>(.0)
     48     b = t[0]
     49     return (default_collate(t) if isinstance(b, _collate_types)
---> 50             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
     51             else default_collate(t))
     52 

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/data/load.py in fa_collate(t)
     47     "A replacement for PyTorch `default_collate` which maintains types and handles `Sequence`s"
     48     b = t[0]
---> 49     return (default_collate(t) if isinstance(b, _collate_types)
     50             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
     51             else default_collate(t))

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     54             storage = elem.storage()._new_shared(numel)
     55             out = elem.new(storage)
---> 56         return torch.stack(batch, 0, out=out)
     57     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
     58             and elem_type.__name__ != 'string_':

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/torch_core.py in __torch_function__(self, func, types, args, kwargs)
    338         convert=False
    339         if _torch_handled(args, self._opt, func): convert,types = type(self),(torch.Tensor,)
--> 340         res = super().__torch_function__(func, types, args=args, kwargs=kwargs)
    341         if convert: res = convert(res)
    342         if isinstance(res, TensorBase): res.set_meta(self, as_copy=True)

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/_tensor.py in __torch_function__(cls, func, types, args, kwargs)
   1021 
   1022         with _C.DisableTorchFunction():
-> 1023             ret = func(*args, **kwargs)
   1024             return _convert(ret, cls)
   1025 

RuntimeError: stack expects each tensor to be equal size, but got [3, 333, 500] at entry 0 and [3, 500, 396] at entry 1

Note: It is always good to get a quick summary with pets1.summary(path/"images"). Check the summary above; it has lots of detail. It is natural to get an error in this example, because we are trying to collate different-sized images into the same batch without resizing them first.
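
A minimal sketch of the fix (pets2 is just an illustrative name; same imports and path as above): adding a fixed-size item_tfms lets every image be stacked, so summary() can collate a batch.

pets2 = DataBlock(blocks=(ImageBlock, CategoryBlock),
                  get_items=get_image_files,
                  splitter=RandomSplitter(seed=42),
                  get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                  item_tfms=Resize(460))   # every image becomes 460x460 before batching
pets2.summary(path/"images")              # collation should now succeed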

BASELINE MODEL

For every project, just start with a baseline. A baseline is a good point from which to think about the project/domain/problem, and from there you can start improving and experimenting with architectures, hyperparameters, etc.

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448278899/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
epoch train_loss valid_loss error_rate time
0 1.513288 0.355303 0.110284 00:22
epoch train_loss valid_loss error_rate time
0 0.518711 0.313168 0.106225 00:27
1 0.325613 0.261644 0.089310 00:27

Note: A basic run is helpful as a baseline to start from.

Defaults for the baseline

learn.loss_func
FlattenedLoss of CrossEntropyLoss()
learn.lr
0.001

Tip: It is very easy to see the default arguments of the learner. Above are the loss function loss_func and the learning rate lr.
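
If we wanted to change these defaults instead of relying on them, a sketch (the argument values are chosen only for illustration) could look like this:

learn = cnn_learner(dls, resnet34,
                    loss_func=CrossEntropyLossFlat(),  # explicit loss instead of the default
                    lr=3e-3,                           # explicit base learning rate
                    metrics=error_rate)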

One Batch Run

first(dls.train)
(TensorImage([[[[ 7.7591e-02, -1.3409e-01,  1.4352e-01,  ..., -8.8188e-01, -8.0163e-01, -1.4735e-01],
           [ 1.9115e-03,  4.8835e-01,  4.3845e-01,  ..., -1.3028e+00, -1.4314e+00, -1.2478e+00],
           [-1.2349e-01,  7.3246e-02, -9.2777e-02,  ..., -7.9699e-01, -1.1984e+00, -9.0709e-02],
           ...,
           [-1.4486e+00, -9.5970e-01,  8.6840e-02,  ..., -1.1097e+00, -3.3829e-01,  8.2527e-02],
           [-1.4246e+00, -8.2784e-01,  8.7511e-02,  ..., -9.5360e-01, -1.0563e-01, -5.1489e-01],
           [-1.3575e+00, -7.6923e-01,  1.0015e-01,  ..., -1.0628e+00,  4.3092e-02, -6.2399e-01]],
 
          [[ 2.5566e-01,  7.5052e-02,  2.0962e-01,  ..., -9.7342e-01, -8.9785e-01, -1.5707e-01],
           [ 8.3578e-02,  6.1146e-01,  5.1947e-01,  ..., -1.3980e+00, -1.5514e+00, -1.3726e+00],
           [-1.2059e-02,  1.2505e-01, -2.9267e-03,  ..., -9.0869e-01, -1.3052e+00, -2.3089e-01],
           ...,
           [-1.4979e+00, -1.1395e+00, -2.8139e-01,  ..., -1.3591e+00, -4.8733e-01, -2.1415e-01],
           [-1.4548e+00, -9.8541e-01, -2.7210e-01,  ..., -1.1278e+00, -3.0796e-01, -8.4852e-01],
           [-1.3689e+00, -9.2548e-01, -2.6808e-01,  ..., -1.2366e+00, -6.3006e-02, -1.0183e+00]],
 
          [[-1.1168e+00, -1.2721e+00, -1.0968e+00,  ..., -1.1363e+00, -9.8121e-01, -3.4084e-01],
           [-1.0031e+00, -6.8494e-01, -8.5066e-01,  ..., -1.5088e+00, -1.6080e+00, -1.4639e+00],
           [-1.1476e+00, -1.0927e+00, -1.3264e+00,  ..., -1.0406e+00, -1.3088e+00, -3.4494e-01],
           ...,
           [-1.4021e+00, -9.7390e-01, -4.7906e-01,  ..., -1.4878e+00, -5.0896e-01, -3.1871e-01],
           [-1.3213e+00, -8.4023e-01, -5.3294e-01,  ..., -1.3262e+00, -5.3787e-01, -1.0765e+00],
           [-1.1781e+00, -8.0876e-01, -5.8936e-01,  ..., -1.3399e+00, -4.2362e-01, -1.1124e+00]]],
 
 
         [[[ 1.9623e+00,  2.0361e+00,  1.9064e+00,  ...,  2.2392e+00,  2.2249e+00,  2.2211e+00],
           [ 2.0734e+00,  2.0294e+00,  2.1349e+00,  ...,  2.2461e+00,  2.2249e+00,  2.2376e+00],
           [ 2.0202e+00,  1.9569e+00,  1.8405e+00,  ...,  2.2373e+00,  2.2353e+00,  2.2223e+00],
           ...,
           [ 3.5436e-01,  2.5449e-01,  5.5067e-01,  ...,  1.0332e+00,  1.0161e+00,  9.8812e-01],
           [ 3.5005e-01,  1.6332e-01,  3.8754e-01,  ...,  9.7724e-01,  9.6458e-01,  1.0630e+00],
           [ 3.4791e-01,  8.2361e-02,  2.2118e-01,  ...,  8.6120e-01,  1.0850e+00,  1.1228e+00]],
 
          [[ 2.1661e+00,  2.2408e+00,  2.0968e+00,  ...,  1.6825e+00,  1.6601e+00,  1.6328e+00],
           [ 2.2785e+00,  2.2232e+00,  2.3262e+00,  ...,  1.6905e+00,  1.6537e+00,  1.6441e+00],
           [ 2.2230e+00,  2.1458e+00,  2.0269e+00,  ...,  1.6834e+00,  1.6742e+00,  1.6310e+00],
           ...,
           [ 8.0271e-01,  7.3445e-01,  1.0265e+00,  ...,  8.7042e-01,  8.4557e-01,  8.8788e-01],
           [ 8.1180e-01,  6.5414e-01,  8.8815e-01,  ...,  8.1193e-01,  7.7473e-01,  9.1055e-01],
           [ 8.1687e-01,  5.4252e-01,  6.8486e-01,  ...,  6.5186e-01,  9.1647e-01,  9.5377e-01]],
 
          [[ 2.3636e+00,  2.4168e+00,  2.2347e+00,  ...,  1.7282e+00,  1.6718e+00,  1.6554e+00],
           [ 2.4760e+00,  2.3875e+00,  2.4613e+00,  ...,  1.7363e+00,  1.6699e+00,  1.6726e+00],
           [ 2.4182e+00,  2.2941e+00,  2.1473e+00,  ...,  1.7294e+00,  1.6948e+00,  1.6592e+00],
           ...,
           [ 1.4156e+00,  1.3690e+00,  1.6562e+00,  ...,  1.2044e+00,  1.2088e+00,  1.2487e+00],
           [ 1.4260e+00,  1.3050e+00,  1.5455e+00,  ...,  1.0673e+00,  1.0365e+00,  1.1483e+00],
           [ 1.4369e+00,  1.2059e+00,  1.3302e+00,  ...,  7.4460e-01,  9.8735e-01,  9.8728e-01]]],
 
 
         [[[ 7.9667e-01,  6.5725e-01,  6.7499e-01,  ...,  2.2489e+00,  2.2489e+00,  2.2489e+00],
           [ 1.6647e+00,  1.8548e+00,  4.2411e-01,  ...,  2.2489e+00,  2.2489e+00,  2.2489e+00],
           [ 2.0417e+00,  2.1499e+00,  1.9243e+00,  ...,  2.2489e+00,  2.2489e+00,  2.2489e+00],
           ...,
           [-7.8885e-02, -8.4444e-02, -1.8854e-01,  ..., -3.3191e-02,  1.6326e-01, -2.5189e-02],
           [-3.9591e-02, -3.7761e-02, -3.5708e-02,  ...,  4.1777e-01,  3.0722e-01, -8.4517e-02],
           [-5.5125e-01, -3.7390e-01, -3.7190e-01,  ...,  1.6706e-01, -3.8756e-02, -3.0213e-01]],
 
          [[ 3.3600e-01,  1.2690e-01,  8.4595e-02,  ...,  2.4286e+00,  2.4286e+00,  2.4286e+00],
           [ 1.3936e+00,  1.5396e+00, -7.8121e-02,  ...,  2.4286e+00,  2.4286e+00,  2.4286e+00],
           [ 1.7230e+00,  1.8204e+00,  1.5281e+00,  ...,  2.4286e+00,  2.4286e+00,  2.4286e+00],
           ...,
           [-2.6621e-01, -3.4865e-01, -5.4389e-01,  ...,  1.5566e-02,  3.6483e-01,  3.7018e-01],
           [-2.3416e-01, -2.9848e-01, -3.8383e-01,  ...,  4.3211e-01,  5.4771e-01,  3.7147e-01],
           [-7.7599e-01, -6.7812e-01, -7.3404e-01,  ...,  2.9308e-01,  2.0118e-01,  3.7493e-02]],
 
          [[-6.9486e-02, -3.3152e-01, -5.6258e-01,  ...,  2.6400e+00,  2.6400e+00,  2.6400e+00],
           [ 9.0693e-01,  9.7337e-01, -5.6124e-01,  ...,  2.6400e+00,  2.6400e+00,  2.6400e+00],
           [ 1.2463e+00,  1.1590e+00,  8.0907e-01,  ...,  2.6400e+00,  2.6400e+00,  2.6400e+00],
           ...,
           [-3.1419e-01, -2.4941e-01, -4.5623e-01,  ..., -6.5955e-01, -6.0038e-01, -8.8913e-01],
           [-2.6903e-01, -2.5050e-01, -3.9344e-01,  ..., -3.7691e-01, -6.0662e-01, -9.9883e-01],
           [-6.3179e-01, -4.3123e-01, -5.1774e-01,  ..., -7.1518e-01, -8.3215e-01, -9.5885e-01]]],
 
 
         ...,
 
 
         [[[ 2.6701e-03,  4.8764e-02,  1.3802e-01,  ..., -3.5556e-01, -2.1186e-01, -6.3790e-02],
           [ 2.7203e-01,  2.9067e-01,  3.0956e-01,  ..., -9.4003e-02, -5.8179e-02, -7.6002e-02],
           [ 3.5114e-01,  3.3277e-01,  3.2004e-01,  ...,  2.0249e-02, -2.6842e-02, -4.4070e-02],
           ...,
           [ 1.9681e+00,  2.0169e+00,  2.0680e+00,  ..., -2.0286e-01,  1.0193e-01,  3.1608e-01],
           [ 1.9411e+00,  2.0085e+00,  2.1026e+00,  ..., -1.3970e-01,  1.2286e-01,  3.5735e-01],
           [ 1.8141e+00,  1.8327e+00,  1.9489e+00,  ..., -1.0404e-01,  1.8111e-01,  3.2454e-01]],
 
          [[ 2.0398e-01,  2.9756e-01,  3.7903e-01,  ..., -1.5909e-02,  4.7189e-02,  1.5181e-01],
           [ 5.4995e-01,  5.8114e-01,  6.0668e-01,  ...,  2.2921e-02,  2.9592e-02,  1.2454e-01],
           [ 6.2037e-01,  6.1136e-01,  6.1487e-01,  ...,  1.3991e-01,  7.2302e-02,  1.2691e-01],
           ...,
           [ 2.1161e+00,  2.1586e+00,  2.1592e+00,  ...,  1.0077e-01,  4.3020e-01,  5.8235e-01],
           [ 2.0806e+00,  2.1535e+00,  2.2194e+00,  ...,  1.1844e-01,  4.4620e-01,  5.9031e-01],
           [ 1.9537e+00,  1.9817e+00,  2.1045e+00,  ...,  1.3738e-01,  4.2917e-01,  6.0165e-01]],
 
          [[-3.0177e-02, -3.0919e-02,  5.6294e-02,  ...,  1.2119e-02,  2.9192e-01,  4.9523e-01],
           [ 1.4675e-01,  1.8120e-01,  2.0599e-01,  ...,  6.5189e-02,  2.1124e-01,  4.7340e-01],
           [ 2.2902e-01,  2.3191e-01,  2.1012e-01,  ...,  1.2057e-01,  1.4622e-01,  3.3338e-01],
           ...,
           [ 2.3455e+00,  2.3984e+00,  2.4285e+00,  ..., -2.0567e-01,  8.8979e-02,  1.8777e-01],
           [ 2.3240e+00,  2.3670e+00,  2.4654e+00,  ..., -1.8698e-01,  1.2802e-01,  2.0268e-01],
           [ 2.2811e+00,  2.2660e+00,  2.3926e+00,  ..., -1.4246e-01,  1.2407e-01,  2.1404e-01]]],
 
 
         [[[ 2.1948e+00,  2.1682e+00,  2.1729e+00,  ..., -8.0144e-02, -1.7157e-01, -2.2714e-01],
           [ 2.1642e+00,  2.1482e+00,  2.1615e+00,  ..., -2.8867e-01, -3.3114e-01, -4.2198e-01],
           [ 2.1637e+00,  2.1500e+00,  2.1554e+00,  ..., -5.3515e-01, -4.0755e-01, -3.9795e-01],
           ...,
           [ 1.0268e+00,  1.0389e+00,  1.0086e+00,  ...,  2.1637e+00,  2.1637e+00,  2.1637e+00],
           [ 1.0284e+00,  1.0010e+00,  1.0453e+00,  ...,  2.1637e+00,  2.1637e+00,  2.1637e+00],
           [ 1.0163e+00,  1.0296e+00,  1.0190e+00,  ...,  2.1637e+00,  2.1637e+00,  2.1637e+00]],
 
          [[ 2.3155e+00,  2.2723e+00,  2.2853e+00,  ...,  2.2712e-01,  2.1174e-01,  1.6661e-01],
           [ 2.2455e+00,  2.2432e+00,  2.2569e+00,  ...,  7.0896e-02,  4.8820e-02, -4.3276e-02],
           [ 2.2668e+00,  2.2566e+00,  2.2656e+00,  ..., -1.1109e-01, -5.4757e-02, -8.3705e-02],
           ...,
           [ 1.1080e+00,  1.0955e+00,  1.0469e+00,  ...,  2.2754e+00,  2.2754e+00,  2.2754e+00],
           [ 1.1009e+00,  1.0698e+00,  1.0872e+00,  ...,  2.2754e+00,  2.2754e+00,  2.2754e+00],
           [ 1.0910e+00,  1.1021e+00,  1.0618e+00,  ...,  2.2754e+00,  2.2754e+00,  2.2754e+00]],
 
          [[ 2.5272e+00,  2.4830e+00,  2.4523e+00,  ...,  8.5871e-01,  8.1094e-01,  7.8095e-01],
           [ 2.4560e+00,  2.4281e+00,  2.4305e+00,  ...,  5.5803e-01,  4.9335e-01,  3.9940e-01],
           [ 2.4629e+00,  2.4525e+00,  2.4602e+00,  ...,  1.7717e-01,  2.3084e-01,  2.2868e-01],
           ...,
           [ 1.1968e+00,  1.1985e+00,  1.1224e+00,  ...,  2.4712e+00,  2.4712e+00,  2.4712e+00],
           [ 1.2228e+00,  1.1839e+00,  1.1597e+00,  ...,  2.4712e+00,  2.4712e+00,  2.4712e+00],
           [ 1.2212e+00,  1.2520e+00,  1.1560e+00,  ...,  2.4712e+00,  2.4712e+00,  2.4712e+00]]],
 
 
         [[[ 2.2489e+00,  2.2489e+00,  2.2489e+00,  ...,  2.2312e+00,  2.2403e+00,  2.2489e+00],
           [ 2.2489e+00,  2.2489e+00,  2.2489e+00,  ...,  2.2320e+00,  2.2485e+00,  2.2489e+00],
           [ 2.2489e+00,  2.2489e+00,  2.2489e+00,  ...,  2.2291e+00,  2.2471e+00,  2.2489e+00],
           ...,
           [-1.7937e+00, -1.9148e+00, -1.9569e+00,  ...,  7.9287e-01,  6.7453e-01,  7.8103e-01],
           [-1.6935e+00, -1.8518e+00, -1.8703e+00,  ...,  4.8874e-01,  2.1611e-01,  1.1217e-01],
           [-1.6270e+00, -1.8958e+00, -1.8929e+00,  ...,  8.0896e-01,  8.9964e-01,  1.0060e+00]],
 
          [[ 2.4286e+00,  2.4286e+00,  2.4286e+00,  ...,  2.4109e+00,  2.4200e+00,  2.4286e+00],
           [ 2.4286e+00,  2.4286e+00,  2.4286e+00,  ...,  2.4113e+00,  2.4282e+00,  2.4286e+00],
           [ 2.4286e+00,  2.4286e+00,  2.4286e+00,  ...,  2.4083e+00,  2.4268e+00,  2.4286e+00],
           ...,
           [-1.4270e+00, -1.6502e+00, -1.6874e+00,  ...,  7.1551e-01,  4.9632e-01,  6.5896e-01],
           [-1.2436e+00, -1.4754e+00, -1.4859e+00,  ...,  2.8340e-01, -9.5874e-02, -1.2210e-01],
           [-1.1076e+00, -1.4404e+00, -1.4846e+00,  ...,  6.4517e-01,  7.3422e-01,  8.5824e-01]],
 
          [[ 2.6400e+00,  2.6400e+00,  2.6400e+00,  ...,  2.6223e+00,  2.6305e+00,  2.6147e+00],
           [ 2.6400e+00,  2.6400e+00,  2.6400e+00,  ...,  2.6228e+00,  2.6396e+00,  2.6398e+00],
           [ 2.6400e+00,  2.6400e+00,  2.6400e+00,  ...,  2.6198e+00,  2.6382e+00,  2.6400e+00],
           ...,
           [-1.0548e+00, -1.1392e+00, -1.2386e+00,  ...,  6.2691e-01,  3.3431e-01,  4.8703e-01],
           [-8.5461e-01, -8.5948e-01, -9.4681e-01,  ...,  2.0497e-01, -1.5078e-01, -2.6161e-01],
           [-9.2614e-01, -8.7474e-01, -8.2363e-01,  ...,  6.1918e-01,  7.6773e-01,  8.3740e-01]]]], device='cuda:0'),
 TensorCategory([25,  4, 27, 20, 12, 27, 31, 33, 14, 35, 16,  5, 22, 33,  3, 35,  3,  0, 32, 12,  1, 20, 18, 22, 15, 11, 13,  5, 35,  4, 22, 34, 15,  4,  3, 21,  5, 22, 27, 11, 15, 13, 14, 32, 13,  4,  7, 30,
          9, 20,  7, 20,  9,  1,  6, 35, 23,  8, 14, 16, 18,  6,  2, 35], device='cuda:0'))

Note: The cell above and the one below give the same result.
x,y = dls.one_batch()
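
A quick sanity check on the batch (the exact shapes depend on your batch size and transforms; 64 and 224 are just what the defaults above should give):

x.shape, y.shape
# expected something like: (torch.Size([64, 3, 224, 224]), torch.Size([64]))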

Understanding Labels

dls.vocab
['Abyssinian', 'Bengal', 'Birman', 'Bombay', 'British_Shorthair', 'Egyptian_Mau', 'Maine_Coon', 'Persian', 'Ragdoll', 'Russian_Blue', 'Siamese', 'Sphynx', 'american_bulldog', 'american_pit_bull_terrier', 'basset_hound', 'beagle', 'boxer', 'chihuahua', 'english_cocker_spaniel', 'english_setter', 'german_shorthaired', 'great_pyrenees', 'havanese', 'japanese_chin', 'keeshond', 'leonberger', 'miniature_pinscher', 'newfoundland', 'pomeranian', 'pug', 'saint_bernard', 'samoyed', 'scottish_terrier', 'shiba_inu', 'staffordshire_bull_terrier', 'wheaten_terrier', 'yorkshire_terrier']
dls.vocab[0]
'Abyssinian'

Tip: vocab gives us all the labels as text.

What's inside the tensors?

y
TensorCategory([13, 35,  8, 36,  3, 10, 10, 14, 22,  1,  5,  5,  5,  0,  4,  7, 11, 33, 18, 25, 20,  3, 33,  0, 25, 15, 27,  9, 17, 25, 19, 26,  9,  0, 35,  5,  6,  1, 31, 14,  7,  9,  8, 27,  2,  7, 21, 13,
        26, 17, 25, 30, 31,  5, 19, 17,  4, 12, 29,  8, 21, 33, 18,  9], device='cuda:0')

Note: Targets as coded.
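
To decode a few of these codes back into breed names (just an illustrative check using the vocab above):

[dls.vocab[int(i)] for i in y[:5]]
# should give: ['american_pit_bull_terrier', 'wheaten_terrier', 'Ragdoll', 'yorkshire_terrier', 'Bombay']
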
x
TensorImage([[[[-1.3790, -1.3778, -1.3984,  ..., -0.1093,  0.1460, -0.0339],
          [-1.3243, -1.3580, -1.3804,  ...,  0.0449,  0.0572, -0.0411],
          [-1.3337, -1.3652, -1.3996,  ..., -0.0918, -0.1107, -0.0857],
          ...,
          [-0.4574, -0.3503, -0.3927,  ..., -0.6010, -0.7011, -0.7119],
          [-0.3509, -0.1960, -0.2069,  ..., -0.6884, -0.6634, -0.6341],
          [-0.3221, -0.3299, -0.3177,  ..., -0.5625, -0.4453, -0.4082]],

         [[-1.6744, -1.6758, -1.6812,  ..., -0.5800, -0.2989, -0.4613],
          [-1.5871, -1.6285, -1.6568,  ..., -0.3977, -0.3745, -0.4682],
          [-1.5626, -1.6162, -1.6632,  ..., -0.5169, -0.5306, -0.5112],
          ...,
          [-1.0813, -0.9612, -0.9992,  ..., -1.1308, -1.2921, -1.3514],
          [-0.9441, -0.7857, -0.7948,  ..., -1.2437, -1.2741, -1.3164],
          [-0.9350, -0.9371, -0.9192,  ..., -1.1914, -1.1397, -1.1095]],

         [[-1.7511, -1.7434, -1.7629,  ..., -0.6031, -0.3360, -0.5057],
          [-1.6791, -1.7414, -1.7652,  ..., -0.4428, -0.4310, -0.5192],
          [-1.6424, -1.7149, -1.7630,  ..., -0.5840, -0.6148, -0.5747],
          ...,
          [-1.6312, -1.4844, -1.5600,  ..., -1.3854, -1.6552, -1.7876],
          [-1.4652, -1.2946, -1.3208,  ..., -1.5283, -1.6561, -1.7149],
          [-1.4120, -1.4189, -1.4288,  ..., -1.5720, -1.5691, -1.5458]]],


        [[[-1.1709, -1.0320, -0.3882,  ..., -2.0315, -2.0706, -2.0406],
          [-0.7207, -1.2576, -0.8119,  ..., -2.0559, -2.0684, -2.0728],
          [-0.2858, -0.7315, -1.1736,  ..., -2.0433, -2.0766, -2.0962],
          ...,
          [-0.6050, -0.6222, -0.7002,  ...,  0.0488,  0.0771,  0.0815],
          [-0.6429, -0.6763, -0.7053,  ...,  0.1518, -0.0409,  0.1402],
          [-0.6518, -0.7125, -0.7378,  ...,  0.1249,  0.0496,  0.1191]],

         [[-0.8777, -0.6010,  0.1436,  ..., -1.6740, -1.8200, -1.7518],
          [-0.3933, -0.8562, -0.2790,  ..., -1.7451, -1.8429, -1.8417],
          [ 0.0344, -0.4096, -0.7499,  ..., -1.8033, -1.8966, -1.8918],
          ...,
          [-0.6012, -0.6491, -0.6763,  ...,  0.2168,  0.2401,  0.2356],
          [-0.6647, -0.7301, -0.6992,  ...,  0.3333,  0.1283,  0.3030],
          [-0.6832, -0.7672, -0.7454,  ...,  0.2804,  0.2115,  0.2686]],

         [[-0.6769, -0.6395,  0.1047,  ..., -1.7038, -1.7277, -1.7018],
          [-0.1480, -0.7290, -0.2868,  ..., -1.7481, -1.7221, -1.7096],
          [ 0.2469, -0.1654, -0.6479,  ..., -1.7517, -1.7447, -1.7343],
          ...,
          [-0.5935, -0.7393, -0.8030,  ...,  0.4358,  0.4557,  0.3989],
          [-0.6801, -0.8271, -0.8223,  ...,  0.5317,  0.3682,  0.4378],
          [-0.7225, -0.8686, -0.8416,  ...,  0.4712,  0.3960,  0.3788]]],


        [[[ 0.4054,  0.4157,  0.4160,  ...,  0.4003,  0.3259,  0.1861],
          [ 0.4487,  0.4643,  0.4645,  ...,  0.3877,  0.2998,  0.1810],
          [ 0.4725,  0.4946,  0.4947,  ...,  0.3752,  0.2639,  0.1651],
          ...,
          [-0.5970, -0.4684, -0.6175,  ...,  1.5827,  1.6680,  1.6609],
          [-0.5992, -0.5694, -0.4757,  ...,  1.2104,  1.3103,  1.3862],
          [-0.6427, -0.7463, -0.7177,  ...,  0.8878,  0.8792,  1.0099]],

         [[ 0.9298,  0.9404,  0.9407,  ...,  0.9246,  0.8562,  0.7873],
          [ 0.9741,  0.9901,  0.9902,  ...,  0.9117,  0.8413,  0.7847],
          [ 0.9984,  1.0209,  1.0210,  ...,  0.8998,  0.8208,  0.7683],
          ...,
          [-0.1210,  0.0165, -0.1429,  ...,  1.7798,  1.8600,  1.8469],
          [-0.1235, -0.0915,  0.0087,  ...,  1.4962,  1.5836,  1.6497],
          [-0.1703, -0.2819, -0.2510,  ...,  1.2888,  1.2676,  1.3861]],

         [[ 1.4446,  1.4550,  1.4553,  ...,  1.4395,  1.4119,  1.4488],
          [ 1.4881,  1.5037,  1.5038,  ...,  1.4270,  1.4177,  1.4495],
          [ 1.5118,  1.5338,  1.5339,  ...,  1.4189,  1.4155,  1.4334],
          ...,
          [ 0.3907,  0.5309,  0.3682,  ...,  2.0347,  2.1100,  2.0935],
          [ 0.3881,  0.4208,  0.5230,  ...,  1.8240,  1.9038,  1.9552],
          [ 0.3403,  0.2258,  0.2576,  ...,  1.6937,  1.6618,  1.7696]]],


        ...,


        [[[-0.7000, -0.6986, -0.7306,  ..., -1.6313, -1.7078, -1.6480],
          [-0.6932, -0.6908, -0.7237,  ..., -1.5230, -1.6776, -1.6388],
          [-0.6817, -0.6618, -0.6940,  ..., -1.2779, -1.5299, -1.5905],
          ...,
          [ 0.7912,  0.9871,  0.9476,  ...,  0.7637,  0.8491,  0.8404],
          [ 0.6435,  0.6536,  0.5389,  ...,  0.2268,  0.2840,  0.7647],
          [ 0.2871,  0.1747,  0.0099,  ...,  0.1704,  0.2587,  0.7739]],

         [[-0.5360, -0.5255, -0.5556,  ..., -1.7426, -1.8394, -1.8137],
          [-0.5545, -0.5179, -0.5420,  ..., -1.6711, -1.7999, -1.8003],
          [-0.5629, -0.4964, -0.5208,  ..., -1.4790, -1.7004, -1.7559],
          ...,
          [ 0.8302,  1.0338,  0.9972,  ...,  0.5573,  0.6361,  0.5920],
          [ 0.6521,  0.6515,  0.5315,  ..., -0.0175,  0.0396,  0.5232],
          [ 0.2739,  0.1468, -0.0415,  ..., -0.0737,  0.0125,  0.5333]],

         [[-0.1682, -0.1622, -0.1925,  ..., -1.7371, -1.7819, -1.7964],
          [-0.1791, -0.1565, -0.1929,  ..., -1.7217, -1.7671, -1.7829],
          [-0.1814, -0.1405, -0.1879,  ..., -1.6271, -1.7128, -1.7439],
          ...,
          [ 0.7712,  1.0543,  1.0645,  ...,  0.3954,  0.4138,  0.3209],
          [ 0.5921,  0.6752,  0.5766,  ..., -0.2356, -0.1997,  0.2561],
          [ 0.2282,  0.1419, -0.0471,  ..., -0.3060, -0.2291,  0.2662]]],


        [[[-1.7039, -1.5743, -0.7309,  ..., -1.4899, -1.5223, -1.7014],
          [-1.6543, -1.3360, -0.7863,  ..., -1.4862, -1.4992, -1.6543],
          [-1.4747, -0.9363, -0.9740,  ..., -1.4786, -1.7038, -1.7715],
          ...,
          [-1.0359, -0.9016, -0.9339,  ...,  1.1125,  1.1213,  0.7437],
          [-0.9960, -1.1363, -1.0869,  ...,  0.8008,  1.0108,  0.9147],
          [-1.1167, -1.2009, -1.1964,  ...,  0.6893,  1.3224,  0.2577]],

         [[-1.5970, -1.4079, -0.4739,  ..., -1.2143, -1.3011, -1.4974],
          [-1.5914, -1.1409, -0.5364,  ..., -1.1936, -1.2596, -1.4214],
          [-1.4410, -0.7377, -0.7863,  ..., -1.1843, -1.4621, -1.5348],
          ...,
          [-0.6342, -0.4864, -0.5238,  ...,  1.4785,  1.5627,  1.1883],
          [-0.6105, -0.7573, -0.7006,  ...,  1.0617,  1.3020,  1.2896],
          [-0.7374, -0.8229, -0.8292,  ...,  0.9343,  1.6271,  0.5592]],

         [[-1.7608, -1.7676, -1.6709,  ..., -1.7209, -1.7915, -1.7875],
          [-1.7147, -1.7388, -1.6955,  ..., -1.7866, -1.7723, -1.7779],
          [-1.6931, -1.5459, -1.6110,  ..., -1.7689, -1.7772, -1.7837],
          ...,
          [-1.5969, -1.4913, -1.5156,  ..., -0.6912, -0.3672, -0.7122],
          [-1.5018, -1.6808, -1.6485,  ..., -0.5699, -0.2657, -0.5272],
          [-1.5775, -1.6876, -1.6868,  ..., -0.0425,  0.4928, -0.8241]]],


        [[[ 1.4877,  1.4668,  1.5048,  ...,  2.0390,  2.0327,  2.0266],
          [ 1.5198,  1.4866,  1.5066,  ...,  2.0343,  2.0305,  2.0289],
          [ 1.5237,  1.4689,  1.5328,  ...,  2.0266,  2.0217,  2.0152],
          ...,
          [ 1.4687,  1.8343,  1.9416,  ..., -1.3055, -1.2876, -1.3210],
          [ 1.8619,  1.9001,  1.8640,  ..., -1.1670, -1.3076, -1.3698],
          [ 1.8915,  1.8637,  1.8954,  ..., -1.1410, -1.3742, -1.3414]],

         [[ 1.8535,  1.8615,  1.8702,  ...,  2.2140,  2.2075,  2.2014],
          [ 1.8948,  1.9005,  1.8962,  ...,  2.2092,  2.2053,  2.2037],
          [ 1.8926,  1.8579,  1.8944,  ...,  2.2014,  2.2056,  2.2128],
          ...,
          [ 1.8888,  2.1500,  2.2057,  ..., -1.2930, -1.2397, -1.3169],
          [ 2.2001,  2.2026,  2.1672,  ..., -1.2065, -1.2288, -1.3877],
          [ 2.1885,  2.1623,  2.2023,  ..., -1.2037, -1.3027, -1.3650]],

         [[ 2.0973,  2.0843,  2.1081,  ...,  2.4264,  2.4200,  2.4138],
          [ 2.1341,  2.1351,  2.1528,  ...,  2.4216,  2.4177,  2.4161],
          [ 2.1146,  2.0894,  2.1310,  ...,  2.4138,  2.4146,  2.4143],
          ...,
          [ 2.3444,  2.4653,  2.4575,  ..., -1.1536, -1.0986, -1.1590],
          [ 2.5056,  2.5002,  2.4774,  ..., -1.0442, -1.1040, -1.1783],
          [ 2.5015,  2.5031,  2.5432,  ..., -1.0341, -1.1608, -1.1347]]]], device='cuda:0')

Note: Our stacked image tensor.

Predictions of the baseline model.

preds,_ = learn.get_preds(dl=[(x,y)])
preds[0]
tensor([1.4670e-06, 1.2070e-06, 8.4748e-07, 1.6964e-07, 7.0972e-06, 9.3213e-07, 1.9146e-06, 4.0787e-07, 1.3208e-06, 1.8394e-06, 1.8446e-08, 2.0282e-05, 9.3669e-04, 9.9753e-01, 4.6090e-06, 4.6171e-05,
        8.3924e-05, 4.4448e-04, 3.7151e-07, 7.7943e-07, 6.8438e-06, 7.1965e-07, 2.7995e-07, 1.9403e-06, 1.0657e-06, 7.8017e-07, 1.8254e-05, 5.4245e-06, 4.5678e-06, 8.7494e-07, 3.8811e-06, 1.2178e-06,
        6.4576e-07, 1.8837e-05, 8.5143e-04, 1.4807e-06, 1.7899e-06])

Note: The result for the first item; the values add up to one. There are 37 outputs for the 37 image categories, and each value is the predicted probability of that category.
_
TensorCategory([13, 35,  8, 36,  3, 10, 10, 14, 22,  1,  5,  5,  5,  0,  4,  7, 11, 33, 18, 25, 20,  3, 33,  0, 25, 15, 27,  9, 17, 25, 19, 26,  9,  0, 35,  5,  6,  1, 31, 14,  7,  9,  8, 27,  2,  7, 21, 13,
        26, 17, 25, 30, 31,  5, 19, 17,  4, 12, 29,  8, 21, 33, 18,  9])

Note: Category codes
len(preds[0]),preds[0].sum()
(37, tensor(1.0000))

Predictions for 37 categories that add up to one.
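
To turn the probabilities into an actual label, we can take the argmax and look it up in the vocab (a small illustrative step; here the largest probability is at index 13, which matches the first target):

pred_idx = preds[0].argmax()
dls.vocab[int(pred_idx)]   # 'american_pit_bull_terrier'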


FUNCTION FOR CLASSIFYING MORE THAN TWO CATEGORIES

For classifying more than two categories, we need to employ a new function. It is not totally different from sigmoid; in fact, it starts with the sigmoid function.

plot_function(torch.sigmoid, min=-4,max=4)
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastbook/__init__.py:73: UserWarning: Not providing a value for linspace's steps is deprecated and will throw a runtime error in a future release. This warning will appear only once per process. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/RangeFactories.cpp:25.)
  x = torch.linspace(min,max)

Note: This is how torch.sigmoid squishes values between 0 and 1.
torch.random.manual_seed(42);
acts = torch.randn((6,2))*2
acts
tensor([[ 0.6734,  0.2576],
        [ 0.4689,  0.4607],
        [-2.2457, -0.3727],
        [ 4.4164, -1.2760],
        [ 0.9233,  0.5347],
        [ 1.0698,  1.6187]])

Note: These random numbers represent the binary outputs of a hypothetical network (with a standard deviation of 2). The first column represents 3s and the second 7s. They roughly show how confident the model is about each prediction.
acts.sigmoid()
tensor([[0.6623, 0.5641],
        [0.6151, 0.6132],
        [0.0957, 0.4079],
        [0.9881, 0.2182],
        [0.7157, 0.6306],
        [0.7446, 0.8346]])

Note: If we apply the sigmoid, the results become like this (above). Obviously they don't add up to one. These are relative confidences over the inputs. For example, the first row says: it's a three. But what is the probability? It is not clear.
(acts[:,0]-acts[:,1]).sigmoid()
tensor([0.6025, 0.5021, 0.1332, 0.9966, 0.5959, 0.3661])

Note: If we take the difference between these relative confidences, the results become like the ones above. Now we can say that for the first item the model is 0.6025 (60.25%) confident.

This part is a bit different in the lesson video, so check the video at 1:35:20.

sm_acts = torch.softmax(acts, dim=1)
sm_acts
tensor([[0.6025, 0.3975],
        [0.5021, 0.4979],
        [0.1332, 0.8668],
        [0.9966, 0.0034],
        [0.5959, 0.4041],
        [0.3661, 0.6339]])

Note: torch.softmax does that in one step. Now the results in each row add up to one, and the first column is identical to the sigmoid-of-the-difference result above.
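
A small illustrative check that, for two classes, softmax really is the sigmoid of the difference we computed earlier: column 0 of sm_acts should match it.

torch.allclose(sm_acts[:, 0], (acts[:, 0] - acts[:, 1]).sigmoid())
# expected: True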

Log Likelihood

targ = tensor([0,1,0,1,1,0])

These are our softmax activations:

sm_acts
tensor([[0.6025, 0.3975],
        [0.5021, 0.4979],
        [0.1332, 0.8668],
        [0.9966, 0.0034],
        [0.5959, 0.4041],
        [0.3661, 0.6339]])
idx = range(6)
sm_acts[idx, targ]
tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])

Note: A nice indexing trick for getting the confidence assigned to the target class of each item.
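
The same selection can be written with gather, which is essentially what the fancy indexing does (shown only as an equivalent sketch):

sm_acts.gather(1, targ.unsqueeze(1)).squeeze(1)
# tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])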

Let's see everything in a table:

from IPython.display import HTML
df = pd.DataFrame(sm_acts, columns=["3","7"])
df['targ'] = targ
df['idx'] = idx
df['loss'] = sm_acts[range(6), targ]
t = df.style.hide_index()
#To have html code compatible with our script
html = t._repr_html_().split('</style>')[1]
html = re.sub(r'<table id="([^"]+)"\s*>', r'<table >', html)
display(HTML(html))
3          7          targ  idx  loss
0.602469   0.397531   0     0    0.602469
0.502065   0.497935   1     1    0.497935
0.133188   0.866811   0     2    0.133188
0.996640   0.003360   1     3    0.003360
0.595949   0.404051   1     4    0.404051
0.366118   0.633882   0     5    0.366118

Warning: I think the last column label is wrong here. It should be the confidence (the selected softmax activation) instead of the loss.
-sm_acts[idx, targ]
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])

Warning: There is a caveat here. These are the negatives of our confidence levels, not the loss yet.

The PyTorch way of doing the same:

F.nll_loss(sm_acts, targ, reduction='none')
tensor([-0.6025, -0.4979, -0.1332, -0.0034, -0.4041, -0.3661])

Note: Anyway, the numbers are still not right; that will be addressed in the Taking the Log section below. The reason is that F.nll_loss (negative log likelihood loss) expects inputs to which the log has already been applied; only then does the calculation give the correct loss.

Taking the Log

Note: Directly from the book:

Important: Confusing Name, Beware: The nll in nll_loss stands for "negative log likelihood," but it doesn’t actually take the log at all! It assumes you have already taken the log. PyTorch has a function called log_softmax that combines log and softmax in a fast and accurate way. nll_loss is designed to be used after log_softmax.

When we first take the softmax, and then the log likelihood of that, that combination is called cross-entropy loss. In PyTorch, this is available as nn.CrossEntropyLoss (which, in practice, actually does log_softmax and then nll_loss):

PyTorch's CrossEntropyLoss:

loss_func = nn.CrossEntropyLoss()
loss_func(acts, targ)
tensor(1.8045)

or:

F.cross_entropy(acts, targ)
tensor(1.8045)

Note: This is the mean of all the per-item losses.

And these are all of the per-item results, without taking the mean:

nn.CrossEntropyLoss(reduction='none')(acts, targ)
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])

Note: The results above are the cross-entropy loss for each image in the list (of course, our current numbers are made up).
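
As a quick illustrative check, the scalar loss shown earlier is just the mean of these per-item losses:

nn.CrossEntropyLoss(reduction='none')(acts, targ).mean()
# expected: tensor(1.8045)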

Manual calculation log_softmax + nll_loss

First log_softmax:

log_sm_acts = torch.log_softmax(acts, dim=1)
log_sm_acts
tensor([[-5.0672e-01, -9.2248e-01],
        [-6.8903e-01, -6.9729e-01],
        [-2.0160e+00, -1.4293e-01],
        [-3.3658e-03, -5.6958e+00],
        [-5.1760e-01, -9.0621e-01],
        [-1.0048e+00, -4.5589e-01]])

Then negative log likelihood:

F.nll_loss(log_sm_acts, targ, reduction='none')
tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])

Note: Results are identical
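
And a fully manual version, just to close the loop: taking the negative log of the selected softmax probabilities gives the same numbers.

-torch.log(sm_acts[idx, targ])
# tensor([0.5067, 0.6973, 2.0160, 5.6958, 0.9062, 1.0048])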

REVISITING THE BASELINE MODEL (Model Interpretation)

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=5)
[('american_pit_bull_terrier', 'staffordshire_bull_terrier', 8),
 ('Ragdoll', 'Birman', 7),
 ('Egyptian_Mau', 'Bengal', 5)]

This is our baseline; we can start improving from this point.


IMPROVING THE MODEL

Fine Tune

Fine tune the model with default arguments:

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1, base_lr=0.1)
epoch train_loss valid_loss error_rate time
0 2.588707 4.300000 0.445873 00:21
epoch train_loss valid_loss error_rate time
0 3.385068 2.263443 0.510825 00:26

Note: This is where we overshoot. Our loss just increases over the second epoch. Is there a better way to find a learning rate?

Learning Rate Finder

learn = cnn_learner(dls, resnet34, metrics=error_rate)
suggested_lr= learn.lr_find()
/home/niyazi/anaconda3/envs/fastbook/lib/python3.8/site-packages/fastai/callback/schedule.py:270: UserWarning: color is redundantly defined by the 'color' keyword argument and the fmt string "ro" (-> color='r'). The keyword argument will take precedence.
  ax.plot(val, idx, 'ro', label=nm, c=color)

Warning: There is a discrepancy between the lesson and reading-group notebooks. In the book we get two values from the function, but in the reading group only one. I think there was an update to this function that is not reflected in the book.
suggested_lr
SuggestedLRs(valley=tensor(0.0008))
print(f"suggested: {suggested_lr.valley:.2e}")
suggested: 8.32e-04
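
If you want more than one suggestion, newer fastai versions (as far as I know) let you pass suggest_funcs to lr_find; treat this as an assumption and check the docs of your installed version:

from fastai.callback.schedule import minimum, steep, valley, slide
suggestions = learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))
suggestions.valley, suggestions.steep
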
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2, base_lr=8.32e-04)
epoch train_loss valid_loss error_rate time
0 2.203637 0.456601 0.139378 00:21
epoch train_loss valid_loss error_rate time
0 0.631289 0.287444 0.087280 00:26
1 0.423191 0.263927 0.085250 00:26

This time the loss decreases steadily.

What's under the hood of fine_tune

When we create a model from a pretrained network fastai automatically freezes all of the pretrained layers for us. When we call the fine_tune method fastai does two things:

  • Trains the randomly added layers for one epoch, with all other layers frozen
  • Unfreezes all of the layers, and trains them all for the number of epochs requested

Let's do it manually

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)
epoch train_loss valid_loss error_rate time
0 1.806578 0.363257 0.114344 00:21
1 0.697060 0.258624 0.083221 00:22
2 0.449906 0.254586 0.087957 00:21
learn.unfreeze()

Run the lr_find again, because having more layers to train, and weights that have already been trained for three epochs, means our previously found learning rate isn't appropriate any more:

learn.lr_find()
SuggestedLRs(valley=tensor(0.0001))

Train again with the new lr.

learn.fit_one_cycle(6, lr_max=0.0001)
epoch train_loss valid_loss error_rate time
0 0.369805 0.265072 0.085250 00:26
1 0.379721 0.352767 0.112314 00:26
2 0.320787 0.257370 0.075778 00:26
3 0.198347 0.217450 0.066306 00:27
4 0.143628 0.217090 0.066306 00:26
5 0.111457 0.216973 0.066306 00:27

So far so good, but there is further to go.


Discriminative Learning Rates

Basically, we use varying learning rates across the model: a bigger rate for the later layers and a smaller one for the earlier layers.

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 8.32e-04)# first lr
learn.unfreeze()
learn.fit_one_cycle(12, lr_max=slice(0.00005,0.0005))#second lr with a range
epoch train_loss valid_loss error_rate time
0 1.783345 0.370482 0.119080 00:22
1 0.700986 0.293102 0.096076 00:22
2 0.448751 0.262937 0.093369 00:22
epoch train_loss valid_loss error_rate time
0 0.390943 0.245929 0.079838 00:28
1 0.356807 0.281976 0.088633 00:27
2 0.344888 0.417350 0.117727 00:27
3 0.267143 0.284152 0.081867 00:27
4 0.217775 0.330306 0.092693 00:28
5 0.172308 0.310047 0.081191 00:27
6 0.122903 0.299161 0.079161 00:27
7 0.099924 0.262270 0.074425 00:27
8 0.059424 0.278250 0.074425 00:27
9 0.045987 0.253283 0.067659 00:27
10 0.036630 0.251685 0.068336 00:27
11 0.034524 0.254469 0.067659 00:27

It is better most of the time. (Sometimes I don't get good results and need to choose the slice values more carefully.)

learn.recorder.plot_loss()

Note: Directly from the book:

As you can see, the training loss keeps getting better and better. But notice that eventually the validation loss improvement slows, and sometimes even gets worse! This is the point at which the model is starting to over fit. In particular, the model is becoming overconfident of its predictions. But this does not mean that it is getting less accurate, necessarily. Take a look at the table of training results per epoch, and you will often see that the accuracy continues improving, even as the validation loss gets worse. In the end what matters is your accuracy, or more generally your chosen metrics, not the loss. The loss is just the function we've given the computer to help us to optimize.

Important: I need to think about how the loss can increase while the accuracy still gets better.
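
A tiny made-up example of how that can happen: when the model gets overconfident, a single wrong prediction can blow up the log-based loss while the argmax predictions, and therefore the accuracy, do not get any worse (all numbers below are invented; torch is already imported via fastai above):

targ2 = torch.tensor([0, 0])
probs_a = torch.tensor([[0.70, 0.30],    # epoch A: modest confidence, 2nd item wrong
                        [0.40, 0.60]])
probs_b = torch.tensor([[0.95, 0.05],    # epoch B: more confident, still only 2nd item wrong
                        [0.05, 0.95]])
def mean_nll(probs, targ): return -torch.log(probs[range(len(targ)), targ]).mean()
(probs_a.argmax(1) == targ2).float().mean(), (probs_b.argmax(1) == targ2).float().mean()
# accuracy: (tensor(0.5000), tensor(0.5000))  -- unchanged
mean_nll(probs_a, targ2), mean_nll(probs_b, targ2)
# loss: roughly (0.64, 1.52)  -- much worse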

Deeper Architectures

In general, a bigger model has the ability to better capture the real underlying relationships in your data, and also to capture and memorize the specific details of your individual images. However, using a deeper model is going to require more GPU RAM, so you may need to lower the size of your batches to avoid an out-of-memory error. This happens when you try to fit too much inside your GPU and looks like:

Cuda runtime error: out of memory

You may have to restart your notebook when this happens. The way to solve it is to use a smaller batch size, which means passing smaller groups of images at any given time through your model. You can pass the batch size you want to the call creating your DataLoaders with bs=.
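
For example (bs=32 is just an illustrative value), the batch size can be passed when building the DataLoaders:

dls_small = pets.dataloaders(path/"images", bs=32)   # smaller batches -> less GPU memory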

The other downside of deeper architectures is that they take quite a bit longer to train. One technique that can speed things up a lot is mixed-precision training. This refers to using less-precise numbers (half-precision floating point, also called fp16) where possible during training. As we are writing these words in early 2020, nearly all current NVIDIA GPUs support a special feature called tensor cores that can dramatically speed up neural network training, by 2-3x. They also require a lot less GPU memory. To enable this feature in fastai, just add to_fp16() after your Learner creation (you also need to import the module).

You can't really know ahead of time what the best architecture for your particular problem is—you need to try training some. So let's try a ResNet-50 now with mixed precision:

from fastai.callback.fp16 import *
learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fine_tune(12, freeze_epochs=3)
epoch train_loss valid_loss error_rate time
0 1.209030 0.308840 0.097429 00:20
1 0.562807 0.326714 0.100812 00:21
2 0.396488 0.263611 0.089310 00:21
epoch train_loss valid_loss error_rate time
0 0.255827 0.262954 0.080514 00:24
1 0.215601 0.256829 0.072395 00:24
2 0.238660 0.392900 0.099459 00:23
3 0.246021 0.409503 0.107578 00:24
4 0.196632 0.448040 0.106225 00:23
5 0.137433 0.353745 0.091340 00:23
6 0.108764 0.333932 0.085250 00:24
7 0.078872 0.295772 0.081867 00:24
8 0.055900 0.273311 0.073072 00:24
9 0.040353 0.274645 0.070365 00:24
10 0.020883 0.260611 0.070365 00:24
11 0.021018 0.259633 0.066982 00:24
learn.recorder.plot_loss()

As seen above, the training time has not changed much.