%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai import *
from fastai.vision import *
The planet dataset isn't available on the fastai dataset page due to copyright restrictions, but you can download it from Kaggle. Let's see how to do this using the Kaggle API, as it will be useful to you if you want to join a competition or use other Kaggle datasets later on.
First, install the Kaggle API by uncommenting the following line and executing it, or by running it in your terminal. Depending on your platform, you may need to modify this slightly: either run `source activate fastai` (or similar) first, or prefix `pip` with a path. Have a look at how `conda install` is called for your platform in the appropriate *Returning to work* section of https://course-v3.fast.ai/. (Depending on your environment, you may also need to append `--user` to the command.)
! pip install kaggle --upgrade
Collecting kaggle Downloading https://files.pythonhosted.org/packages/9e/94/5370052b9cbc63a927bda08c4f7473a35d3bb27cc071baa1a83b7f783352/kaggle-1.5.1.1.tar.gz (53kB) 100% |████████████████████████████████| 61kB 2.6MB/s ta 0:00:01 Collecting urllib3<1.23.0,>=1.15 (from kaggle) Downloading https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl (132kB) 100% |████████████████████████████████| 133kB 8.3MB/s ta 0:00:01 Requirement not upgraded as not directly required: six>=1.10 in /home/cedric/anaconda3/lib/python3.7/site-packages (from kaggle) (1.11.0) Requirement not upgraded as not directly required: certifi in /home/cedric/anaconda3/lib/python3.7/site-packages (from kaggle) (2018.8.24) Requirement not upgraded as not directly required: python-dateutil in /home/cedric/anaconda3/lib/python3.7/site-packages (from kaggle) (2.7.3) Requirement not upgraded as not directly required: requests in /home/cedric/anaconda3/lib/python3.7/site-packages (from kaggle) (2.19.1) Requirement not upgraded as not directly required: tqdm in /home/cedric/anaconda3/lib/python3.7/site-packages (from kaggle) (4.26.0) Collecting python-slugify (from kaggle) Downloading https://files.pythonhosted.org/packages/00/ad/c778a6df614b6217c30fe80045b365bfa08b5dd3cb02e8b37a6d25126781/python-slugify-1.2.6.tar.gz Requirement not upgraded as not directly required: chardet<3.1.0,>=3.0.2 in /home/cedric/anaconda3/lib/python3.7/site-packages (from requests->kaggle) (3.0.4) Requirement not upgraded as not directly required: idna<2.8,>=2.5 in /home/cedric/anaconda3/lib/python3.7/site-packages (from requests->kaggle) (2.7) Collecting Unidecode>=0.04.16 (from python-slugify->kaggle) Downloading https://files.pythonhosted.org/packages/31/39/53096f9217b057cb049fe872b7fc7ce799a1a89b76cf917d9639e7a558b5/Unidecode-1.0.23-py2.py3-none-any.whl (237kB) 100% |████████████████████████████████| 245kB 35.5MB/s ta 0:00:01 Building wheels for 
collected packages: kaggle, python-slugify Running setup.py bdist_wheel for kaggle ... done Stored in directory: /home/cedric/.cache/pip/wheels/5a/2d/0c/9fc539e558586b9ed9127916a7f4e620163c24cc97460b1188 Running setup.py bdist_wheel for python-slugify ... done Stored in directory: /home/cedric/.cache/pip/wheels/e3/65/da/2045deea3098ed7471eca0e2460cfbd3fdfe8c1d6fa6fcac92 Successfully built kaggle python-slugify twisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed. Installing collected packages: urllib3, Unidecode, python-slugify, kaggle Found existing installation: urllib3 1.23 Uninstalling urllib3-1.23: Successfully uninstalled urllib3-1.23 Successfully installed Unidecode-1.0.23 kaggle-1.5.1.1 python-slugify-1.2.6 urllib3-1.22 You are using pip version 10.0.1, however version 18.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command.
Then you need to upload your credentials from Kaggle to your instance. Log in to Kaggle and click on your profile picture in the top right corner, then 'My Account'. Scroll down until you find a button named 'Create New API Token' and click on it. This will trigger the download of a file named 'kaggle.json'.
Upload this file to the directory this notebook is running in, by clicking "Upload" on your main Jupyter page, then uncomment and execute the next two commands (or run them in a terminal).
! mkdir -p ~/.kaggle/
! mv kaggle.json ~/.kaggle/
You're all set to download the data from the planet competition. First go to its main page and accept its rules, then run the two cells below (uncomment the shell commands to download and unzip the data). If you get a `403 Forbidden` error, it means you haven't accepted the competition rules yet (go to the competition page, click on the *Rules* tab, and then scroll to the bottom to find the accept button).
path = Config.data_path()/'planet'
path.mkdir(parents=True, exist_ok=True)
path
PosixPath('/home/cedric/.fastai/data/planet')
! kaggle --version
Kaggle API 1.5.1.1
! kaggle competitions download -c planet-understanding-the-amazon-from-space -f train-jpg.tar.7z -p {path}
! kaggle competitions download -c planet-understanding-the-amazon-from-space -f train_v2.csv -p {path}
Downloading train-jpg.tar.7z to /home/cedric/.fastai/data/planet 100%|█████████████████████████████████████████| 600M/600M [00:04<00:00, 140MB/s] Downloading train_v2.csv.zip to /home/cedric/.fastai/data/planet 0%| | 0.00/159k [00:00<?, ?B/s] 100%|████████████████████████████████████████| 159k/159k [00:00<00:00, 70.7MB/s]
! unzip -q -n {path}/train_v2.csv.zip -d {path}
To extract the content of this file, we'll need 7zip, so uncomment the following line if you need to install it (or run `sudo apt install p7zip` in your terminal).
! conda install -y -c haasad eidl7zip
Solving environment: done ## Package Plan ## environment location: /home/cedric/anaconda3 added / updated specs: - eidl7zip The following packages will be downloaded: package | build ---------------------------|----------------- eidl7zip-1.0.0 | 1 565 KB haasad The following NEW packages will be INSTALLED: eidl7zip: 1.0.0-1 haasad The following packages will be UPDATED: certifi: 2018.8.24-py37_1 --> 2018.10.15-py37_0 Downloading and Extracting Packages eidl7zip-1.0.0 | 565 KB | ##################################### | 100% Preparing transaction: done Verifying transaction: done Executing transaction: done
And now we can unpack the data (uncomment to run - this might take a few minutes to complete).
! 7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path}
!ls {path}/train-jpg | head -n10
train_0.jpg train_1.jpg train_10.jpg train_100.jpg train_1000.jpg train_10000.jpg train_10001.jpg train_10002.jpg train_10003.jpg train_10004.jpg ls: write error: Broken pipe
Unlike the pets dataset studied in the last lesson, here each picture can have multiple labels. If we take a look at the CSV file containing the labels (`train_v2.csv` here), we see that each `image_name` is associated with several tags separated by spaces.
df = pd.read_csv(path/'train_v2.csv')
df.head()
| | image_name | tags |
|---|---|---|
| 0 | train_0 | haze primary |
| 1 | train_1 | agriculture clear primary water |
| 2 | train_2 | clear primary |
| 3 | train_3 | clear primary |
| 4 | train_4 | agriculture clear habitation primary road |
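For a quick sanity check on what's in the tags column, you can count tag frequencies yourself with plain Python; the rows below are a hypothetical sample mirroring the structure of `train_v2.csv`:

```python
from collections import Counter

# Hypothetical rows mirroring the (image_name, tags) structure of train_v2.csv
rows = [
    ('train_0', 'haze primary'),
    ('train_1', 'agriculture clear primary water'),
    ('train_2', 'clear primary'),
]

# Each tags field is space-separated; split and count occurrences
counts = Counter(tag for _, tags in rows for tag in tags.split())
print(counts.most_common(2))  # → [('primary', 3), ('clear', 2)]
```

On the real dataset this kind of count reveals that a few tags (like `primary` and `clear`) dominate while others are rare.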
To put this in a `DataBunch` while using the data block API, we then need to use `ImageMultiDataset` (and not `ImageClassificationDataset`). This will make sure the model created has the proper loss function to deal with the multiple classes.
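Under the hood, "proper loss function" here means a per-class sigmoid with binary cross-entropy, rather than a softmax over classes: each tag is predicted independently. A minimal sketch of the idea in plain Python (my own illustration, not fastai's code; the class list is an illustrative subset of the planet tags):

```python
import math

classes = ['agriculture', 'clear', 'haze', 'primary', 'water']  # illustrative subset

def encode(tags, classes):
    """One-hot encode a space-separated tag string into a binary target vector."""
    present = set(tags.split())
    return [1.0 if c in present else 0.0 for c in classes]

def bce(logits, targets):
    """Binary cross-entropy with logits, averaged over classes."""
    probs = [1 / (1 + math.exp(-z)) for z in logits]
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)

y = encode('haze primary', classes)
print(y)  # → [0.0, 0.0, 1.0, 1.0, 0.0]
```

Because each class gets its own sigmoid, any number of tags can be "on" at once, which is exactly what multi-label classification needs.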
# This is a set of transformations that works pretty well for satellite images
tfms = get_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.)
We use parentheses around the data block pipeline below so that we can write a multiline statement without needing to add `\` at the end of each line.
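The same trick works for any chained-call expression in Python; inside parentheses, line breaks are joined implicitly:

```python
# Parentheses give implicit line continuation, so no trailing backslashes needed
text = ' hello world '
cleaned = (text
           .strip()
           .replace(' ', '_')
           .upper())
print(cleaned)  # → HELLO_WORLD
```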
np.random.seed(42)
src = (ImageItemList.from_csv(path, 'train_v2.csv', folder='train-jpg', suffix='.jpg')
.random_split_by_pct(0.2)
.label_from_df(sep=' '))
data = (src.transform(tfms, size=128)
.databunch().normalize(imagenet_stats))
`show_batch` still works, and shows us the different labels separated by `;`.
data.show_batch(rows=3, figsize=(12,9))
To create a `Learner` we use the same function as in lesson 1. Our base architecture is resnet50, but the metrics are a little bit different: we use `accuracy_thresh` instead of `accuracy`. In lesson 1, we determined the prediction for a given class by picking the final activation that was the biggest, but here each activation can independently be 0. or 1. `accuracy_thresh` selects the ones that are above a certain threshold (0.5 by default) and compares them to the ground truth.
As for `fbeta`, it's the metric that was used by Kaggle for this competition. See here for more details.
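With beta=2, F-beta weights recall more heavily than precision. A rough per-sample sketch of the formula (my own illustration, not the library implementation):

```python
def fbeta_sample(probs, targets, thresh=0.2, beta=2, eps=1e-9):
    """F-beta for one sample after thresholding its class probabilities."""
    preds = [p > thresh for p in probs]
    tp = sum(1 for p, t in zip(preds, targets) if p and t)   # true positives
    precision = tp / (sum(preds) + eps)
    recall = tp / (sum(targets) + eps)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall + eps)

# One over-predicted tag: precision 0.5, recall 1.0 → F2 ≈ 0.8333
print(round(fbeta_sample([0.9, 0.1, 0.6], [1, 0, 0]), 4))
```

The batch metric is then (roughly) the mean of this per-sample score, and the threshold of 0.2 below is the same one passed to `acc_02`.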
arch = models.resnet50
acc_02 = partial(accuracy_thresh, thresh=0.2)
f_score = partial(fbeta, thresh=0.2)
learn = create_cnn(data, arch, metrics=[acc_02, f_score])
We use the LR Finder to pick a good learning rate.
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.recorder.plot()
Then we can fit the head of our network.
lr = 0.01
learn.fit_one_cycle(5, slice(lr))
| epoch | train_loss | valid_loss | accuracy_thresh | fbeta |
|---|---|---|---|---|
| 1 | 0.125452 | 0.108757 | 0.946554 | 0.904610 |
| 2 | 0.107338 | 0.098813 | 0.955012 | 0.915407 |
| 3 | 0.102584 | 0.091885 | 0.951488 | 0.918018 |
| 4 | 0.094727 | 0.088227 | 0.955921 | 0.924755 |
| 5 | 0.094570 | 0.087069 | 0.957345 | 0.924986 |
learn.save('stage-1-rn50')
...And fine-tune the whole model:
learn.unfreeze()
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.recorder.plot()
learn.fit_one_cycle(5, slice(1e-5, lr/5))
| epoch | train_loss | valid_loss | accuracy_thresh | fbeta |
|---|---|---|---|---|
| 1 | 0.095748 | 0.092037 | 0.953370 | 0.918665 |
| 2 | 0.097214 | 0.089237 | 0.954547 | 0.923939 |
| 3 | 0.093682 | 0.085165 | 0.957773 | 0.926799 |
| 4 | 0.082624 | 0.083528 | 0.958755 | 0.928822 |
| 5 | 0.081399 | 0.082552 | 0.958820 | 0.929817 |
learn.save('stage-2-rn50')
learn.load('stage-2-rn50')
Learner(data=ImageDataBunch; Train: LabelList y: MultiCategoryList (32384 items) [MultiCategory haze;primary, MultiCategory clear;primary, MultiCategory clear;primary, MultiCategory haze;primary;water, MultiCategory agriculture;clear;cultivation;primary;water]... Path: /home/cedric/.fastai/data/planet x: ImageItemList (32384 items) [Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256)]... Path: /home/cedric/.fastai/data/planet; Valid: LabelList y: MultiCategoryList (8095 items) [MultiCategory clear;primary;road, MultiCategory clear;primary;water, MultiCategory clear;conventional_mine;habitation;primary;road;water, MultiCategory cloudy, MultiCategory agriculture;clear;cultivation;cultivation;habitation;primary;road;water]... Path: /home/cedric/.fastai/data/planet x: ImageItemList (8095 items) [Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256)]... Path: /home/cedric/.fastai/data/planet; Test: None, model=Sequential( (0): Sequential( (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): ReLU(inplace) (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False) (4): Sequential( (0): Bottleneck( (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) (downsample): Sequential( (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(256, eps=1e-05, 
momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) ) (5): Sequential( (0): Bottleneck( (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) (downsample): Sequential( (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, 
momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (3): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) ) (6): Sequential( (0): Bottleneck( (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, 
affine=True, track_running_stats=True) (relu): ReLU(inplace) (downsample): Sequential( (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (3): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (4): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (5): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) ) (7): Sequential( (0): Bottleneck( (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) (downsample): Sequential( (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): 
BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace) ) ) ) (1): Sequential( (0): AdaptiveConcatPool2d( (ap): AdaptiveAvgPool2d(output_size=1) (mp): AdaptiveMaxPool2d(output_size=1) ) (1): Lambda() (2): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (3): Dropout(p=0.25) (4): Linear(in_features=4096, out_features=512, bias=True) (5): ReLU(inplace) (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (7): Dropout(p=0.5) (8): Linear(in_features=512, out_features=17, bias=True) ) ), opt_func=functools.partial(<class 'torch.optim.adam.Adam'>, betas=(0.9, 0.99)), loss_func=<fastai.layers.FlattenedLoss object at 0x7f4aaf4b6240>, metrics=[functools.partial(<function accuracy_thresh at 0x7f4ac01a19d8>, thresh=0.2), functools.partial(<function fbeta at 0x7f4ac01a1730>, thresh=0.2)], true_wd=True, bn_wd=True, wd=0.01, train_bn=True, path=PosixPath('/home/cedric/.fastai/data/planet'), model_dir='models', callback_fns=[<class 'fastai.basic_train.Recorder'>], callbacks=[], layer_groups=[Sequential( (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): 
ReLU(inplace) (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False) (4): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (7): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (8): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (9): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (10): ReLU(inplace) (11): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (12): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (13): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (14): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (15): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (16): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (17): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (19): ReLU(inplace) (20): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (21): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (22): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (23): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (24): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (25): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (26): ReLU(inplace) (27): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (28): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (29): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (30): 
BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (31): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (32): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (33): ReLU(inplace) (34): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (35): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (36): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (37): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (38): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (39): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (40): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (42): ReLU(inplace) (43): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (44): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (45): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (46): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (47): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (48): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (49): ReLU(inplace) (50): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (51): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (52): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (53): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (54): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (55): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (56): ReLU(inplace) ), Sequential( (0): Conv2d(512, 256, kernel_size=(1, 1), 
stride=(1, 1), bias=False) (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (6): ReLU(inplace) (7): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) (8): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (9): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (10): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (11): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (12): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (13): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (14): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (15): ReLU(inplace) (16): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (17): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (18): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (19): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (20): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (21): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (22): ReLU(inplace) (23): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (25): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (26): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (27): Conv2d(256, 1024, 
kernel_size=(1, 1), stride=(1, 1), bias=False) (28): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (29): ReLU(inplace) (30): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (31): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (32): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (33): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (34): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (35): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (36): ReLU(inplace) (37): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (38): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (39): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (40): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (41): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (42): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (43): ReLU(inplace) (44): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (45): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (46): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (47): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (48): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (49): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (50): ReLU(inplace) (51): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False) (52): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (53): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (54): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=True) (55): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (56): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (57): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (58): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (59): ReLU(inplace) (60): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (61): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (62): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (63): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (64): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (65): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (66): ReLU(inplace) ), Sequential( (0): AdaptiveAvgPool2d(output_size=1) (1): AdaptiveMaxPool2d(output_size=1) (2): Lambda() (3): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0.25) (5): Linear(in_features=4096, out_features=512, bias=True) (6): ReLU(inplace) (7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (8): Dropout(p=0.5) (9): Linear(in_features=512, out_features=17, bias=True) )])
We used an image size of 128px for the initial model, simply because we wanted to try things out quickly.

Now, let's use the full-size images.
data = (src.transform(tfms, size=256)
.databunch(bs=32).normalize(imagenet_stats))
learn.data = data
data.train_ds[0][0].shape
torch.Size([3, 256, 256])
learn.freeze()
Notice that we are using transfer learning here too: instead of training from scratch, we start from the model we trained on smaller images.
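This small-to-large schedule is often called progressive resizing. Laid out as plain data (the stage names below are just labels for the notebook's steps, not fastai API):

```python
# Sketch of the training schedule used in this notebook:
# two passes at 128px, then two more at 256px reusing the same weights.
schedule = [
    (128, 'freeze'),    # train the new head on small images
    (128, 'unfreeze'),  # fine-tune the whole model
    (256, 'freeze'),    # swap in bigger images, retrain the head
    (256, 'unfreeze'),  # fine-tune everything at full size
]
for size, stage in schedule:
    print(f'{stage} at {size}px')
```

Training on small images first is cheap, and the learned weights transfer well because the image statistics barely change with resolution.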
learn.lr_find()
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
Training Stage 1 - Freeze
lr = 1e-3/2
learn.fit_one_cycle(5, slice(lr))
epoch | train_loss | valid_loss | accuracy_thresh | fbeta |
---|---|---|---|---|
1 | 0.094077 | 0.087534 | 0.955986 | 0.924598 |
2 | 0.091243 | 0.084497 | 0.958427 | 0.928731 |
3 | 0.087307 | 0.084192 | 0.959721 | 0.929067 |
4 | 0.085676 | 0.083820 | 0.959140 | 0.929437 |
5 | 0.084220 | 0.083667 | 0.959351 | 0.929003 |
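The `fbeta` column in these tables is the F2 score used by the Planet competition: with beta=2, recall is weighted more heavily than precision. A minimal per-sample sketch in plain NumPy, thresholding at 0.2 like the fastai metric (the `fbeta_score` function here is illustrative, not fastai's implementation, which averages over a batch):

```python
import numpy as np

def fbeta_score(probs, targets, thresh=0.2, beta=2, eps=1e-9):
    # Binarize probabilities at `thresh`, then compute F-beta.
    # beta=2 weights recall twice as heavily as precision.
    preds = (probs > thresh).astype(float)
    tp = (preds * targets).sum()
    prec = tp / (preds.sum() + eps)
    rec = tp / (targets.sum() + eps)
    b2 = beta ** 2
    return (1 + b2) * prec * rec / (b2 * prec + rec + eps)

probs = np.array([0.9, 0.1, 0.6, 0.3])
targets = np.array([1., 0., 1., 0.])
score = fbeta_score(probs, targets)
print(round(score, 3))  # → 0.909
```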
learn.save('stage-1-256-rn50')
learn.recorder.plot_losses()
learn.recorder.plot_lr()
Training Stage 2 - Unfreeze
learn.unfreeze()
learn.lr_find()
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
learn.fit_one_cycle(5, slice(1e-5, lr/5))
epoch | train_loss | valid_loss | accuracy_thresh | fbeta |
---|---|---|---|---|
1 | 0.089080 | 0.082964 | 0.959583 | 0.929218 |
2 | 0.084257 | 0.083247 | 0.960295 | 0.929339 |
3 | 0.083025 | 0.081792 | 0.959714 | 0.929984 |
4 | 0.082981 | 0.081631 | 0.959692 | 0.930018 |
5 | 0.077415 | 0.081828 | 0.960324 | 0.929595 |
learn.recorder.plot_losses()
learn.save('stage-2-256-rn50')
You won't really know how you're doing until you submit to Kaggle, since the leaderboard isn't scored on the same subset we use for validation. But as a guide, 50th place (out of 938 teams) on the private leaderboard was a score of 0.930.
learn.load('stage-2-256-rn50')
Learner(data=ImageDataBunch; Train: 32384 items, Valid: 8095 items, Test: None; Path: /home/cedric/.fastai/data/planet; metrics=[accuracy_thresh, fbeta] at thresh=0.2, wd=0.01, model_dir='models' ... full ResNet-50 model summary omitted — it is identical to the one printed above)
Use the Kaggle API to download the test set:
! kaggle competitions download -c planet-understanding-the-amazon-from-space -f test-jpg.tar.7z -p {path}
! kaggle competitions download -c planet-understanding-the-amazon-from-space -f test-jpg-additional.tar.7z -p {path}
Downloading test-jpg.tar.7z to /home/cedric/.fastai/data/planet 99%|████████████████████████████████████████▋| 598M/603M [00:04<00:00, 131MB/s] 100%|█████████████████████████████████████████| 603M/603M [00:04<00:00, 142MB/s] Downloading test-jpg-additional.tar.7z to /home/cedric/.fastai/data/planet 98%|████████████████████████████████████████ | 297M/304M [00:02<00:00, 110MB/s] 100%|█████████████████████████████████████████| 304M/304M [00:02<00:00, 115MB/s]
! 7za -bd -y -so x {path}/test-jpg.tar.7z | tar xf - -C {path}
! 7za -bd -y -so x {path}/test-jpg-additional.tar.7z | tar xf - -C {path}
! mv {path}/test-jpg-additional/* {path}/test-jpg
! ls {path}/test-jpg | wc -l
61191
! rm -rf {path}/test-jpg-additional
type(src)
fastai.data_block.LabelLists
learn.data = (src.add_test_folder('test-jpg')
.transform(tfms, size=256)
.databunch(bs=8).normalize(imagenet_stats))
# Sanity check
len(learn.data.train_ds), len(learn.data.valid_ds), len(learn.data.test_ds)
(32384, 8095, 61191)
# Sanity check
len(learn.data.train_dl), len(learn.data.valid_dl), len(learn.data.test_dl)
(4048, 1012, 7649)
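These loader lengths are just the dataset sizes divided by the batch size of 8, rounded up for the partial final batch:

```python
import math

# dataset sizes from the sanity check above, batch size from the databunch
sizes = {'train': 32384, 'valid': 8095, 'test': 61191}
bs = 8
batches = {k: math.ceil(n / bs) for k, n in sizes.items()}
print(batches)  # → {'train': 4048, 'valid': 1012, 'test': 7649}
```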
# Sanity check
learn.data.test_ds
LabelList y: MultiCategoryList (61191 items) [MultiCategory haze;primary, MultiCategory haze;primary, MultiCategory haze;primary, MultiCategory haze;primary, MultiCategory haze;primary]... Path: /home/cedric/.fastai/data/planet x: ImageItemList (61191 items) [Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256), Image (3, 256, 256)]... Path: /home/cedric/.fastai/data/planet
Apply fastai's Test Time Augmentation (TTA) to predict on the test set. (Note that the test set's y labels shown above are just placeholders; only the images are used for prediction.)
preds = learn.TTA(ds_type=DatasetType.Test) # averages predictions over several augmented versions of each test image
torch.save(preds, path/'preds-tta-256-rn50.pt')
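Under the hood, TTA blends the plain prediction with the average of predictions over several augmented copies of each image. A toy sketch of that blending in plain NumPy — the `augments` and `predict_probs` stand-ins here are made up for illustration (fastai's actual augmentations are flips, zooms, and lighting changes, and the default blend weight is `beta=0.4`):

```python
import numpy as np

def tta_average(image, predict_probs, augments, beta=0.4):
    # Blend the prediction on the original image with the mean
    # prediction over augmented copies of it.
    base = predict_probs(image)
    aug_preds = np.stack([predict_probs(a(image)) for a in augments])
    return beta * base + (1 - beta) * aug_preds.mean(axis=0)

# toy stand-ins: an "image" is a (channels, h, w) array and the
# "model" scores each channel by its normalized mean
img = np.arange(12, dtype=float).reshape(3, 2, 2)
predict_probs = lambda x: x.mean(axis=(1, 2)) / x.sum()
augments = [np.flip, lambda x: np.rot90(x, axes=(1, 2))]
out = tta_average(img, predict_probs, augments)
print(out.shape)  # → (3,)
```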
Get final predictions:
final_preds = preds[0] # note, preds[1] is y, which is the ground truth/target
final_preds.shape
torch.Size([61191, 17])
# Sanity check
len(final_preds[1])
17
# Sanity check
final_preds[0][0]
tensor(0.8172)
# PS: I have taken these parts of the code from Arunoda's notebook.
def find_tags(pred, thresh, show_probs):
    # Collect the class names whose predicted probability exceeds `thresh`.
    classes = ''
    for idx, val in enumerate(pred):
        if val > thresh:
            if show_probs:
                classes = f'{classes} {learn.data.classes[idx]} ({val})'
            else:
                classes = f'{classes} {learn.data.classes[idx]}'
    return classes.strip()
def predict(f_preds, idx, thresh):
    pred_vals = f_preds[idx]
    tags = find_tags(pred_vals, thresh, True)
    print(tags)
    img = learn.data.test_ds[idx][0]
    return img
predict(final_preds, 0, 0.2)
agriculture (0.8172218799591064) haze (0.4136252701282501) partly_cloudy (0.37629154324531555) primary (0.9813070297241211)
predict(final_preds, 20, 0.2)
cloudy (0.3825205862522125) haze (0.5826690196990967) primary (0.6680986285209656)
def get_row(f_preds, idx, thresh):
    pred = f_preds[idx]
    tags = find_tags(pred, thresh, False)
    image_path = learn.data.test_ds.x.items[idx]
    image_name = re.search(r'([^/]+)$', f'{image_path}')[0].replace('.jpg', '')
    return image_name, tags
get_row(final_preds, 0, 0.2)
('file_19658', 'agriculture haze partly_cloudy primary')
get_row(final_preds, 20, 0.2)
('test_18218', 'cloudy haze primary')
Create data frame for Kaggle submission file:
df = pd.DataFrame(columns=['image_name', 'tags'])
for idx in range(len(final_preds)):
    if idx % 1000 == 0:
        print(f'Progress: {idx}')
    image_name, tags = get_row(final_preds, idx, 0.2)
    df.loc[idx] = [image_name, tags]
Progress: 0 Progress: 1000 Progress: 2000 Progress: 3000 Progress: 4000 Progress: 5000 Progress: 6000 Progress: 7000 Progress: 8000 Progress: 9000 Progress: 10000 Progress: 11000 Progress: 12000 Progress: 13000 Progress: 14000 Progress: 15000 Progress: 16000 Progress: 17000 Progress: 18000 Progress: 19000 Progress: 20000 Progress: 21000 Progress: 22000 Progress: 23000 Progress: 24000 Progress: 25000 Progress: 26000 Progress: 27000 Progress: 28000 Progress: 29000 Progress: 30000 Progress: 31000 Progress: 32000 Progress: 33000 Progress: 34000 Progress: 35000 Progress: 36000 Progress: 37000 Progress: 38000 Progress: 39000 Progress: 40000 Progress: 41000 Progress: 42000 Progress: 43000 Progress: 44000 Progress: 45000 Progress: 46000 Progress: 47000 Progress: 48000 Progress: 49000 Progress: 50000 Progress: 51000 Progress: 52000 Progress: 53000 Progress: 54000 Progress: 55000 Progress: 56000 Progress: 57000 Progress: 58000 Progress: 59000 Progress: 60000 Progress: 61000
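Appending row by row with `df.loc` is quite slow for 61k rows. The same submission frame can also be built in one vectorized pass; a self-contained sketch (the `build_submission` helper and the toy probabilities are illustrative, not from the notebook):

```python
import numpy as np
import pandas as pd

def build_submission(probs, classes, names, thresh=0.2):
    # Turn an (n_images, n_classes) probability array into a
    # Kaggle submission frame in one pass instead of 61k df.loc calls.
    mask = probs > thresh                    # boolean (n, c)
    classes = np.asarray(classes)
    tags = [' '.join(classes[row]) for row in mask]
    return pd.DataFrame({'image_name': names, 'tags': tags})

# toy example with 3 classes and 2 images
probs = np.array([[0.9, 0.1, 0.6],
                  [0.05, 0.3, 0.1]])
sub = build_submission(probs, ['primary', 'haze', 'water'],
                       ['test_0', 'test_1'])
print(sub.tags.tolist())  # → ['primary water', 'haze']
```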
df.head()
| image_name | tags |
---|---|---|
0 | file_19658 | agriculture haze partly_cloudy primary |
1 | test_18775 | agriculture bare_ground clear habitation prima... |
2 | file_20453 | agriculture haze primary |
3 | test_23183 | clear primary water |
4 | test_28867 | partly_cloudy primary |
subm_path = path/'subm_fastai_1.0.34_tta_stage2_sz_256_rn50_val_0.2.csv'
df.to_csv(subm_path, index=False)
# Sanity check
! head {subm_path}
image_name,tags file_19658,agriculture haze partly_cloudy primary test_18775,agriculture bare_ground clear habitation primary road file_20453,agriculture haze primary test_23183,clear primary water test_28867,partly_cloudy primary test_17746,clear primary test_11747,agriculture clear primary water test_21382,clear primary test_10914,agriculture clear haze primary road water
Upload submission file to Kaggle
Kaggle allows late submissions so you can check your score. Use the following command to do that:
! kaggle competitions submit -c planet-understanding-the-amazon-from-space -f {subm_path} -m "fastai: 1.0.34, train: stage2, sz: 256, arch: resnet50, val split: 0.2, TTA"
100%|██████████████████████████████████████| 2.19M/2.19M [00:00<00:00, 7.36MB/s] Successfully submitted to Planet: Understanding the Amazon from Space