Notebook

Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

Author: Sebastian Raschka
GitHub Repository: https://github.com/rasbt/deeplearning-models

Bidirectional Multi-layer RNN with LSTM with Own Dataset in CSV Format (Amazon Review Polarity)¶

Dataset Description

Amazon Review Polarity Dataset

Version 3, Updated 09/09/2015

ORIGIN

The Amazon reviews dataset consists of reviews from amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. For more information, please refer to the following paper: J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys, 2013.

The Amazon reviews polarity dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).


DESCRIPTION

The Amazon reviews polarity dataset is constructed by taking review score 1 and 2 as negative, and 4 and 5 as positive. Samples of score 3 is ignored. In the dataset, class 1 is the negative and class 2 is the positive. Each class has 1,800,000 training samples and 200,000 testing samples.

The files train.csv and test.csv contain all the training samples as comma-sparated values. There are 3 columns in them, corresponding to class index (1 or 2), review title and review text. The review title and text are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed with an "n" character, that is "\n".

In [1]:

%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch


import torch
import torch.nn.functional as F
from torchtext import data
from torchtext import datasets
import time
import random
import pandas as pd
import numpy as np

torch.backends.cudnn.deterministic = True

Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.3.0

General Settings¶

In [2]:

RANDOM_SEED = 123
torch.manual_seed(RANDOM_SEED)

VOCABULARY_SIZE = 5000
LEARNING_RATE = 1e-3
BATCH_SIZE = 128
NUM_EPOCHS = 50
DROPOUT = 0.5
DEVICE = torch.device('cuda:2' if torch.cuda.is_available() else 'cpu')

EMBEDDING_DIM = 128
BIDIRECTIONAL = True
HIDDEN_DIM = 256
NUM_LAYERS = 2
OUTPUT_DIM = 2

Dataset¶

The Yelp Review Polarity dataset is available from Xiang Zhang's Google Drive folder at

https://drive.google.com/drive/u/0/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M

From the Google Drive folder, download the file

amazon_review_polarity_csv.tar.gz

In [3]:

!tar xvzf amazon_review_polarity_csv.tar.gz

amazon_review_polarity_csv/
amazon_review_polarity_csv/test.csv
amazon_review_polarity_csv/train.csv
amazon_review_polarity_csv/readme.txt

Check that the dataset looks okay:

In [4]:

df = pd.read_csv('amazon_review_polarity_csv/train.csv', header=None, index_col=None)
df.columns = ['classlabel', 'title', 'content']
df['classlabel'] = df['classlabel']-1
df.head()

Out[4]:

	classlabel	title	content
0	1	Stuning even for the non-gamer	This sound track was beautiful! It paints the ...
1	1	The best soundtrack ever to anything.	I'm reading a lot of reviews saying that this ...
2	1	Amazing!	This soundtrack is my favorite music of all ti...
3	1	Excellent Soundtrack	I truly like this soundtrack and I enjoy video...
4	1	Remember, Pull Your Jaw Off The Floor After He...	If you've played the game, you know how divine...

In [5]:

np.unique(df['classlabel'].values)

Out[5]:

array([0, 1])

In [6]:

np.bincount(df['classlabel'])

Out[6]:

array([1800000, 1800000])

In [7]:

df[['classlabel', 'content']].to_csv('amazon_review_polarity_csv/train_prepocessed.csv', index=None)

In [8]:

df = pd.read_csv('amazon_review_polarity_csv/test.csv', header=None, index_col=None)
df.columns = ['classlabel', 'title', 'content']
df['classlabel'] = df['classlabel']-1
df.head()

Out[8]:

	classlabel	title	content
0	1	Great CD	My lovely Pat has one of the GREAT voices of h...
1	1	One of the best game music soundtracks - for a...	Despite the fact that I have only played a sma...
2	0	Batteries died within a year ...	I bought this charger in Jul 2003 and it worke...
3	1	works fine, but Maha Energy is better	Check out Maha Energy's website. Their Powerex...
4	1	Great for the non-audiophile	Reviewed quite a bit of the combo players and ...

In [9]:

np.unique(df['classlabel'].values)

Out[9]:

array([0, 1])

In [10]:

np.bincount(df['classlabel'])

Out[10]:

array([200000, 200000])

In [11]:

df[['classlabel', 'content']].to_csv('amazon_review_polarity_csv/test_prepocessed.csv', index=None)

In [12]:

del df

Define the Label and Text field formatters:

In [13]:

TEXT = data.Field(sequential=True,
                  tokenize='spacy',
                  include_lengths=True) # necessary for packed_padded_sequence

LABEL = data.LabelField(dtype=torch.float)


# If you get an error [E050] Can't find model 'en'
# you need to run the following on your command line:
#  python -m spacy download en

Process the dataset:

In [14]:

fields = [('classlabel', LABEL), ('content', TEXT)]

train_dataset = data.TabularDataset(
    path="amazon_review_polarity_csv/train_prepocessed.csv", format='csv',
    skip_header=True, fields=fields)

test_dataset = data.TabularDataset(
    path="amazon_review_polarity_csv/test_prepocessed.csv", format='csv',
    skip_header=True, fields=fields)

Split the training dataset into training and validation:

In [15]:

train_data, valid_data = train_dataset.split(
    split_ratio=[0.95, 0.05],
    random_state=random.seed(RANDOM_SEED))

print(f'Num Train: {len(train_data)}')
print(f'Num Valid: {len(valid_data)}')

Num Train: 3420000
Num Valid: 180000

Build the vocabulary based on the top "VOCABULARY_SIZE" words:

In [16]:

TEXT.build_vocab(train_data,
                 max_size=VOCABULARY_SIZE,
                 vectors='glove.6B.100d',
                 unk_init=torch.Tensor.normal_)

LABEL.build_vocab(train_data)

print(f'Vocabulary size: {len(TEXT.vocab)}')
print(f'Number of classes: {len(LABEL.vocab)}')

Vocabulary size: 5002
Number of classes: 2

In [17]:

list(LABEL.vocab.freqs)[-10:]

Out[17]:

['1', '0']

The TEXT.vocab dictionary will contain the word counts and indices. The reason why the number of words is VOCABULARY_SIZE + 2 is that it contains to special tokens for padding and unknown words: <unk> and <pad>.

Make dataset iterators:

In [18]:

train_loader, valid_loader, test_loader = data.BucketIterator.splits(
    (train_data, valid_data, test_dataset), 
    batch_size=BATCH_SIZE,
    sort_within_batch=True, # necessary for packed_padded_sequence
    sort_key=lambda x: len(x.content),
    device=DEVICE)

Testing the iterators (note that the number of rows depends on the longest document in the respective batch):

In [19]:

print('Train')
for batch in train_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break
    
print('\nValid:')
for batch in valid_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break
    
print('\nTest:')
for batch in test_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break

Train
Text matrix size: torch.Size([74, 128])
Target vector size: torch.Size([128])

Valid:
Text matrix size: torch.Size([14, 128])
Target vector size: torch.Size([128])

Test:
Text matrix size: torch.Size([12, 128])
Target vector size: torch.Size([128])

Model¶

In [20]:

import torch.nn as nn


class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, bidirectional, hidden_dim, num_layers, output_dim, dropout, pad_idx):
        
        super().__init__()
        
        self.embedding = nn.Embedding(input_dim, embedding_dim, padding_idx=pad_idx)
        self.rnn = nn.LSTM(embedding_dim, 
                           hidden_dim,
                           num_layers=num_layers,
                           bidirectional=bidirectional, 
                           dropout=dropout)
        self.fc1 = nn.Linear(hidden_dim * num_layers, 64)
        self.fc2 = nn.Linear(64, output_dim)
        self.dropout = nn.Dropout(dropout)
        
    def forward(self, text, text_length):

        embedded = self.dropout(self.embedding(text))
        packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_length)
        packed_output, (hidden, cell) = self.rnn(packed_embedded)
        output, output_lengths = nn.utils.rnn.pad_packed_sequence(packed_output)
        hidden = self.dropout(torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim=1))
        hidden = self.fc1(hidden)
        return hidden

In [21]:

INPUT_DIM = len(TEXT.vocab)

PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

torch.manual_seed(RANDOM_SEED)
model = RNN(INPUT_DIM, EMBEDDING_DIM, BIDIRECTIONAL, HIDDEN_DIM, NUM_LAYERS, OUTPUT_DIM, DROPOUT, PAD_IDX)
model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

Training¶

In [22]:

def compute_accuracy(model, data_loader, device):
    model.eval()
    correct_pred, num_examples = 0, 0
    with torch.no_grad():
        for batch_idx, batch_data in enumerate(data_loader):
            text, text_lengths = batch_data.content
            logits = model(text, text_lengths).squeeze(1)
            _, predicted_labels = torch.max(logits, 1)
            num_examples += batch_data.classlabel.size(0)
            correct_pred += (predicted_labels.long() == batch_data.classlabel.long()).sum()
        return correct_pred.float()/num_examples * 100

In [23]:

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    model.train()
    for batch_idx, batch_data in enumerate(train_loader):
        
        text, text_lengths = batch_data.content
        
        ### FORWARD AND BACK PROP
        logits = model(text, text_lengths).squeeze(1)
        cost = F.cross_entropy(logits, batch_data.classlabel.long())
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 10000:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} | '
                   f'Cost: {cost:.4f}')

    with torch.set_grad_enabled(False):
        print(f'training accuracy: '
              f'{compute_accuracy(model, train_loader, DEVICE):.2f}%'
              f'\nvalid accuracy: '
              f'{compute_accuracy(model, valid_loader, DEVICE):.2f}%')
        
    print(f'Time elapsed: {(time.time() - start_time)/60:.2f} min')
    
print(f'Total Training Time: {(time.time() - start_time)/60:.2f} min')
print(f'Test accuracy: {compute_accuracy(model, test_loader, DEVICE):.2f}%')

Epoch: 001/050 | Batch 000/26719 | Cost: 4.1805
Epoch: 001/050 | Batch 10000/26719 | Cost: 0.2005
Epoch: 001/050 | Batch 20000/26719 | Cost: 0.1998
training accuracy: 93.34%
valid accuracy: 93.27%
Time elapsed: 33.40 min
Epoch: 002/050 | Batch 000/26719 | Cost: 0.1659
Epoch: 002/050 | Batch 10000/26719 | Cost: 0.1326
Epoch: 002/050 | Batch 20000/26719 | Cost: 0.1470
training accuracy: 93.82%
valid accuracy: 93.63%
Time elapsed: 66.69 min
Epoch: 003/050 | Batch 000/26719 | Cost: 0.1256
Epoch: 003/050 | Batch 10000/26719 | Cost: 0.1980
Epoch: 003/050 | Batch 20000/26719 | Cost: 0.2041
training accuracy: 93.98%
valid accuracy: 93.82%
Time elapsed: 100.02 min
Epoch: 004/050 | Batch 000/26719 | Cost: 0.2103
Epoch: 004/050 | Batch 10000/26719 | Cost: 0.1100
Epoch: 004/050 | Batch 20000/26719 | Cost: 0.1851
training accuracy: 94.11%
valid accuracy: 93.93%
Time elapsed: 133.32 min
Epoch: 005/050 | Batch 000/26719 | Cost: 0.2196
Epoch: 005/050 | Batch 10000/26719 | Cost: 0.1209
Epoch: 005/050 | Batch 20000/26719 | Cost: 0.2147
training accuracy: 94.13%
valid accuracy: 93.93%
Time elapsed: 166.67 min
Epoch: 006/050 | Batch 000/26719 | Cost: 0.1908
Epoch: 006/050 | Batch 10000/26719 | Cost: 0.2187
Epoch: 006/050 | Batch 20000/26719 | Cost: 0.2253
training accuracy: 94.15%
valid accuracy: 93.93%
Time elapsed: 199.87 min
Epoch: 007/050 | Batch 000/26719 | Cost: 0.1990
Epoch: 007/050 | Batch 10000/26719 | Cost: 0.1928
Epoch: 007/050 | Batch 20000/26719 | Cost: 0.2113
training accuracy: 94.21%
valid accuracy: 93.97%
Time elapsed: 233.25 min
Epoch: 008/050 | Batch 000/26719 | Cost: 0.1753
Epoch: 008/050 | Batch 10000/26719 | Cost: 0.1708
Epoch: 008/050 | Batch 20000/26719 | Cost: 0.2158
training accuracy: 94.21%
valid accuracy: 93.97%
Time elapsed: 266.51 min
Epoch: 009/050 | Batch 000/26719 | Cost: 0.2423
Epoch: 009/050 | Batch 10000/26719 | Cost: 0.1097
Epoch: 009/050 | Batch 20000/26719 | Cost: 0.1727
training accuracy: 94.18%
valid accuracy: 93.98%
Time elapsed: 299.86 min
Epoch: 010/050 | Batch 000/26719 | Cost: 0.1474
Epoch: 010/050 | Batch 10000/26719 | Cost: 0.2041
Epoch: 010/050 | Batch 20000/26719 | Cost: 0.1127
training accuracy: 94.13%
valid accuracy: 93.91%
Time elapsed: 333.10 min
Epoch: 011/050 | Batch 000/26719 | Cost: 0.1643
Epoch: 011/050 | Batch 10000/26719 | Cost: 0.1772
Epoch: 011/050 | Batch 20000/26719 | Cost: 0.1586
training accuracy: 94.13%
valid accuracy: 93.92%
Time elapsed: 366.48 min
Epoch: 012/050 | Batch 000/26719 | Cost: 0.1335
Epoch: 012/050 | Batch 10000/26719 | Cost: 0.1680
Epoch: 012/050 | Batch 20000/26719 | Cost: 0.1775
training accuracy: 94.04%
valid accuracy: 93.80%
Time elapsed: 399.85 min
Epoch: 013/050 | Batch 000/26719 | Cost: 0.1896
Epoch: 013/050 | Batch 10000/26719 | Cost: 0.0957
Epoch: 013/050 | Batch 20000/26719 | Cost: 0.1700
training accuracy: 94.02%
valid accuracy: 93.80%
Time elapsed: 432.30 min
Epoch: 014/050 | Batch 000/26719 | Cost: 0.1370
Epoch: 014/050 | Batch 10000/26719 | Cost: 0.1449
Epoch: 014/050 | Batch 20000/26719 | Cost: 0.1874
training accuracy: 93.96%
valid accuracy: 93.80%
Time elapsed: 463.91 min
Epoch: 015/050 | Batch 000/26719 | Cost: 0.1289
Epoch: 015/050 | Batch 10000/26719 | Cost: 0.1852
Epoch: 015/050 | Batch 20000/26719 | Cost: 0.1166
training accuracy: 93.79%
valid accuracy: 93.64%
Time elapsed: 495.59 min
Epoch: 016/050 | Batch 000/26719 | Cost: 0.1109
Epoch: 016/050 | Batch 10000/26719 | Cost: 0.1259
Epoch: 016/050 | Batch 20000/26719 | Cost: 0.1309
training accuracy: 93.75%
valid accuracy: 93.58%
Time elapsed: 527.20 min
Epoch: 017/050 | Batch 000/26719 | Cost: 0.2273
Epoch: 017/050 | Batch 10000/26719 | Cost: 0.1037
Epoch: 017/050 | Batch 20000/26719 | Cost: 0.1274
training accuracy: 93.58%
valid accuracy: 93.43%
Time elapsed: 558.80 min
Epoch: 018/050 | Batch 000/26719 | Cost: 0.1924
Epoch: 018/050 | Batch 10000/26719 | Cost: 0.1870
Epoch: 018/050 | Batch 20000/26719 | Cost: 0.2183
training accuracy: 93.61%
valid accuracy: 93.51%
Time elapsed: 590.48 min
Epoch: 019/050 | Batch 000/26719 | Cost: 0.1955
Epoch: 019/050 | Batch 10000/26719 | Cost: 0.1745
Epoch: 019/050 | Batch 20000/26719 | Cost: 0.1339
training accuracy: 93.49%
valid accuracy: 93.43%
Time elapsed: 622.06 min
Epoch: 020/050 | Batch 000/26719 | Cost: 0.1498
Epoch: 020/050 | Batch 10000/26719 | Cost: 0.2582
Epoch: 020/050 | Batch 20000/26719 | Cost: 0.2263
training accuracy: 93.41%
valid accuracy: 93.32%
Time elapsed: 653.69 min
Epoch: 021/050 | Batch 000/26719 | Cost: 0.2266
Epoch: 021/050 | Batch 10000/26719 | Cost: 0.1824
Epoch: 021/050 | Batch 20000/26719 | Cost: 0.2128
training accuracy: 93.32%
valid accuracy: 93.18%
Time elapsed: 685.43 min
Epoch: 022/050 | Batch 000/26719 | Cost: 0.1637
Epoch: 022/050 | Batch 10000/26719 | Cost: 0.2462
Epoch: 022/050 | Batch 20000/26719 | Cost: 0.1890
training accuracy: 93.24%
valid accuracy: 93.13%
Time elapsed: 716.98 min
Epoch: 023/050 | Batch 000/26719 | Cost: 0.2072
Epoch: 023/050 | Batch 10000/26719 | Cost: 0.1904
Epoch: 023/050 | Batch 20000/26719 | Cost: 0.2408
training accuracy: 93.13%
valid accuracy: 93.02%
Time elapsed: 748.55 min
Epoch: 024/050 | Batch 000/26719 | Cost: 0.1655
Epoch: 024/050 | Batch 10000/26719 | Cost: 0.2909
Epoch: 024/050 | Batch 20000/26719 | Cost: 0.1979
training accuracy: 93.05%
valid accuracy: 92.97%
Time elapsed: 780.21 min
Epoch: 025/050 | Batch 000/26719 | Cost: 0.1742
Epoch: 025/050 | Batch 10000/26719 | Cost: 0.2666
Epoch: 025/050 | Batch 20000/26719 | Cost: 0.2489
training accuracy: 92.97%
valid accuracy: 92.84%
Time elapsed: 811.86 min
Epoch: 026/050 | Batch 000/26719 | Cost: 0.2000
Epoch: 026/050 | Batch 10000/26719 | Cost: 0.1438
Epoch: 026/050 | Batch 20000/26719 | Cost: 0.1771
training accuracy: 92.80%
valid accuracy: 92.69%
Time elapsed: 843.59 min
Epoch: 027/050 | Batch 000/26719 | Cost: 0.1902
Epoch: 027/050 | Batch 10000/26719 | Cost: 0.1842
Epoch: 027/050 | Batch 20000/26719 | Cost: 0.2043
training accuracy: 92.93%
valid accuracy: 92.85%
Time elapsed: 875.26 min
Epoch: 028/050 | Batch 000/26719 | Cost: 0.1836
Epoch: 028/050 | Batch 10000/26719 | Cost: 0.1861
Epoch: 028/050 | Batch 20000/26719 | Cost: 0.1953
training accuracy: 92.85%
valid accuracy: 92.76%
Time elapsed: 906.92 min
Epoch: 029/050 | Batch 000/26719 | Cost: 0.2089
Epoch: 029/050 | Batch 10000/26719 | Cost: 0.2378
Epoch: 029/050 | Batch 20000/26719 | Cost: 0.1476
training accuracy: 92.84%
valid accuracy: 92.74%
Time elapsed: 938.51 min
Epoch: 030/050 | Batch 000/26719 | Cost: 0.1816
Epoch: 030/050 | Batch 10000/26719 | Cost: 0.2420
Epoch: 030/050 | Batch 20000/26719 | Cost: 0.1891
training accuracy: 92.73%
valid accuracy: 92.63%
Time elapsed: 970.14 min
Epoch: 031/050 | Batch 000/26719 | Cost: 0.1959
Epoch: 031/050 | Batch 10000/26719 | Cost: 0.2809
Epoch: 031/050 | Batch 20000/26719 | Cost: 0.2692
training accuracy: 92.65%
valid accuracy: 92.63%
Time elapsed: 1001.72 min
Epoch: 032/050 | Batch 000/26719 | Cost: 0.1845
Epoch: 032/050 | Batch 10000/26719 | Cost: 0.2390
Epoch: 032/050 | Batch 20000/26719 | Cost: 0.1673
training accuracy: 92.54%
valid accuracy: 92.50%
Time elapsed: 1033.34 min
Epoch: 033/050 | Batch 000/26719 | Cost: 0.1612
Epoch: 033/050 | Batch 10000/26719 | Cost: 0.2473
Epoch: 033/050 | Batch 20000/26719 | Cost: 0.2368
training accuracy: 92.52%
valid accuracy: 92.43%
Time elapsed: 1064.98 min
Epoch: 034/050 | Batch 000/26719 | Cost: 0.1739
Epoch: 034/050 | Batch 10000/26719 | Cost: 0.2465
Epoch: 034/050 | Batch 20000/26719 | Cost: 0.2751
training accuracy: 92.43%
valid accuracy: 92.35%
Time elapsed: 1096.60 min
Epoch: 035/050 | Batch 000/26719 | Cost: 0.1641
Epoch: 035/050 | Batch 10000/26719 | Cost: 0.2993
Epoch: 035/050 | Batch 20000/26719 | Cost: 0.2110
training accuracy: 92.44%
valid accuracy: 92.38%
Time elapsed: 1128.23 min
Epoch: 036/050 | Batch 000/26719 | Cost: 0.1998
Epoch: 036/050 | Batch 10000/26719 | Cost: 0.4061
Epoch: 036/050 | Batch 20000/26719 | Cost: 0.3348
training accuracy: 92.34%
valid accuracy: 92.23%
Time elapsed: 1159.86 min
Epoch: 037/050 | Batch 000/26719 | Cost: 0.2720
Epoch: 037/050 | Batch 10000/26719 | Cost: 0.1884
Epoch: 037/050 | Batch 20000/26719 | Cost: 0.2429
training accuracy: 92.38%
valid accuracy: 92.35%
Time elapsed: 1191.48 min
Epoch: 038/050 | Batch 000/26719 | Cost: 0.1869
Epoch: 038/050 | Batch 10000/26719 | Cost: 0.3093
Epoch: 038/050 | Batch 20000/26719 | Cost: 0.2258
training accuracy: 92.32%
valid accuracy: 92.33%
Time elapsed: 1223.13 min
Epoch: 039/050 | Batch 000/26719 | Cost: 0.2780
Epoch: 039/050 | Batch 10000/26719 | Cost: 0.2481
Epoch: 039/050 | Batch 20000/26719 | Cost: 0.2593
training accuracy: 92.34%
valid accuracy: 92.31%
Time elapsed: 1254.79 min
Epoch: 040/050 | Batch 000/26719 | Cost: 0.1992
Epoch: 040/050 | Batch 10000/26719 | Cost: 0.2254
Epoch: 040/050 | Batch 20000/26719 | Cost: 0.2145
training accuracy: 92.31%
valid accuracy: 92.25%
Time elapsed: 1286.39 min
Epoch: 041/050 | Batch 000/26719 | Cost: 0.1949
Epoch: 041/050 | Batch 10000/26719 | Cost: 0.2056
Epoch: 041/050 | Batch 20000/26719 | Cost: 0.2562
training accuracy: 92.15%
valid accuracy: 92.10%
Time elapsed: 1318.01 min
Epoch: 042/050 | Batch 000/26719 | Cost: 0.2261
Epoch: 042/050 | Batch 10000/26719 | Cost: 0.2665
Epoch: 042/050 | Batch 20000/26719 | Cost: 0.2810
training accuracy: 91.95%
valid accuracy: 91.88%
Time elapsed: 1349.75 min
Epoch: 043/050 | Batch 000/26719 | Cost: 0.2078
Epoch: 043/050 | Batch 10000/26719 | Cost: 0.2598
Epoch: 043/050 | Batch 20000/26719 | Cost: 0.2550
training accuracy: 92.00%
valid accuracy: 91.96%
Time elapsed: 1381.34 min
Epoch: 044/050 | Batch 000/26719 | Cost: 0.1947
Epoch: 044/050 | Batch 10000/26719 | Cost: 0.2332
Epoch: 044/050 | Batch 20000/26719 | Cost: 0.3156
training accuracy: 91.84%
valid accuracy: 91.83%
Time elapsed: 1412.81 min
Epoch: 045/050 | Batch 000/26719 | Cost: 0.2643
Epoch: 045/050 | Batch 10000/26719 | Cost: 0.2745
Epoch: 045/050 | Batch 20000/26719 | Cost: 0.3741
training accuracy: 91.98%
valid accuracy: 91.94%
Time elapsed: 1444.41 min
Epoch: 046/050 | Batch 000/26719 | Cost: 0.2029
Epoch: 046/050 | Batch 10000/26719 | Cost: 0.2028
Epoch: 046/050 | Batch 20000/26719 | Cost: 0.2525
training accuracy: 91.84%
valid accuracy: 91.86%
Time elapsed: 1476.07 min
Epoch: 047/050 | Batch 000/26719 | Cost: 0.2104
Epoch: 047/050 | Batch 10000/26719 | Cost: 0.1793
Epoch: 047/050 | Batch 20000/26719 | Cost: 0.2022
training accuracy: 91.75%
valid accuracy: 91.73%
Time elapsed: 1507.73 min
Epoch: 048/050 | Batch 000/26719 | Cost: 0.3482
Epoch: 048/050 | Batch 10000/26719 | Cost: 0.2211
Epoch: 048/050 | Batch 20000/26719 | Cost: 0.2857
training accuracy: 91.62%
valid accuracy: 91.56%
Time elapsed: 1539.42 min
Epoch: 049/050 | Batch 000/26719 | Cost: 0.2514
Epoch: 049/050 | Batch 10000/26719 | Cost: 0.2387
Epoch: 049/050 | Batch 20000/26719 | Cost: 0.2515
training accuracy: 91.54%
valid accuracy: 91.47%
Time elapsed: 1571.06 min
Epoch: 050/050 | Batch 000/26719 | Cost: 0.2802
Epoch: 050/050 | Batch 10000/26719 | Cost: 0.3489
Epoch: 050/050 | Batch 20000/26719 | Cost: 0.2609
training accuracy: 91.49%
valid accuracy: 91.40%
Time elapsed: 1602.62 min
Total Training Time: 1602.62 min
Test accuracy: 91.36%

In [26]:

%watermark -iv

spacy     2.2.3
pandas    0.24.2
torchtext 0.4.0
numpy     1.17.2
torch     1.3.0

In [27]:

torch.save(model.state_dict(), 'rnn_bi_multilayer_lstm_own_csv_amazon-polarity.pt')