Notebook [3]: Training the reader on the SQuAD v1.1 dataset

This notebook shows how to fine-tune a pre-trained BERT model on the SQuAD v1.1 dataset.

*Note:* To run this notebook you need access to a GPU. The Reader was fine-tuned on an AWS EC2 p3.2xlarge instance (Tesla V100 GPU, 16 GB). Training took about 2 hours: 2 epochs on the SQuAD 1.1 training set were enough to reach SOTA results on the SQuAD 1.1 dev set.
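Before starting a multi-hour run, it can help to confirm that PyTorch actually sees a GPU. The following is a minimal sketch and not part of the original notebook; `describe_device` is a hypothetical helper name, and it relies only on the standard `torch.cuda.is_available()` and `torch.cuda.get_device_name()` calls.

```python
def describe_device():
    """Return a short description of the device PyTorch would train on.

    Hypothetical helper for illustration; not part of cdqa.
    """
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # Index 0 is the first visible CUDA device
        return "GPU: " + torch.cuda.get_device_name(0)
    return "CPU only (fine-tuning will be very slow)"


print(describe_device())
```

If this reports a CPU-only setup, the training cell further down will most likely take far longer than the 2 hours quoted above.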

In [1]:

import os
import torch
import joblib
from cdqa.reader import BertProcessor, BertQA
from cdqa.utils.download import download_squad

Download SQuAD datasets

In [2]:

download_squad(dir='./data')

Downloading SQuAD v1.1 data...
train-v1.1.json already downloaded
dev-v1.1.json already downloaded

Downloading SQuAD v2.0 data...
train-v2.0.json already downloaded
dev-v2.0.json already downloaded
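To get a feel for what was downloaded, the sketch below walks the nested SQuAD v1.1 JSON layout (articles, then paragraphs, then question/answer pairs) and counts each level. `count_squad` is a hypothetical helper, and the tiny in-memory sample stands in for the real file so that the snippet runs on its own.

```python
import json


def count_squad(dataset):
    """Count articles, paragraphs and questions in a SQuAD v1.1-style dict."""
    articles = dataset["data"]
    n_paragraphs = sum(len(a["paragraphs"]) for a in articles)
    n_questions = sum(len(p["qas"]) for a in articles for p in a["paragraphs"])
    return len(articles), n_paragraphs, n_questions


# Tiny made-up sample that follows the SQuAD v1.1 schema
context = "BERT was released by Google in 2018."
sample = {
    "version": "1.1",
    "data": [{
        "title": "BERT",
        "paragraphs": [{
            "context": context,
            "qas": [
                {"id": "q1", "question": "Who released BERT?",
                 "answers": [{"text": "Google",
                              "answer_start": context.index("Google")}]},
                {"id": "q2", "question": "When was BERT released?",
                 "answers": [{"text": "2018",
                              "answer_start": context.index("2018")}]},
            ],
        }],
    }],
}

print(count_squad(sample))

# With the downloaded file (path taken from the preprocessing cell):
# with open('./data/SQuAD_1.1/train-v1.1.json') as f:
#     print(count_squad(json.load(f)))
```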

Preprocess SQuAD 1.1 examples

In [3]:

# Convert the raw SQuAD training file into examples and tokenized features
train_processor = BertProcessor(do_lower_case=True, is_training=True, n_jobs=-1)
train_examples, train_features = train_processor.fit_transform(X='./data/SQuAD_1.1/train-v1.1.json')

Train the model

In [ ]:

reader = BertQA(train_batch_size=12,
                learning_rate=3e-5,
                num_train_epochs=2,
                do_lower_case=True,
                output_dir='models')

reader.fit(X=(train_examples, train_features))
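As a rough guide to how long the run will be, the number of optimizer steps can be estimated from the batch size and epoch count used above. This is a back-of-the-envelope sketch, not the library's own bookkeeping: it assumes no gradient accumulation, `estimate_training_steps` is a hypothetical helper, and the feature count of 88,000 is a placeholder (in the notebook you would pass `len(train_features)`).

```python
import math


def estimate_training_steps(n_features, train_batch_size, num_train_epochs):
    """Approximate optimizer steps, assuming one step per batch."""
    steps_per_epoch = math.ceil(n_features / train_batch_size)
    return steps_per_epoch * num_train_epochs


# 88_000 is a placeholder; use len(train_features) in the notebook
print(estimate_training_steps(n_features=88_000,
                              train_batch_size=12,
                              num_train_epochs=2))
```

Doubling `train_batch_size` roughly halves the step count, but it also increases GPU memory use, which matters on a 16 GB card.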

Send model to CPU

In [ ]:

# Move the weights to the CPU so the saved reader can be loaded on a machine without a GPU
reader.model.to('cpu')
reader.device = torch.device('cpu')

Save model locally

In [ ]:

joblib.dump(reader, os.path.join(reader.output_dir, 'bert_qa.joblib'))
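The saved file can later be restored with `joblib.load`. The sketch below wraps that call in a hypothetical `load_reader` helper that fails early with a clear message if the file is missing. It assumes the `cdqa` package is importable in the loading environment, because unpickling needs the `BertQA` class definition.

```python
import os


def load_reader(model_path):
    """Reload a reader saved with joblib.dump (hypothetical helper)."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError("No saved reader at " + model_path)
    import joblib  # imported lazily; only needed once the file exists
    return joblib.load(model_path)


# Usage, with the path written by the cell above:
# reader = load_reader(os.path.join('models', 'bert_qa.joblib'))
```

Because the model was moved to the CPU before saving, the reloaded reader should not need a GPU to be restored.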