This notebook contains the code samples from the video below, which is part of the Hugging Face course.
#@title
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/d3JVgghSOew?rel=0&controls=0&showinfo=0" frameborder="0" allowfullscreen></iframe>')
Install the Transformers and Datasets libraries to run this notebook.
! pip install datasets transformers[sentencepiece]
from transformers import TFAutoModel
bert_model = TFAutoModel.from_pretrained("bert-base-cased")
print(type(bert_model))
gpt_model = TFAutoModel.from_pretrained("gpt2")
print(type(gpt_model))
bart_model = TFAutoModel.from_pretrained("facebook/bart-base")
print(type(bart_model))
Some layers from the model checkpoint at bert-base-cased were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
<class 'transformers.models.bert.modeling_tf_bert.TFBertModel'>
All model checkpoint layers were used when initializing TFGPT2Model.
All the layers of TFGPT2Model were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2Model for predictions without further training.
<class 'transformers.models.gpt2.modeling_tf_gpt2.TFGPT2Model'>
All model checkpoint layers were used when initializing TFBartModel.
All the layers of TFBartModel were initialized from the model checkpoint at facebook/bart-base.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartModel for predictions without further training.
<class 'transformers.models.bart.modeling_tf_bart.TFBartModel'>
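The TFAutoModel classes return the bare Transformer model, which outputs hidden states rather than task-specific predictions. As a minimal sketch (not shown in the video), you could run the BERT model on some text with the matching tokenizer:
from transformers import AutoTokenizer

# The tokenizer must match the checkpoint the model was loaded from.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Tokenize a sentence and return TensorFlow tensors.
inputs = tokenizer("Hello, Transformers!", return_tensors="tf")

# The bare model returns hidden states of shape (batch_size, sequence_length, hidden_size).
outputs = bert_model(inputs)
print(outputs.last_hidden_state.shape)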
from transformers import AutoConfig
bert_config = AutoConfig.from_pretrained("bert-base-cased")
print(type(bert_config))
gpt_config = AutoConfig.from_pretrained("gpt2")
print(type(gpt_config))
bart_config = AutoConfig.from_pretrained("facebook/bart-base")
print(type(bart_config))
<class 'transformers.models.bert.configuration_bert.BertConfig'>
<class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'>
<class 'transformers.models.bart.configuration_bart.BartConfig'>
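The config objects are plain Python objects whose attributes describe the model architecture. A quick check (not shown in the video) of a few of them:
# A few of the architecture hyperparameters stored in the BERT config
print(bert_config.hidden_size)        # 768
print(bert_config.num_hidden_layers)  # 12
print(bert_config.vocab_size)         # 28996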
from transformers import BertConfig
bert_config = BertConfig.from_pretrained("bert-base-cased")
print(type(bert_config))
<class 'transformers.models.bert.configuration_bert.BertConfig'>
from transformers import GPT2Config
gpt_config = GPT2Config.from_pretrained("gpt2")
print(type(gpt_config))
<class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'>
from transformers import BartConfig
bart_config = BartConfig.from_pretrained("facebook/bart-base")
print(type(bart_config))
<class 'transformers.models.bart.configuration_bart.BartConfig'>
from transformers import BertConfig
bert_config = BertConfig.from_pretrained("bert-base-cased")
print(bert_config)
BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.7.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 28996
}
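Since the config holds every architecture hyperparameter, you can override values before building a model. As a sketch (the attribute to change is just an example), here is a smaller BERT with 10 layers instead of 12:
from transformers import BertConfig, TFBertModel

# Start from the bert-base-cased architecture but override the number of layers.
smaller_config = BertConfig.from_pretrained("bert-base-cased", num_hidden_layers=10)
print(smaller_config.num_hidden_layers)

# A model built from a config alone is randomly initialized.
smaller_model = TFBertModel(smaller_config)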
from transformers import BertConfig, TFBertModel
bert_config = BertConfig.from_pretrained("bert-base-cased")
bert_model = TFBertModel(bert_config)  # built from the config alone, so the weights are randomly initialized
from transformers import BertConfig, TFBertModel
bert_config = BertConfig.from_pretrained("bert-base-cased")
bert_model = TFBertModel(bert_config)
# Training code
bert_model.save_pretrained("my_bert_model")
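save_pretrained writes the config and weights to the given folder, so the trained model can later be reloaded with from_pretrained pointing at that local path, for example:
from transformers import TFBertModel

# Reload the trained model from the folder created by save_pretrained.
reloaded_model = TFBertModel.from_pretrained("my_bert_model")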