This notebook contains the code samples from the video below, which is part of the Hugging Face course.
#@title
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/d3JVgghSOew?rel=0&controls=0&showinfo=0" frameborder="0" allowfullscreen></iframe>')
Install the Transformers and Datasets libraries to run this notebook.
! pip install datasets transformers[sentencepiece]
from transformers import TFAutoModel
bert_model = TFAutoModel.from_pretrained("bert-base-cased")
print(type(bert_model))
gpt_model = TFAutoModel.from_pretrained("gpt2")
print(type(gpt_model))
bart_model = TFAutoModel.from_pretrained("facebook/bart-base")
print(type(bart_model))
Some layers from the model checkpoint at bert-base-cased were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
<class 'transformers.models.bert.modeling_tf_bert.TFBertModel'>
All model checkpoint layers were used when initializing TFGPT2Model.
All the layers of TFGPT2Model were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2Model for predictions without further training.
<class 'transformers.models.gpt2.modeling_tf_gpt2.TFGPT2Model'>
All model checkpoint layers were used when initializing TFBartModel.
All the layers of TFBartModel were initialized from the model checkpoint at facebook/bart-base.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartModel for predictions without further training.
<class 'transformers.models.bart.modeling_tf_bart.TFBartModel'>
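The TFAutoModel classes return the bare Transformer model, which outputs hidden states rather than task-specific predictions. As a minimal sketch (not shown in the video), you could run the BERT model on some text with the matching tokenizer:
from transformers import AutoTokenizer

# The tokenizer must match the checkpoint the model was loaded from.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Tokenize a sentence and return TensorFlow tensors.
inputs = tokenizer("Hello, Transformers!", return_tensors="tf")

# The bare model returns hidden states of shape (batch_size, sequence_length, hidden_size).
outputs = bert_model(inputs)
print(outputs.last_hidden_state.shape)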
from transformers import AutoConfig
bert_config = AutoConfig.from_pretrained("bert-base-cased")
print(type(bert_config))
gpt_config = AutoConfig.from_pretrained("gpt2")
print(type(gpt_config))
bart_config = AutoConfig.from_pretrained("facebook/bart-base")
print(type(bart_config))
<class 'transformers.models.bert.configuration_bert.BertConfig'>
<class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'>
<class 'transformers.models.bart.configuration_bart.BartConfig'>
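The config objects are plain Python objects whose attributes describe the model architecture. A quick check (not shown in the video) of a few of them:
# A few of the architecture hyperparameters stored in the BERT config
print(bert_config.hidden_size)        # 768
print(bert_config.num_hidden_layers)  # 12
print(bert_config.vocab_size)         # 28996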
from transformers import BertConfig
bert_config = BertConfig.from_pretrained("bert-base-cased")
print(type(bert_config))
<class 'transformers.models.bert.configuration_bert.BertConfig'>
from transformers import GPT2Config
gpt_config = GPT2Config.from_pretrained("gpt2")
print(type(gpt_config))
<class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'>
from transformers import BartConfig
bart_config = BartConfig.from_pretrained("facebook/bart-base")
print(type(bart_config))
<class 'transformers.models.bart.configuration_bart.BartConfig'>
from transformers import BertConfig
bert_config = BertConfig.from_pretrained("bert-base-cased")
print(bert_config)
BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.7.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 28996
}
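Since the config holds every architecture hyperparameter, you can override values before building a model. As a sketch (the attribute to change is just an example), here is a smaller BERT with 10 layers instead of 12:
from transformers import BertConfig, TFBertModel

# Start from the bert-base-cased architecture but override the number of layers.
smaller_config = BertConfig.from_pretrained("bert-base-cased", num_hidden_layers=10)
print(smaller_config.num_hidden_layers)

# A model built from a config alone is randomly initialized.
smaller_model = TFBertModel(smaller_config)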
from transformers import BertConfig, TFBertModel
bert_config = BertConfig.from_pretrained("bert-base-cased")
bert_model = TFBertModel(bert_config)  # built from the config alone, so the weights are randomly initialized
from transformers import BertConfig, TFBertModel
bert_config = BertConfig.from_pretrained("bert-base-cased")
bert_model = TFBertModel(bert_config)
# Training code
bert_model.save_pretrained("my_bert_model")
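save_pretrained writes the config and weights to the given folder, so the trained model can later be reloaded with from_pretrained pointing at that local path, for example:
from transformers import TFBertModel

# Reload the trained model from the folder created by save_pretrained.
reloaded_model = TFBertModel.from_pretrained("my_bert_model")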