This notebook demonstrates how to use chat templates with the SmolLM2 model. Chat templates help structure interactions between users and AI models, ensuring consistent and contextually appropriate responses.
# Install the requirements in Google Colab
# !pip install transformers datasets trl huggingface_hub
# Authenticate to Hugging Face
from huggingface_hub import login
login()
# for convenience you can create an environment variable containing your hub token as HF_TOKEN
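If you've stored your token in the `HF_TOKEN` environment variable, you can also pass it to `login` explicitly instead of using the interactive widget; a minimal sketch:
import os

# Non-interactive alternative: read the token from the HF_TOKEN environment variable
login(token=os.environ["HF_TOKEN"])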
# Import necessary libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import setup_chat_format
import torch
Let's explore how to use a chat template with the SmolLM2 model. We'll define a simple conversation and apply the chat template.
# Dynamically set the device
device = (
"cuda"
if torch.cuda.is_available()
else "mps" if torch.backends.mps.is_available() else "cpu"
)
model_name = "HuggingFaceTB/SmolLM2-135M"
model = AutoModelForCausalLM.from_pretrained(
pretrained_model_name_or_path=model_name
).to(device)
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name)
model, tokenizer = setup_chat_format(model=model, tokenizer=tokenizer)
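`setup_chat_format` attaches a ChatML-style Jinja template to the tokenizer and registers the matching special tokens. As a quick check, you can inspect the template it installed:
# Inspect the Jinja chat template that setup_chat_format attached to the tokenizer
print(tokenizer.chat_template)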
# Define messages for SmolLM2
messages = [
{"role": "user", "content": "Hello, how are you?"},
{
"role": "assistant",
"content": "I'm doing well, thank you! How can I assist you today?",
},
]
The tokenizer represents the conversation as a string with special tokens to describe the role of the user and the assistant.
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
print("Conversation with template:", input_text)
Conversation with template: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing well, thank you! How can I assist you today?<|im_end|>
Note that when we tokenize with `add_generation_prompt=True`, the conversation is represented as above, but with an empty assistant turn appended to cue the model to generate the next response.
input_text = tokenizer.apply_chat_template(
messages, tokenize=True, add_generation_prompt=True
)
print("Conversation decoded:", tokenizer.decode(token_ids=input_text))
Conversation decoded: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing well, thank you! How can I assist you today?<|im_end|>
<|im_start|>assistant
Of course, the tokenizer also converts the conversation and special tokens into IDs that map to the model's vocabulary.
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print("Conversation tokenized:", input_text)
Conversation tokenized: [1, 4093, 198, 19556, 28, 638, 359, 346, 47, 2, 198, 1, 520, 9531, 198, 57, 5248, 2567, 876, 28, 9984, 346, 17, 1073, 416, 339, 4237, 346, 1834, 47, 2, 198, 1, 520, 9531, 198]
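To see how these IDs line up with the special tokens, you can map each ID back to its token string:
# Map each token id back to its string form to reveal the special tokens
print(tokenizer.convert_ids_to_tokens(input_text))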
Take a dataset from the Hugging Face Hub and process it for supervised fine-tuning (SFT).
Difficulty Levels
🐢 Convert the `HuggingFaceTB/smoltalk` dataset into chatml format.
🐕 Convert the `openai/gsm8k` dataset into chatml format.
from IPython.display import display, HTML
display(
HTML(
"""<iframe
src="https://huggingface.co/datasets/HuggingFaceTB/smoltalk/embed/viewer/all/train?row=0"
frameborder="0"
width="100%"
height="360px"
></iframe>
"""
)
)
from datasets import load_dataset
ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
def process_dataset(sample):
# TODO: 🐢 Convert the sample into a chat format
# use the tokenizer's method to apply the chat template
return sample
ds = ds.map(process_dataset)
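One possible solution for the 🐢 exercise, as a sketch (the `chatml_text` column name is our own choice): `smoltalk` samples already carry a `messages` list of role/content dicts, so we only need to render it with the tokenizer's chat template.
def process_dataset(sample):
    # smoltalk samples already hold a "messages" list in role/content form,
    # so applying the chat template renders them as a single chatml string
    sample["chatml_text"] = tokenizer.apply_chat_template(
        sample["messages"], tokenize=False
    )
    return sample

ds = ds.map(process_dataset)
print(ds["train"][0]["chatml_text"])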
display(
HTML(
"""<iframe
src="https://huggingface.co/datasets/openai/gsm8k/embed/viewer/main/train"
frameborder="0"
width="100%"
height="360px"
></iframe>
"""
)
)
ds = load_dataset("openai/gsm8k", "main")
def process_dataset(sample):
# TODO: 🐕 Convert the sample into a chat format
# 1. create a message format with the role and content
# 2. apply the chat template to the samples using the tokenizer's method
return sample
ds = ds.map(process_dataset)
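One possible solution for the 🐕 exercise, again a sketch with a made-up `chatml_text` column: `gsm8k` samples have `question` and `answer` fields, which we wrap in user and assistant roles before applying the template.
def process_dataset(sample):
    # Build a two-turn conversation from the question/answer pair,
    # then render it with the chat template
    messages = [
        {"role": "user", "content": sample["question"]},
        {"role": "assistant", "content": sample["answer"]},
    ]
    sample["chatml_text"] = tokenizer.apply_chat_template(messages, tokenize=False)
    return sample

ds = ds.map(process_dataset)
print(ds["train"][0]["chatml_text"])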
This notebook demonstrated how to apply chat templates to the SmolLM2 model. By structuring interactions with chat templates, we can ensure that AI models provide consistent and contextually relevant responses.
In the exercises, you tried converting a dataset into chatml format. Luckily, TRL will do this for you, but it's useful to understand what's going on under the hood.
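For example, TRL's `SFTTrainer` can consume a conversational dataset (one with a `messages` column) directly and apply the chat template internally. A minimal sketch, assuming a recent TRL version (argument names have shifted between releases):
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load a conversational dataset (it has a "messages" column) and let
# SFTTrainer apply the chat template under the hood
train_ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="./sft_output", max_steps=100),
    train_dataset=train_ds,
)
# trainer.train()  # commented out: full training is beyond this notebook's scope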