This tutorial is divided into two parts:
1. Creating an inference pipeline with OpenVINO through the HuggingFace Optimum library.
2. Quantizing the model and comparing it against the base model.
Feel free to use the notebook outline in Jupyter or your IDE for easy navigation.
Programming language classification is the task of identifying which programming language is used in an arbitrary code snippet. This can be useful for labeling new data to include in a dataset, and can also serve as an intermediary step when input snippets need to be processed based on their programming language.
It is a relatively easy machine learning task given that each programming language has its own formal symbols, syntax, and grammar. However, there are some potential edge cases: for instance, the `:=` operator was a symbol distinctively used in Golang, but was later introduced in Python 3.8.

The classification model that will be used in this notebook is CodeBERTa-language-id by HuggingFace. This model was fine-tuned from the masked language modeling model CodeBERTa-small-v1, which was trained on the CodeSearchNet dataset (Husain, 2019).
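As a concrete illustration of that edge case, both of the following made-up snippets contain `:=`, so the model has to rely on the surrounding syntax to tell the languages apart:

# Hypothetical snippets illustrating the `:=` edge case (illustrative only)
go_snippet = "if n := len(items); n > 0 { fmt.Println(n) }"   # Go short variable declaration
python_snippet = "if (n := len(items)) > 0: print(n)"         # Python 3.8+ assignment expression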
It supports 6 programming languages: Go, Java, JavaScript, PHP, Python, and Ruby.
For this section, we will use the HuggingFace Optimum library, which aims to optimize inference on specific hardware and integrates with the OpenVINO toolkit. The code will be very similar to that of HuggingFace Transformers, but it will automatically convert models to the OpenVINO™ IR format.
First, complete the repository installation steps.
Then, the following cell will install the required packages:
%pip install -q "diffusers>=0.17.1" "openvino>=2023.1.0" "nncf>=2.5.0" "gradio" "onnx>=1.11.0" "transformers>=4.33.0" "evaluate" --extra-index-url https://download.pytorch.org/whl/cpu
%pip install -q "git+https://github.com/huggingface/optimum-intel.git"
The `OVModelForSequenceClassification` import from Optimum is the equivalent of `AutoModelForSequenceClassification` from Transformers.
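For illustration only (the notebook loads the model with local caching further below), the two APIs mirror each other; `export=True` tells Optimum to convert the PyTorch checkpoint to the OpenVINO IR format:

from transformers import AutoModelForSequenceClassification
from optimum.intel import OVModelForSequenceClassification

# Same `from_pretrained` interface; only the backend changes
pt_model = AutoModelForSequenceClassification.from_pretrained("huggingface/CodeBERTa-language-id")
ov_model = OVModelForSequenceClassification.from_pretrained("huggingface/CodeBERTa-language-id", export=True)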
from functools import partial
from pathlib import Path
import pandas as pd
from datasets import load_dataset, Dataset
import evaluate
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
from optimum.intel import OVModelForSequenceClassification
from optimum.intel.openvino import OVConfig, OVQuantizer
from huggingface_hub.utils import RepositoryNotFoundError
2023-08-07 09:42:12.312320: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 2023-08-07 09:42:12.350853: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 2023-08-07 09:42:13.079480: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, onnx, openvino
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
MODEL_NAME = "CodeBERTa-language-id"
MODEL_ID = f"huggingface/{MODEL_NAME}"
MODEL_LOCAL_PATH = Path("./model").joinpath(MODEL_NAME)
Select the device from the dropdown list for running inference using OpenVINO.
import ipywidgets as widgets
import openvino as ov
core = ov.Core()
device = widgets.Dropdown(
    options=core.available_devices + ["AUTO"],
    value='AUTO',
    description='Device:',
    disabled=False,
)
device
Dropdown(description='Device:', index=2, options=('CPU', 'GPU', 'AUTO'), value='AUTO')
# try to load resources locally
try:
    model = OVModelForSequenceClassification.from_pretrained(MODEL_LOCAL_PATH, device=device.value)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)
    print(f"Loaded resources from local path: {MODEL_LOCAL_PATH.absolute()}")
# if not found, download from HuggingFace Hub then save locally
except (RepositoryNotFoundError, OSError):
    print("Downloading resources from HuggingFace Hub")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    tokenizer.save_pretrained(MODEL_LOCAL_PATH)
    # export=True is needed to convert the PyTorch model to OpenVINO
    model = OVModelForSequenceClassification.from_pretrained(MODEL_ID, export=True, device=device.value)
    model.save_pretrained(MODEL_LOCAL_PATH)
    print(f"Resources cached locally at: {MODEL_LOCAL_PATH.absolute()}")
Downloading resources from HuggingFace Hub
Downloading (…)okenizer_config.json: 0%| | 0.00/19.0 [00:00<?, ?B/s]
Downloading (…)lve/main/config.json: 0%| | 0.00/756 [00:00<?, ?B/s]
Downloading (…)olve/main/vocab.json: 0%| | 0.00/994k [00:00<?, ?B/s]
Downloading (…)olve/main/merges.txt: 0%| | 0.00/483k [00:00<?, ?B/s]
Framework not specified. Using pt to export to ONNX.
Downloading pytorch_model.bin: 0%| | 0.00/336M [00:00<?, ?B/s]
Some weights of the model checkpoint at huggingface/CodeBERTa-language-id were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias'] - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). Using framework PyTorch: 2.0.1+cpu Overriding 1 configuration item(s) - use_cache -> False
============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ============== verbose: False, log level: Level.ERROR ======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Compiling the model... Set CACHE_DIR to /tmp/tmp9yj8fta_/model_cache
Resources cached locally at: /home/ea/work/openvino_notebooks/notebooks/247-code-language-id/model/CodeBERTa-language-id
code_classification_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers pip install xformers.
# change input snippet to test model
input_snippet = "df['speed'] = df.distance / df.time"
output = code_classification_pipe(input_snippet)
print(f"Input snippet:\n {input_snippet}\n")
print(f"Predicted label: {output[0]['label']}")
print(f"Predicted score: {output[0]['score']:.2}")
Input snippet:
 df['speed'] = df.distance / df.time

Predicted label: python
Predicted score: 0.81
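The same pipeline also accepts a list of snippets. A minimal sketch with a few made-up snippets from other supported languages (predicted labels and scores will depend on the model):

# Hypothetical snippets in other supported languages (illustrative only)
snippets = [
    "func add(a int, b int) int { return a + b }",   # Go
    "console.log(items.map(x => x * 2));",           # JavaScript
    "puts [1, 2, 3].map { |x| x * 2 }.inspect",      # Ruby
]

for snippet, prediction in zip(snippets, code_classification_pipe(snippets)):
    print(f"{prediction['label']:>10} ({prediction['score']:.2f}): {snippet}")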
In this section, we will quantize a trained model. At a high level, this process consists of using lower-precision numbers in the model, which results in a smaller model size and faster inference at the cost of a potentially marginal degradation in performance. Learn more.
The HuggingFace Optimum library supports post-training quantization for OpenVINO. Learn more.
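As a rough illustration of what "lower-precision numbers" means in practice, here is a minimal sketch of symmetric INT8 quantization of a tiny weight tensor (illustrative values; not the exact scheme NNCF applies to this model):

import numpy as np

# Hypothetical FP32 weights (4 bytes per value)
weights = np.array([-0.42, 0.07, 0.31, -0.11], dtype=np.float32)

# Map the float range onto the signed 8-bit range [-127, 127] (1 byte per value)
scale = np.abs(weights).max() / 127
quantized = np.round(weights / scale).astype(np.int8)

# At inference time, values are mapped back, with a small rounding error
dequantized = quantized.astype(np.float32) * scale
print(quantized)     # [-127   21   94  -33]
print(dequantized)   # close to the original weights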
QUANTIZED_MODEL_LOCAL_PATH = MODEL_LOCAL_PATH.with_name(f"{MODEL_NAME}-quantized")
DATASET_NAME = "code_search_net"
LABEL_MAPPING = {"go": 0, "java": 1, "javascript": 2, "php": 3, "python": 4, "ruby": 5}
def preprocess_function(examples: dict, tokenizer):
    """Preprocess inputs by tokenizing the `func_code_string` column"""
    return tokenizer(
        examples["func_code_string"],
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
    )


def map_labels(example: dict) -> dict:
    """Convert string labels to integers"""
    label_mapping = {"go": 0, "java": 1, "javascript": 2, "php": 3, "python": 4, "ruby": 5}
    example["language"] = label_mapping[example["language"]]
    return example


def get_dataset_sample(dataset_split: str, num_samples: int) -> Dataset:
    """Create a sample with equal representation of each class without downloading the entire data"""
    labels = ["go", "java", "javascript", "php", "python", "ruby"]
    example_per_label = num_samples // len(labels)

    examples = []
    for label in labels:
        subset = load_dataset("code_search_net", split=dataset_split, name=label, streaming=True)
        subset = subset.map(map_labels)
        examples.extend([example for example in subset.shuffle().take(example_per_label)])

    return Dataset.from_list(examples)
NOTE: the base model is loaded using `AutoModelForSequenceClassification` from Transformers.
tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)
base_model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
quantizer = OVQuantizer.from_pretrained(base_model)
quantization_config = OVConfig()
Some weights of the model checkpoint at huggingface/CodeBERTa-language-id were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias'] - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
The `get_dataset_sample()` function will sample up to `num_samples`, with an equal number of examples across the 6 programming languages.
NOTE: Uncomment the method below to download and use the full dataset (5+ GB).
calibration_sample = get_dataset_sample(dataset_split="train", num_samples=120)
calibration_sample = calibration_sample.map(partial(preprocess_function, tokenizer=tokenizer))
# calibration_sample = quantizer.get_calibration_dataset(
#     DATASET_NAME,
#     preprocess_function=partial(preprocess_function, tokenizer=tokenizer),
#     num_samples=120,
#     dataset_split="train",
#     preprocess_batch=True,
# )
Downloading builder script: 0%| | 0.00/8.44k [00:00<?, ?B/s]
Downloading metadata: 0%| | 0.00/18.5k [00:00<?, ?B/s]
Downloading readme: 0%| | 0.00/12.9k [00:00<?, ?B/s]
Map: 0%| | 0/120 [00:00<?, ? examples/s]
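Before quantizing, it can be useful to sanity-check one calibration record. A minimal sketch (the field names come from the code_search_net dataset and from the tokenizer output; printed values will vary):

# Peek at one calibration record: integer label, raw code string, and tokenized inputs
record = calibration_sample[0]
print(record["language"])                # integer label, e.g. 4 for "python"
print(record["func_code_string"][:80])   # beginning of the raw code snippet
print(len(record["input_ids"]))          # padded/truncated to the tokenizer's max length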
Calling `quantizer.quantize(...)` will iterate through the calibration dataset to quantize and save the model.
quantizer.quantize(
    quantization_config=quantization_config,
    calibration_dataset=calibration_sample,
    save_directory=QUANTIZED_MODEL_LOCAL_PATH,
)
INFO:nncf:Not adding activation input quantizer for operation: 12 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[token_type_embeddings]/embedding_0 INFO:nncf:Not adding activation input quantizer for operation: 11 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[word_embeddings]/embedding_0 INFO:nncf:Not adding activation input quantizer for operation: 3 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/ne_0 INFO:nncf:Not adding activation input quantizer for operation: 4 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/int_0 INFO:nncf:Not adding activation input quantizer for operation: 5 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/cumsum_0 INFO:nncf:Not adding activation input quantizer for operation: 13 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___2 INFO:nncf:Not adding activation input quantizer for operation: 6 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/type_as_0 INFO:nncf:Not adding activation input quantizer for operation: 7 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 8 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__mul___0 INFO:nncf:Not adding activation input quantizer for operation: 9 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/long_0 INFO:nncf:Not adding activation input quantizer for operation: 10 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___1 INFO:nncf:Not adding activation input quantizer for operation: 14 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[position_embeddings]/embedding_0 INFO:nncf:Not adding activation input quantizer for operation: 15 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__iadd___0 INFO:nncf:Not adding activation input quantizer for operation: 16 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 17 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/Dropout[dropout]/dropout_0 INFO:nncf:Not adding activation input quantizer for operation: 30 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 33 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 39 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 40 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 
INFO:nncf:Not adding activation input quantizer for operation: 45 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 46 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 59 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 62 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 68 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 69 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 74 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 75 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 88 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 91 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 97 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 98 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 103 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 104 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 117 
RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 120 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 126 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 127 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 132 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 133 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 146 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 149 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 155 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 156 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 161 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 162 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 175 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 178 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1 INFO:nncf:Not adding activation input quantizer for operation: 184 
RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 185 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Not adding activation input quantizer for operation: 190 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaOutput[output]/__add___0 INFO:nncf:Not adding activation input quantizer for operation: 191 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0 INFO:nncf:Collecting tensor statistics |█ | 33 / 300 INFO:nncf:Collecting tensor statistics |███ | 66 / 300 INFO:nncf:Collecting tensor statistics |█████ | 99 / 300 INFO:nncf:Compiling and loading torch extension: quantized_functions_cpu... huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false) huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false) huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false) huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false) INFO:nncf:Finished loading torch extension: quantized_functions_cpu
/home/ea/work/ov_notebooks_env/lib/python3.8/site-packages/nncf/torch/nncf_network.py:938: FutureWarning: Old style of accessing NNCF-specific attributes and methods on NNCFNetwork objects is deprecated. Access the NNCF-specific attrs through the NNCFInterface, which is set up as an `nncf` attribute on the compressed model object. For instance, instead of `compressed_model.get_graph()` you should now write `compressed_model.nncf.get_graph()`. The old style will be removed after NNCF v2.5.0 warning_deprecated( Using framework PyTorch: 2.0.1+cpu Overriding 1 configuration item(s) - use_cache -> False
WARNING:nncf:You are setting `forward` on an NNCF-processed model object. NNCF relies on custom-wrapping the `forward` call in order to function properly. Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour. If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling: model.nncf.set_original_unbound_forward(fn) if `fn` has an unbound 0-th `self` argument, or with model.nncf.temporary_bound_original_forward(fn): ... if `fn` already had 0-th `self` argument bound or never had it in the first place. ============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ============== verbose: False, log level: Level.ERROR ======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ======================== WARNING:nncf:You are setting `forward` on an NNCF-processed model object. NNCF relies on custom-wrapping the `forward` call in order to function properly. Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour. If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling: model.nncf.set_original_unbound_forward(fn) if `fn` has an unbound 0-th `self` argument, or with model.nncf.temporary_bound_original_forward(fn): ... if `fn` already had 0-th `self` argument bound or never had it in the first place.
Configuration saved in model/CodeBERTa-language-id-quantized/openvino_config.json
NOTE: the argument `export=True` is not required, since the quantized model is already in the OpenVINO format.
quantized_model = OVModelForSequenceClassification.from_pretrained(QUANTIZED_MODEL_LOCAL_PATH, device=device.value)
quantized_code_classification_pipe = pipeline("text-classification", model=quantized_model, tokenizer=tokenizer)
Compiling the model... Set CACHE_DIR to model/CodeBERTa-language-id-quantized/model_cache
input_snippet = "df['speed'] = df.distance / df.time"
output = quantized_code_classification_pipe(input_snippet)
print(f"Input snippet:\n {input_snippet}\n")
print(f"Predicted label: {output[0]['label']}")
print(f"Predicted score: {output[0]['score']:.2}")
Input snippet:
 df['speed'] = df.distance / df.time

Predicted label: python
Predicted score: 0.81
NOTE: Uncomment the method below to download and use the full dataset (5+ GB).
validation_sample = get_dataset_sample(dataset_split="validation", num_samples=120)
# validation_sample = load_dataset(DATASET_NAME, split="validation")
# This class is needed due to a current limitation of the Evaluate library with multiclass metrics
# ref: https://discuss.huggingface.co/t/combining-metrics-for-multiclass-predictions-evaluations/21792/16
class ConfiguredMetric:
    def __init__(self, metric, *metric_args, **metric_kwargs):
        self.metric = metric
        self.metric_args = metric_args
        self.metric_kwargs = metric_kwargs

    def add(self, *args, **kwargs):
        return self.metric.add(*args, **kwargs)

    def add_batch(self, *args, **kwargs):
        return self.metric.add_batch(*args, **kwargs)

    def compute(self, *args, **kwargs):
        return self.metric.compute(*args, *self.metric_args, **kwargs, **self.metric_kwargs)

    @property
    def name(self):
        return self.metric.name

    def _feature_names(self):
        return self.metric._feature_names()
First, an `Evaluator` object for `text-classification` and a set of `EvaluationModule` metrics are instantiated. Then, the evaluator's `.compute()` method is called on both the base `code_classification_pipe` and the quantized `quantized_code_classification_pipe`. Finally, the results are displayed.
code_classification_evaluator = evaluate.evaluator("text-classification")
# instantiate an object that can contain multiple `evaluate` metrics
metrics = evaluate.combine([
    ConfiguredMetric(evaluate.load('f1'), average='macro'),
])
base_results = code_classification_evaluator.compute(
    model_or_pipeline=code_classification_pipe,
    data=validation_sample,
    input_column="func_code_string",
    label_column="language",
    label_mapping=LABEL_MAPPING,
    metric=metrics,
)

quantized_results = code_classification_evaluator.compute(
    model_or_pipeline=quantized_code_classification_pipe,
    data=validation_sample,
    input_column="func_code_string",
    label_column="language",
    label_mapping=LABEL_MAPPING,
    metric=metrics,
)
results_df = pd.DataFrame.from_records([base_results, quantized_results], index=["base", "quantized"])
results_df
Downloading builder script: 0%| | 0.00/6.77k [00:00<?, ?B/s]
|           | f1  | total_time_in_seconds | samples_per_second | latency_in_seconds |
|-----------|-----|-----------------------|--------------------|--------------------|
| base      | 1.0 | 2.334411              | 51.404821          | 0.019453           |
| quantized | 1.0 | 2.234008              | 53.715113          | 0.018617           |
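Beyond latency, quantization also reduces the model size on disk. A minimal sketch for checking this locally (it assumes both models were saved with the default openvino_model.bin weight file name used by Optimum Intel):

# Compare on-disk weight sizes of the base and quantized OpenVINO models
for name, path in [("base", MODEL_LOCAL_PATH), ("quantized", QUANTIZED_MODEL_LOCAL_PATH)]:
    weights_file = path / "openvino_model.bin"   # assumed default file name
    if weights_file.exists():
        print(f"{name:>10}: {weights_file.stat().st_size / 1024 ** 2:.1f} MB")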
Uncomment and run the cell below to delete all resources cached locally in `./model`.
# import os
# import shutil

# try:
#     shutil.rmtree(path=QUANTIZED_MODEL_LOCAL_PATH)
#     shutil.rmtree(path=MODEL_LOCAL_PATH)
#     os.remove(path="./compressed_graph.dot")
#     os.remove(path="./original_graph.dot")
# except FileNotFoundError:
#     print("Directory was already deleted")