As discussed in the previous segment, Hugging Face offers the Transformers library API to democratize the use of state-of-the-art NLP models. The most significant advantage of such an API is the abstraction it provides, which allows developers to start using the library quickly.
Pipelines are a great and easy way to use all types of models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including named entity recognition, masked language modelling, sentiment analysis, feature extraction and question answering.
The pipeline() function is the most powerful function offered by the API. It encapsulates all other pipelines and handles everything, including converting raw text into a set of predictions from a fine-tuned model.
You can download the notebook used in this segment from here.
In the upcoming video, you will learn more about the pipeline() function.
As explained in the video above, the pipeline() function allows you to perform different tasks. Here, the function is used for text-generation.
from transformers import pipeline
Example
generator = pipeline("text-generation")
generator("In the galaxy far far")
Output
[{'generated_text': 'In the galaxy far far, far away in the far future, a strange, strange and beautiful force is taking over the galaxy.\n\nIt does not know how this is happening." \n\n- Roddenberry, The Hitch'}]
The pipeline() function allows you to easily download any pre-trained model and perform an NLP task in just two lines of code.
By default, the code above downloads the gpt2 model. However, you may change the model as per your requirement.
Example
generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In the galaxy far far",
    max_length=30,
    num_return_sequences=2,
)
Output
[{'generated_text': 'In the galaxy far far away, which mean a few million years ago. Now it’s looking at how much it can melt - maybe'}, {'generated_text': "In the galaxy far far, far away, it’s very familiar to the ancient history"}]
Here, we are using the distilgpt2 model, which is a lighter variant of the gpt2 model.
Also, you can pass custom arguments to control the output of the code given above. Here, max_length denotes the maximum length of the generated text, and num_return_sequences denotes the number of output sequences the model should return.
Apart from text generation, you can do masked language modelling, where the downloaded model will predict masked words.
Example
unmasker = pipeline("fill-mask")
unmasker("You are going to <mask> about a wonderful library today.", top_k=2)
Output
As the code above asks for the top two predictions for the <mask> token, the model returns the following output as a list of dictionaries, each with its prediction score.
[{'score': 0.5679107308387756, 'token': 1798, 'token_str': ' hear', 'sequence': 'You are going to hear about a wonderful library today.'}, {'score': 0.22818315029144287, 'token': 1532, 'token_str': ' learn', 'sequence': 'You are going to learn about a wonderful library today.'}]
Here, the model has predicted “hear” and “learn” as the top two predictions for the masked word.
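Conceptually, top_k simply sorts the model’s scores over candidate tokens and keeps the k highest. The following is an illustrative sketch only: the candidate words and scores are made up for demonstration, whereas a real model scores every token in its vocabulary.

```python
# Made-up scores for a few candidate fill-ins of the <mask> token.
# A real fill-mask model scores its entire vocabulary.
scores = {"hear": 0.57, "learn": 0.23, "read": 0.11, "know": 0.05}

def top_k(scores, k):
    # Sort candidates by score, highest first, and keep the top k.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

print(top_k(scores, k=2))  # [('hear', 0.57), ('learn', 0.23)]
```

With top_k=2, only the two highest-scoring candidates survive, which is exactly why the pipeline above returned two dictionaries.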
Let’s see what happens behind the scenes when you use the pipeline() function.
Let’s take a simple example of sentiment classification:
Example
input_sentences = [
    "I don't like this movie",
    "Upgrad is helping me learn new and wonderful things.",
]
classifier = pipeline("sentiment-analysis")
classifier(input_sentences)
Output
[{'label': 'NEGATIVE', 'score': 0.9839025139808655}, {'label': 'POSITIVE', 'score': 0.9998325109481812}]
As you may observe, the model has returned a NEGATIVE sentiment for the first sentence and a POSITIVE sentiment for the second sentence.
The pipeline() function encapsulates the pre-processing, modelling and post-processing steps. Therefore, it can convert the input sentence into predictions easily.
Let’s take a look at these steps.
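These three stages can be sketched as plain Python functions before looking at the real library calls. Everything below is a stand-in (a toy vocabulary and a hard-coded pair of logits) purely to show the data flow that pipeline() hides; it is not how Transformers implements any of these steps.

```python
import math

# Stage 1: pre-processing — a toy tokenizer mapping words to ids.
# Real tokenizers are far more sophisticated (subwords, special tokens).
vocab = {"i": 1, "don't": 2, "like": 3, "this": 4, "movie": 5}

def preprocess(sentence):
    return [vocab.get(word, 0) for word in sentence.lower().split()]

# Stage 2: the model — a stub returning fixed logits, standing in
# for a real fine-tuned network.
def model(input_ids):
    return [2.24, -1.87]  # made-up [NEGATIVE, POSITIVE] logits

# Stage 3: post-processing — softmax over the logits, then label lookup.
def postprocess(logits):
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    labels = {0: "NEGATIVE", 1: "POSITIVE"}
    best = probs.index(max(probs))
    return {"label": labels[best], "score": probs[best]}

result = postprocess(model(preprocess("I don't like this movie")))
print(result)  # {'label': 'NEGATIVE', 'score': 0.98...}
```

The rest of this segment walks through the real versions of these three stages one at a time.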
As you saw earlier, the pipeline() function first uses a tokenizer to convert the raw text into its numerical representation. However, each model uses a different tokenization technique, which you may not be aware of. This is where the AutoTokenizer.from_pretrained() method comes into the picture. It automatically identifies the relevant tokenization technique for the specified model using the model’s checkpoint name.
The AutoTokenizer.from_pretrained() method can fetch the data associated with the model’s tokenizer and cache it for re-use.
from transformers import AutoTokenizer

model = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model)
Once the tokenizer function is initialized with the checkpoint/model name, you can feed your input sentence to pre-process it as per the model’s input expectations.
Example
import pprint
pp = pprint.PrettyPrinter()

inputs = tokenizer(
    input_sentences,
    padding=True,
    truncation=True,
    max_length=12,
    return_tensors="tf",
)
pp.pprint(inputs)
Output
{'attention_mask': <tf.Tensor: shape=(2, 12), dtype=int32, numpy= array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int32)>, 'input_ids': <tf.Tensor: shape=(2, 12), dtype=int32, numpy= array([[ 101, 1045, 2123, 1005, 1056, 2066, 2023, 3185, 102, 0, 0, 0], [ 101, 2039, 16307, 2003, 5094, 2033, 4553, 2047, 1998, 6919, 2477, 102]], dtype=int32)>}
The returned output is a dictionary containing two keys: ‘attention_mask’ and ‘input_ids’.
The last three 0s in [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0] in the attention_mask mark the padded positions, added because padding=True and max_length=12; the corresponding trailing 0s in input_ids are the pad token ids.
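The relationship between padding and the attention mask can be reproduced without the library. This minimal sketch (the `pad_batch` helper is hypothetical, written here only to mirror what the tokenizer does) pads token-id lists to a fixed length with pad id 0 and builds the matching mask:

```python
def pad_batch(batch_ids, max_length, pad_id=0):
    """Pad each id list to max_length and build the attention mask:
    1 for real tokens, 0 for padding."""
    input_ids, attention_mask = [], []
    for ids in batch_ids:
        ids = ids[:max_length]                    # truncation
        n_pad = max_length - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)  # padding
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Two sequences of lengths 9 and 12, mirroring the example above
# (the ids here are placeholders, not real vocabulary ids).
batch = pad_batch([list(range(101, 110)), list(range(101, 113))], max_length=12)
print(batch["attention_mask"][0])  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
```

The mask tells the model which positions carry real tokens, so the padded zeros do not influence the prediction.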
Once the raw input is pre-processed, we can feed it to the model to get the predictions. We can download the model in the same way as we downloaded the tokenizer. For this, the Transformers API provides the TFAutoModel class with the from_pretrained() method.
In this case, we will use TFAutoModelForSequenceClassification to download a distilbert model with a sequence classification head.
Example
from transformers import TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(inputs)
pp.pprint(outputs.logits.shape)
Output
After the model produces the output, you may observe the shape of the output.
TensorShape([2, 2])
Here, the number of rows equals the number of input sentences fed to the model, and the number of columns equals the number of output labels.
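Which column corresponds to which label is stored in the model’s configuration (model.config.id2label in Transformers; for this sentiment checkpoint it maps 0 to NEGATIVE and 1 to POSITIVE). The sketch below writes that dictionary out by hand to stay self-contained, and uses the logit values produced for this example to show how picking the larger column per row recovers the labels:

```python
# Hand-written copy of model.config.id2label for
# distilbert-base-uncased-finetuned-sst-2-english.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

# The two logit rows produced for the example sentences.
logits = [[2.2426074, -1.870255], [-4.1284227, 4.432841]]

# Each row -> index of its largest logit -> label.
predicted = [id2label[row.index(max(row))] for row in logits]
print(predicted)  # ['NEGATIVE', 'POSITIVE']
```

This is essentially the label-lookup step that the sentiment-analysis pipeline performs for you after the model runs.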
You can also visualize the prediction score using:
Example
pp.pprint(outputs.logits)
Output
<tf.Tensor: shape=(2, 2), dtype=float32, numpy= array([[ 2.2426074, -1.870255 ], [-4.1284227, 4.432841 ]], dtype=float32)>
These raw logits can be normalized into probabilities using a softmax function.
Example
import tensorflow as tf

predictions = tf.math.softmax(outputs.logits, axis=-1)
pp.pprint(predictions)
Output
<tf.Tensor: shape=(2, 2), dtype=float32, numpy= array([[9.8390251e-01, 1.6097505e-02], [1.9134062e-04, 9.9980873e-01]], dtype=float32)>
The output indicates that the first sentence is classified as NEGATIVE because the first column has the higher value.
The second sentence is classified as POSITIVE because the second column has the higher value.
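The softmax numbers above can be checked by hand: softmax divides the exponential of each logit by the sum of exponentials across its row. A quick sketch using only the standard library reproduces the probabilities from the tf.math.softmax output:

```python
import math

def softmax(row):
    # Subtracting the row maximum first is the standard
    # numerical-stability trick; it does not change the result.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    return [e / sum(exps) for e in exps]

# The logit rows produced for the example sentences.
logits = [[2.2426074, -1.870255], [-4.1284227, 4.432841]]
probs = [softmax(row) for row in logits]
print(probs[0])  # ~[0.9839, 0.0161], matching the tf.math.softmax output
```

These are exactly the scores the sentiment-analysis pipeline reported at the start of this example, which confirms that pipeline() is doing nothing more than these pre-processing, modelling and post-processing steps.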
In this segment, you developed a good understanding of how a pipeline takes raw input and converts it into probability scores. In the next segment, you will see what happens inside a tokenizer.