Deploy a pretrained PyTorch BERT model from Hugging Face Hub on Amazon SageMaker for sentiment analysis




Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine.

BERT was trained on BookCorpus and English Wikipedia data, which contain 800 million words and 2,500 million words, respectively. Training BERT from scratch would be prohibitively expensive. By taking advantage of transfer learning, you can quickly fine-tune BERT for another use case with a relatively small amount of training data and achieve state-of-the-art results on common NLP tasks, such as text classification and question answering.

Amazon SageMaker is a fully managed service that provides developers and data scientists with the ability to build, train, and deploy machine learning (ML) models quickly. Amazon SageMaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high-quality models. The SageMaker Python SDK provides open source APIs and containers that make it easy to train and deploy models in Amazon SageMaker with several machine learning and deep learning frameworks.

Our customers often ask for quick fine-tuning and easy deployment of their NLP models.

In this notebook, you will deploy a pretrained PyTorch BERT model from Hugging Face Hub on Amazon SageMaker for sentiment analysis.

You’ll execute the following steps:

  • Initiate a Hugging Face pipeline (https://huggingface.co/transformers/main_classes/pipelines.html) and save the model and config on the local file system.

  • Tar and gzip the model and config files, and upload model.tar.gz to an S3 bucket.

  • Deploy the model to a SageMaker endpoint and make a few inference requests.

  • Optionally, clean up.

Install Python packages

If you run this notebook in SageMaker Studio, make sure ipywidgets is installed and then restart the kernel. To do so, uncomment the code in the next cell and run it.

[ ]:
# %%capture
# import IPython
# import sys

# !{sys.executable} -m pip install ipywidgets
# IPython.Application.instance().kernel.do_shutdown(True)  # has to restart kernel so changes are used

Then you’ll install Transformers, a state-of-the-art natural language processing library for JAX, PyTorch, and TensorFlow.

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBERT, XLNet) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between JAX, PyTorch, and TensorFlow.

[ ]:
import sys

!{sys.executable} -m pip install transformers

Let’s start by creating a SageMaker session and specifying:

  • The S3 bucket and prefix that you want to use for the model data. This should be within the same region as the Notebook Instance, training, and hosting.

  • The IAM role ARN used to give hosting access to your data. See the documentation for how to create these. Note: if more than one role is required for notebook instances, training, and/or hosting, replace sagemaker.get_execution_role() with the appropriate full IAM role ARN string(s).

[ ]:
import os
import boto3
import sagemaker

role = sagemaker.get_execution_role()
sess = sagemaker.Session()

bucket = sess.default_bucket()
prefix = "sagemaker/pytorch-bert-sentiment-analysis"

Initiate a Hugging Face pipeline

The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the task summary for examples of use.

[ ]:
from transformers import pipeline

sentiment_analysis = pipeline("sentiment-analysis")

Save the pre-trained model to the local file system

[ ]:
sentiment_analysis.save_pretrained("./model")

Package the pre-trained model and upload it to S3

Now you can see that there is a pretrained BERT model under the model directory by listing the files in it.

[ ]:
!ls -rtlh ./model/

Now you’ll create a model.tar.gz file for the SageMaker endpoint to use.

[ ]:
!cd model && tar czvf ../model.tar.gz *
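The same archive can also be built from Python with the standard tarfile module. The sketch below is a minimal alternative, assuming the model files sit directly inside a local directory; the helper names make_model_archive and list_archive are hypothetical and not part of the original notebook.

```python
import os
import tarfile


def make_model_archive(model_dir, archive_path):
    """Pack every entry in model_dir at the root of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(model_dir)):
            # arcname=name keeps entries at the archive root (no "model/" prefix),
            # as the shell command above does. Unlike the shell glob `*`,
            # os.listdir also includes dotfiles.
            tar.add(os.path.join(model_dir, name), arcname=name)
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the archive, for a quick sanity check."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Calling make_model_archive("./model", "model.tar.gz") and then list_archive("model.tar.gz") should show the model and config files without any leading directory.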

Upload the model.tar.gz to the bucket in S3 you previously set up.

[ ]:
key = os.path.join(prefix, "model.tar.gz")
# Open the archive in a context manager so the file handle is closed after the upload.
with open("model.tar.gz", "rb") as f_obj:
    boto3.Session().resource("s3").Bucket(bucket).Object(key).upload_fileobj(f_obj)
print(os.path.join(bucket, key))
[ ]:
pretrained_model_data = "s3://{}/{}".format(bucket, key)
pretrained_model_data
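S3 keys always use forward slashes, so building them with os.path.join is only safe on POSIX systems such as SageMaker notebook instances. As a portable sketch, the hypothetical helpers s3_key and s3_uri below (not part of the original notebook) use posixpath instead; "example-bucket" is a placeholder name.

```python
import posixpath


def s3_key(prefix, filename):
    """Join a key prefix and a file name with forward slashes on any OS."""
    return posixpath.join(prefix.strip("/"), filename)


def s3_uri(bucket, key):
    """Format an s3:// URI of the kind passed as model_data."""
    return "s3://{}/{}".format(bucket, key)


# "example-bucket" is a placeholder, not a real bucket.
example_key = s3_key("sagemaker/pytorch-bert-sentiment-analysis", "model.tar.gz")
print(s3_uri("example-bucket", example_key))
```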

Write the Inference Script

To deploy a pretrained PyTorch model, you create a PyTorchModel object directly from the model archive in S3 and point it at an inference script through entry_point.

You’ll use the PyTorchModel object to deploy a PyTorchPredictor. This creates a SageMaker Endpoint – a hosted prediction service that we can use to perform inference.

The inference script must provide an implementation of model_fn. For any of input_fn, predict_fn, and output_fn that the script does not define, the default implementations in sagemaker-pytorch-containers are used.

Here’s an example of the inference script:

[ ]:
!pygmentize code/inference.py
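The script itself is not reproduced in this text. As a rough illustration of the four handler hooks that the SageMaker PyTorch serving container looks for, here is a minimal sketch. It is an assumption about how such a script could look, not the contents of code/inference.py: it assumes JSON requests carrying a string or a list of strings, and a model directory written by the pipeline's save_pretrained call.

```python
import json


def model_fn(model_dir):
    """Load the saved pipeline from the directory where model.tar.gz was unpacked."""
    # Imported lazily so the other handlers can be exercised without
    # PyTorch or Transformers installed.
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1  # first GPU if present, else CPU
    return pipeline("sentiment-analysis", model=model_dir, tokenizer=model_dir, device=device)


def input_fn(request_body, request_content_type):
    """Decode a JSON request into a string or a list of strings."""
    if request_content_type != "application/json":
        raise ValueError("Unsupported content type: {}".format(request_content_type))
    return json.loads(request_body)


def predict_fn(input_data, model):
    """Run the loaded pipeline on the decoded input."""
    return model(input_data)


def output_fn(prediction, response_content_type):
    """Encode the prediction as a JSON string."""
    if response_content_type != "application/json":
        raise ValueError("Unsupported accept type: {}".format(response_content_type))
    return json.dumps(prediction)
```

Keeping each hook this small makes it possible to test input_fn, predict_fn, and output_fn locally with a stand-in callable before deploying.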

Create a model object

You define the model object by using the SageMaker Python SDK’s PyTorchModel, passing in the S3 location of the model archive (model_data) and the entry_point. The endpoint’s entry point for inference is defined by model_fn, as seen in the preceding cell that prints inference.py. The function loads the model and sets it to use a GPU, if available.

[ ]:
from sagemaker.pytorch.model import PyTorchModel

pytorch_model = PyTorchModel(
    model_data=pretrained_model_data,
    role=role,
    framework_version="1.7.1",
    source_dir="code",
    py_version="py3",
    entry_point="inference.py",
)

Deploy the model to a SageMaker endpoint

The arguments to the deploy function allow us to set the number and type of instances that will be used for the Endpoint.

Here you will deploy the model to a single ml.m5.large instance.

[ ]:
predictor = pytorch_model.deploy(initial_instance_count=1, instance_type="ml.m5.large")

Because input_fn declares that incoming requests are JSON-encoded, we need a JSON serializer to encode the request data as a JSON string. Likewise, because the response content type is declared as JSON, we need a JSON deserializer to parse the response.

[ ]:
predictor.serializer = sagemaker.serializers.JSONSerializer()
predictor.deserializer = sagemaker.deserializers.JSONDeserializer()
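To make the round trip concrete, the sketch below mimics with the standard json module what a JSON serializer and deserializer do conceptually. The function names and the sample response body are hypothetical; the real SDK classes also take care of content-type headers and streamed response bodies.

```python
import json


def encode_request(data):
    """Turn a Python object (string or list of strings) into a JSON request body."""
    return json.dumps(data)


def decode_response(body):
    """Parse a JSON response body (bytes or str) back into Python objects."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return json.loads(body)


print(encode_request("Never allow the same bug to bite you twice."))
print(encode_request(["first sentence", "second sentence"]))
# Hypothetical response body, for illustration only.
print(decode_response(b'[{"label": "POSITIVE", "score": 0.5}]'))
```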

Test the model

Using a few samples, you can now invoke the SageMaker endpoint to get predictions.

[ ]:
result = predictor.predict("Never allow the same bug to bite you twice.")
result
[ ]:
result = predictor.predict(
    "The best part of Amazon SageMaker is that it makes machine learning easy."
)
result
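An application that does not use the SageMaker Python SDK can call the same endpoint through the boto3 sagemaker-runtime client. The sketch below is one way to do that, assuming the endpoint accepts and returns JSON; the helper name invoke_sentiment_endpoint is hypothetical, and the client is passed in so the function can be tried with a stand-in object.

```python
import json


def invoke_sentiment_endpoint(runtime_client, endpoint_name, texts):
    """Send texts to the endpoint as JSON and return the parsed JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(texts),
    )
    # response["Body"] is a streaming object; read() returns the raw bytes.
    return json.loads(response["Body"].read())


# Usage against a live endpoint (requires AWS credentials):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   invoke_sentiment_endpoint(runtime, predictor.endpoint_name, "Some text")
```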

You can also invoke the endpoint with a list of sentences.

[ ]:
result = predictor.predict(
    [
        "Never allow the same bug to bite you twice.",
        "The best part of Amazon SageMaker is that it makes machine learning easy.",
    ]
)
result
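When you send a list, it helps to line each prediction up with its input. The sketch below assumes the endpoint returns one dictionary per sentence with "label" and "score" keys, as a Transformers sentiment-analysis pipeline does; the actual response shape depends on the inference script. The helper name pair_predictions and the sample predictions are made up for illustration.

```python
def pair_predictions(sentences, predictions):
    """Zip each input sentence with its predicted label and score."""
    if len(sentences) != len(predictions):
        raise ValueError("Expected one prediction per sentence")
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(sentences, predictions)
    ]


# Made-up predictions for illustration; real scores come from the endpoint.
example_sentences = ["first sentence", "second sentence"]
example_predictions = [
    {"label": "NEGATIVE", "score": 0.5},
    {"label": "POSITIVE", "score": 0.5},
]
for row in pair_predictions(example_sentences, example_predictions):
    print("{label:<9} {score:.2f}  {text}".format(**row))
```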

Clean up

Endpoints should be deleted when no longer in use, since (per the SageMaker pricing page) they’re billed by time deployed.

[ ]:
# Deletes the endpoint this predictor is attached to and, by default, its endpoint configuration.
predictor.delete_endpoint()
