Introduction to JumpStart - Text Embedding
Note: This notebook was tested on an ml.t3.medium instance in Amazon SageMaker Studio with the Python 3 (Data Science) kernel, and on an Amazon SageMaker notebook instance with the conda_python3 kernel.
1. Set Up
[ ]:
!pip install sagemaker ipywidgets --upgrade --quiet
Permissions and environment variables
[ ]:
import sagemaker, boto3, json
from sagemaker import get_execution_role
aws_role = get_execution_role()
aws_region = boto3.Session().region_name
sess = sagemaker.Session()
2. Select a model
Here, we download the JumpStart models_manifest.json file from the JumpStart S3 bucket, select all the Text Embedding models listed in it, and choose one of them for inference.
[ ]:
from ipywidgets import Dropdown
# download JumpStart model_manifest file.
boto3.client("s3").download_file(
    f"jumpstart-cache-prod-{aws_region}", "models_manifest.json", "models_manifest.json"
)
with open("models_manifest.json", "rb") as json_file:
    model_list = json.load(json_file)
# select all the Text Embedding models from the manifest list, skipping duplicate model-ids.
text_embedding_models = []
for model in model_list:
    model_id = model["model_id"]
    if "-tcembedding-" in model_id and model_id not in text_embedding_models:
        text_embedding_models.append(model_id)
# display the model-ids in a dropdown to select a model for inference.
model_dropdown = Dropdown(
    options=text_embedding_models,
    value="tensorflow-tcembedding-bert-en-uncased-L-10-H-128-A-2-2",
    description="Select a model",
    style={"description_width": "initial"},
    layout={"width": "max-content"},
)
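The selection step above can be tried without S3 access. The sketch below applies the same substring filter to a small hand-written list that stands in for models_manifest.json. The helper name `select_model_ids`, the `version` key, and the sample entries are illustrative assumptions, not actual manifest contents.

```python
def select_model_ids(manifest, marker="-tcembedding-"):
    """Return the unique model IDs that contain `marker`, in first-seen order."""
    selected = []
    for entry in manifest:
        model_id = entry["model_id"]
        if marker in model_id and model_id not in selected:
            selected.append(model_id)
    return selected


# Hypothetical entries for illustration only; the same model ID appears twice
# to show why the loop checks for duplicates.
sample_manifest = [
    {"model_id": "tensorflow-tcembedding-bert-en-uncased-L-10-H-128-A-2-2", "version": "1.0.0"},
    {"model_id": "tensorflow-tcembedding-bert-en-uncased-L-10-H-128-A-2-2", "version": "1.1.0"},
    {"model_id": "example-ic-some-image-model", "version": "1.0.0"},
]

selected_ids = select_model_ids(sample_manifest)
print(selected_ids)
```

Only the ID containing `-tcembedding-` is kept, and it is kept once even though it appears in two entries.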
Choose a model for inference
[ ]:
display(model_dropdown)
[ ]:
# model_version="*" fetches the latest version of the model
model_id, model_version = model_dropdown.value, "*"
3. Retrieve JumpStart Artifacts & Deploy an Endpoint
We start by retrieving the deploy_image_uri, deploy_source_uri, and model_uri for the pre-trained model. To host the pre-trained model, we create an instance of sagemaker.model.Model (https://sagemaker.readthedocs.io/en/stable/api/inference/model.html) and deploy it. This may take a few minutes.
[ ]:
from sagemaker import image_uris, model_uris, script_uris, hyperparameters
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base
endpoint_name = name_from_base(f"jumpstart-example-infer-{model_id}")
inference_instance_type = "ml.p2.xlarge"
# Retrieve the inference docker container uri. This is the base TensorFlow container image for the default model above.
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,  # automatically inferred from model_id
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri. This includes all dependencies and scripts for model loading, inference handling etc.
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)
# Retrieve the model uri. This includes the model and model parameters.
model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)
# Create the SageMaker model instance
model = Model(
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    model_data=model_uri,
    entry_point="inference.py",  # entry point file in source_dir and present in deploy_source_uri
    role=aws_role,
    predictor_cls=Predictor,
    name=endpoint_name,
)
# Deploy the model. When deploying through the Model class, we pass the Predictor class
# so that deploy() returns a predictor we can use to run inference through the SageMaker API.
model_predictor = model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    predictor_cls=Predictor,
    endpoint_name=endpoint_name,
)
4. Query endpoint and parse response
[ ]:
def query(model_predictor, text):
    """Query the model predictor."""
    encoded_text = text.encode("utf-8")
    query_response = model_predictor.predict(
        encoded_text,
        {
            "ContentType": "application/x-text",
            "Accept": "application/json",
        },
    )
    return query_response


def parse_response(query_response):
    """Parse response and return the embedding."""
    model_predictions = json.loads(query_response)
    embedding = model_predictions["embedding"]
    return embedding
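To see what parse_response expects without a live endpoint, the sketch below runs the same parsing logic on a hand-made payload. The payload shape (a JSON object with an `embedding` key) is taken from the code above; the byte string and its values are made up for illustration and are not real model output.

```python
import json


def parse_response(query_response):
    """Parse the JSON response body and return the embedding list."""
    model_predictions = json.loads(query_response)
    return model_predictions["embedding"]


# Made-up stand-in for the bytes an endpoint might return (assumption, not real output).
fake_response = b'{"embedding": [0.25, -0.5, 1.0]}'

fake_embedding = parse_response(fake_response)
print(len(fake_embedding), fake_embedding[:2])
```

`json.loads` accepts both `bytes` and `str`, so the same function works whether the response body has been decoded or not.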
[ ]:
newline, bold, unbold = "\n", "\033[1m", "\033[0m"
input_text = "astonishing ... ( frames ) profound ethical and philosophical questions in the form of dazzling pop entertainment"
query_response = query(model_predictor, input_text)
embedding = parse_response(query_response)
print(
    f"{bold}Inference{unbold}:{newline}"
    f"{bold}Input text sentence{unbold}: '{input_text}'{newline}"
    f"{bold}The first 5 elements of sentence embedding{unbold}: {embedding[:5]}{newline}"
    f"{bold}Sentence embedding size{unbold}: {len(embedding)}{newline}"
)
5. Semantic Textual Similarity
One use of sentence embeddings is to cluster sentences with similar semantic meaning. In the example below, we compute the embeddings of sentences in three categories: pets, cities in the U.S., and colors. We see that sentences from the same category have much closer embedding vectors than sentences from different categories.
Note: The cosine similarity of two vectors is the inner product of the two vectors after each has been normalized (scaled to length 1).
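The note above can be checked numerically. The following sketch computes cosine similarity in two ways, directly from the definition and as the inner product of unit-length vectors, and shows that they agree. The helper `cosine_similarity` and the two toy vectors are assumptions made for illustration; they are not model embeddings.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity computed directly from its definition."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy vectors chosen so the arithmetic is easy to follow: both have length 5.
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

# Scale each vector to unit length, then take the inner product.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
via_unit_vectors = float(np.inner(a_unit, b_unit))

print(cosine_similarity(a, b), via_unit_vectors)
```

Both values equal 24/25 = 0.96 up to floating-point rounding, which is why normalizing the embeddings once lets a plain inner product serve as the similarity measure.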
[ ]:
from sklearn.preprocessing import normalize
import numpy as np
import seaborn as sns
def plot_similarity_heatmap(text_labels, embeddings, rotation):
    """Plot a heat map of the pairwise similarity between sentences.

    Args:
        text_labels: a list of sentences whose pairwise semantic similarity is plotted.
        embeddings: a list of embedding vectors, each of which corresponds to a sentence.
        rotation: rotation, in degrees, used to display the x-axis text_labels.
    """
    inner_product = np.inner(embeddings, embeddings)
    sns.set(font_scale=1.1)
    graph = sns.heatmap(
        inner_product,
        xticklabels=text_labels,
        yticklabels=text_labels,
        vmin=np.min(inner_product),
        vmax=1,
        cmap="OrRd",
    )
    graph.set_xticklabels(text_labels, rotation=rotation)
    graph.set_title("Semantic Textual Similarity Between Sentences")
sentences = [
    # Pets
    "Your dog is so cute.",
    "How cute your dog is!",
    "You have such a cute dog!",
    # Cities in the US
    "New York City is the place where I work.",
    "I work in New York City.",
    # Color
    "What color do you like the most?",
    "What is your favourite color?",
]
embeddings = []
for sentence in sentences:
    query_response = query(model_predictor, sentence)
    embedding = parse_response(query_response)
    embeddings.append(embedding)
embeddings = normalize(np.array(embeddings), axis=1)  # normalization before inner product
plot_similarity_heatmap(sentences, embeddings, 90)
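The heat map can also be read off numerically. As a minimal sketch, the hypothetical helper `most_similar` below returns, for each row of a unit-normalized embedding matrix, the index of the most similar other row. It is demonstrated on toy 2-D vectors rather than model output, so that it runs without an endpoint.

```python
import numpy as np


def most_similar(unit_embeddings):
    """For each row, return the index of the most similar other row."""
    similarity = np.inner(unit_embeddings, unit_embeddings)
    # Exclude self-matches, which always have similarity 1 for unit vectors.
    np.fill_diagonal(similarity, -np.inf)
    return similarity.argmax(axis=1)


# Toy vectors standing in for embeddings (assumption, not model output):
# rows 0 and 1 point in nearly the same direction, as do rows 2 and 3.
toy = np.array([[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]])
toy_unit = toy / np.linalg.norm(toy, axis=1, keepdims=True)

nearest = most_similar(toy_unit)
print(nearest.tolist())
```

Applied to the normalized `embeddings` array from the cell above, the same helper would give the nearest neighbor of each sentence, which can then be compared with the categories in `sentences`.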
6. Clean up the endpoint
[ ]:
# Delete the SageMaker endpoint
model_predictor.delete_model()
model_predictor.delete_endpoint()