Deploy pre-trained GluonCV SSD Mobilenet model with SageMaker Neo
Introduction
This example demonstrates how to load a pre-trained MXNet GluonCV SSD model, optimize the trained model using SageMaker Neo, and host the model.
Setup
To compile and deploy the SSD MobileNet model on Amazon SageMaker, we need to set up and authenticate the use of AWS services.
To start, upgrade the SageMaker Python SDK to v2.33.0 or later, install the latest MXNet GluonCV, and restart the kernel.
[ ]:
!~/anaconda3/envs/mxnet_p36/bin/pip install --upgrade "sagemaker>=2.33.0" gluoncv
Then we need an IAM role with SageMaker access. This role gives SageMaker access to your data in S3. We also create a SageMaker session.
[ ]:
import sagemaker
from sagemaker import get_execution_role
role = get_execution_role()
sess = sagemaker.Session()
We then need an S3 bucket for storing the pre-trained model archive and the model artifacts generated by compilation.
[ ]:
# S3 bucket and folders for saving code and model artifacts.
# Feel free to specify different bucket/folders here if you wish.
bucket = sess.default_bucket()
folder = "DEMO-ObjectDetection-SSD-MobileNet"
pretrained_model_sub_folder = folder + "/pretrained-model"
compilation_output_sub_folder = folder + "/compilation-output"
To visualize the detection outputs, we also define the following function. It draws a bounding box for each high-confidence prediction and skips detections whose score is below the threshold.
[ ]:
%matplotlib inline
def visualize_detection(img_file, dets, classes=(), thresh=0.6):
    """
    Visualize detections in one image.

    Parameters:
    ----------
    img_file : str
        path to the image file
    dets : list
        SSD detections as [class_ids, scores, bounding_boxes], each with a
        leading batch dimension; boxes are [x0, y0, x1, y1] in 512x512 input pixels
    classes : tuple or list of str
        class names
    thresh : float
        score threshold
    """
    import random
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    from matplotlib.patches import Rectangle

    img = mpimg.imread(img_file)
    plt.imshow(img)
    height = img.shape[0]
    width = img.shape[1]
    colors = dict()
    # Take the first (and only) image in the batch.
    klasses = dets[0][0]
    scores = dets[1][0]
    bbox = dets[2][0]
    # Iterate over every detection, not over the list of class names.
    for i in range(len(klasses)):
        klass = klasses[i][0]
        score = scores[i][0]
        x0, y0, x1, y1 = bbox[i]
        if score < thresh:
            continue
        cls_id = int(klass)
        if cls_id not in colors:
            colors[cls_id] = (random.random(), random.random(), random.random())
        # Rescale box corners from the 512x512 model input to the original image size.
        xmin = int(x0 * width / 512)
        ymin = int(y0 * height / 512)
        xmax = int(x1 * width / 512)
        ymax = int(y1 * height / 512)
        rect = Rectangle(
            (xmin, ymin),
            xmax - xmin,
            ymax - ymin,
            fill=False,
            edgecolor=colors[cls_id],
            linewidth=3.5,
        )
        plt.gca().add_patch(rect)
        class_name = str(cls_id)
        if classes and len(classes) > cls_id:
            class_name = classes[cls_id]
        plt.gca().text(
            xmin,
            ymin - 2,
            "{:s} {:.3f}".format(class_name, score),
            bbox=dict(facecolor=colors[cls_id], alpha=0.5),
            fontsize=12,
            color="white",
        )
    plt.tight_layout(rect=[0, 0, 2, 2])
    plt.show()
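The rescaling step inside visualize_detection can be looked at in isolation. The sketch below assumes, as the function above does, that the model reports box corners in the pixel space of the 512x512 network input. The helper name scale_box is made up for this illustration and is not part of the notebook or of GluonCV.

```python
def scale_box(box, width, height, input_size=512):
    """Map [x0, y0, x1, y1] from network-input pixels to original-image pixels."""
    x0, y0, x1, y1 = box
    return (
        int(x0 * width / input_size),
        int(y0 * height / input_size),
        int(x1 * width / input_size),
        int(y1 * height / input_size),
    )


# A box covering the central quarter of the 512x512 input, mapped onto a 1024x768 photo.
print(scale_box([128, 128, 384, 384], width=1024, height=768))  # prints (256, 192, 768, 576)
```

Because the image is resized to a square without preserving its aspect ratio, the horizontal and vertical scale factors differ whenever the original photo is not square.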
[ ]:
# Initializing object categories
object_categories = [
"aeroplane",
"bicycle",
"bird",
"boat",
"bottle",
"bus",
"car",
"cat",
"chair",
"cow",
"diningtable",
"dog",
"horse",
"motorbike",
"person",
"pottedplant",
"sheep",
"sofa",
"train",
"tvmonitor",
]
# Setting a threshold 0.20 will only plot detection results that have a confidence score greater than 0.20
threshold = 0.20
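To see what the threshold does to a response, here is a minimal sketch that filters detections outside of any plotting code. It assumes the nested layout that visualize_detection indexes (class ids, scores and boxes, each with a leading batch dimension). The helper name, the shortened class list and the sample values are invented for illustration.

```python
def filter_detections(dets, classes, thresh):
    """Return (class_name, score, box) for detections scoring at least thresh."""
    class_ids, scores, boxes = dets[0][0], dets[1][0], dets[2][0]
    kept = []
    for (cls_id,), (score,), box in zip(class_ids, scores, boxes):
        if score < thresh:
            continue
        cls_id = int(cls_id)
        # Fall back to the numeric id when no name is available.
        name = classes[cls_id] if 0 <= cls_id < len(classes) else str(cls_id)
        kept.append((name, score, box))
    return kept


# Made-up response for a single image with three candidate detections.
sample = [
    [[[2.0], [1.0], [-1.0]]],  # class ids
    [[[0.91], [0.12], [-1.0]]],  # confidence scores
    [[[10, 20, 200, 400], [5, 5, 50, 50], [-1, -1, -1, -1]]],  # boxes
]
demo_classes = ["cat", "dog", "person"]  # hypothetical, shorter than object_categories

print(filter_detections(sample, demo_classes, thresh=0.20))
# prints [('person', 0.91, [10, 20, 200, 400])]
```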
Finally, we load the test image into memory and resize it to 512x512, the input size the model expects. The test image is from Pexels; the model does not see it until prediction time.
[ ]:
import PIL.Image
import numpy as np
test_file = "test.jpg"
test_image = PIL.Image.open(test_file)
test_image = np.asarray(test_image.resize((512, 512)))
Import SSD Mobilenet model from MXNet GluonCV
This example uses a pre-trained MXNet GluonCV SSD model. SSD was initially published in: Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. SSD: Single Shot MultiBox Detector. ECCV 2016.
[ ]:
import numpy as np
import mxnet as mx
import gluoncv as gcv
import tarfile
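# Download the pre-trained SSD model from the GluonCV model zoo.
net = gcv.model_zoo.get_model("ssd_512_mobilenet1.0_voc", pretrained=True)
# Hybridize and run one forward pass so the symbolic graph can be exported.
net.hybridize()
net(mx.nd.ones((1, 3, 512, 512)))
# Writes model-symbol.json and model-0000.params.
net.export("model")
# Package both files into the archive that is uploaded to S3.
with tarfile.open("ssd_512_mobilenet1.0_voc.tar.gz", "w:gz") as tar:
    for name in ["model-0000.params", "model-symbol.json"]:
        tar.add(name)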
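Before uploading, it can be worth confirming that the archive is flat, with both files at its root, which is how the cell above packs it. The sketch uses only the standard library. The helper name is hypothetical, and the files it creates are placeholders, not real model files.

```python
import os
import tarfile
import tempfile


def archive_members(archive_path):
    """Return the sorted member names of a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


# Build a throwaway archive with placeholder files to demonstrate the check.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "demo.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    for name in ["model-0000.params", "model-symbol.json"]:
        file_path = os.path.join(workdir, name)
        with open(file_path, "wb") as f:
            f.write(b"placeholder")
        # arcname keeps the member at the archive root instead of under workdir.
        tar.add(file_path, arcname=name)

print(archive_members(archive_path))
# prints ['model-0000.params', 'model-symbol.json']
```

For the real archive, the same function can be called on "ssd_512_mobilenet1.0_voc.tar.gz".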
Upload model to S3
Upload the pre-trained model to the S3 bucket.
[ ]:
pretrained_model_path = sess.upload_data(
path="ssd_512_mobilenet1.0_voc.tar.gz", bucket=bucket, key_prefix=pretrained_model_sub_folder
)
Next, we set up the S3 locations for the pre-trained model and for the compilation output, where the respective model artifacts are stored.
[ ]:
# S3 location of the pre-trained model artifact
s3_pretrained_model_location = "s3://{}/{}".format(bucket, pretrained_model_sub_folder)
# S3 Location to save the model artifact after compilation
s3_compilation_output_location = "s3://{}/{}".format(bucket, compilation_output_sub_folder)
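Both locations are plain s3://bucket/key-prefix URIs. If you need the bucket and the key prefix separately, for example to inspect an object with another tool, the standard library can split them. The helper name and the bucket name below are illustrative only.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: {}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


# "example-bucket" is a placeholder, not the bucket returned by sess.default_bucket().
uri = "s3://{}/{}".format("example-bucket", "DEMO-ObjectDetection-SSD-MobileNet/compilation-output")
print(split_s3_uri(uri))
# prints ('example-bucket', 'DEMO-ObjectDetection-SSD-MobileNet/compilation-output')
```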
Use sagemaker MXNetModel to load pretrained MXNet model
When loading the model, the user is expected to provide the entry_point script required by the model. We also set the MMS_DEFAULT_RESPONSE_TIMEOUT environment variable to 500 for the MXNet model.
[ ]:
from sagemaker.mxnet.model import MXNetModel
from sagemaker.mxnet import MXNetPredictor
pre_trained_model = MXNetModel(
model_data=pretrained_model_path,
predictor_cls=MXNetPredictor,
framework_version="1.8",
role=role,
sagemaker_session=sess,
entry_point="ssd_entry_point.py",
py_version="py3",
env={"MMS_DEFAULT_RESPONSE_TIMEOUT": "500"},
)
Compile the pre-trained model using SageMaker Neo
After loading the pre-trained model, we can use SageMaker Neo’s compile() API to compile it. When calling compile(), the user is expected to provide the correct input shapes required by the model for successful compilation. We also specify the target instance family, our IAM execution role, and the S3 location in which the compiled model is stored.
For this example, we choose ml_p3 as the target instance family when compiling the model.
[ ]:
%%time
import time
compiled_model = pre_trained_model.compile(
job_name="ssd-512-mobilenet-{}".format(time.strftime("%Y%m%d%H%M%S")),
target_instance_family="ml_p3",
input_shape={"data": [1, 3, 512, 512]},
role=role,
framework="mxnet",
framework_version="1.8",
output_path=s3_compilation_output_location,
)
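Compilation job names need to be unique, which is why the cell above appends a timestamp. A minimal sketch of building such a name and checking it locally follows. It uses the 24-hour %H field so that names generated twelve hours apart cannot coincide. The 63-character limit and the allowed-character pattern reflect my understanding of the SageMaker naming rules, so treat them as assumptions and verify them against the API reference. The helper itself is hypothetical.

```python
import re
import time

# Assumed constraints: alphanumerics separated by hyphens, at most 63 characters.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
MAX_NAME_LENGTH = 63


def make_job_name(prefix, now=None):
    """Return '<prefix>-<YYYYmmddHHMMSS>' or raise ValueError if it looks invalid."""
    stamp = time.strftime("%Y%m%d%H%M%S", now if now is not None else time.localtime())
    name = "{}-{}".format(prefix, stamp)
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.match(name):
        raise ValueError("invalid job name: {}".format(name))
    return name


# A fixed timestamp makes the example reproducible.
fixed = time.strptime("2021-03-04 15:06:07", "%Y-%m-%d %H:%M:%S")
print(make_job_name("ssd-512-mobilenet", fixed))
# prints ssd-512-mobilenet-20210304150607
```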
Deploy the compiled model and request Inferences
We have to deploy the compiled model within the instance family for which it was compiled. Since we compiled for ml_p3, we can deploy to any ml.p3 instance type. For this example we choose ml.p3.2xlarge.
[ ]:
%%time
neo_object_detector = compiled_model.deploy(initial_instance_count=1, instance_type="ml.p3.2xlarge")
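The target_instance_family string used at compile time and the instance_type used at deploy time differ only in punctuation. The sketch below shows one way to derive the former from the latter so the two stay in sync. The helper is hypothetical and not part of the SageMaker SDK.

```python
def target_family_for(instance_type):
    """Derive a Neo target family such as 'ml_p3' from 'ml.p3.2xlarge'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError("unexpected instance type: {}".format(instance_type))
    return "{}_{}".format(parts[0], parts[1])


print(target_family_for("ml.p3.2xlarge"))  # prints ml_p3
```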
[ ]:
%%time
response = neo_object_detector.predict(test_image)
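The compile step declares an input of shape [1, 3, 512, 512] (batch, channels, height, width), while test_image is a (512, 512, 3) array. The conversion presumably happens inside ssd_entry_point.py, which is not shown here. The sketch below only illustrates the layout change with NumPy; it is not the notebook's actual preprocessing, and it omits any normalization the real script may apply.

```python
import numpy as np


def hwc_to_nchw(image):
    """Turn a (height, width, channels) array into (1, channels, height, width)."""
    if image.ndim != 3:
        raise ValueError("expected a 3-D HWC array, got shape {}".format(image.shape))
    chw = np.transpose(image, (2, 0, 1))  # move channels to the front
    return np.expand_dims(chw, axis=0)  # add the batch dimension


# A blank stand-in for the resized test image.
dummy = np.zeros((512, 512, 3), dtype=np.uint8)
print(hwc_to_nchw(dummy).shape)  # prints (1, 3, 512, 512)
```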
[ ]:
# Visualize the detections.
visualize_detection(test_file, response, object_categories, threshold)
Delete the Endpoint
Having an endpoint running will incur some costs. Therefore, as an optional clean-up job, you can delete it.
[ ]:
print("Endpoint name: " + neo_object_detector.endpoint_name)
neo_object_detector.delete_endpoint()