Amazon SageMaker Model Governance - Model Cards

This notebook walks you through the features of Amazon SageMaker Model Cards. For more information, see Model Cards in the Amazon SageMaker Developer Guide.

Amazon SageMaker Model Cards give you the ability to create a centralized, customizable fact-sheet to document critical details about your machine learning (ML) models. Use model cards to keep a record of model information, such as intended uses, risk ratings, training details, evaluation metrics, and more for streamlined governance and reporting.

In this example, you create a binary classification model along with a model card to document model details along the way. Learn how to create, read, update, delete, and export model cards using the Amazon SageMaker Python SDK.

Setup

To begin, you must specify the following information:
- The IAM role ARN used to give SageMaker training and hosting access to your data. The following example uses the SageMaker execution role.
- The SageMaker session used to manage interactions with Amazon SageMaker Model Card API methods.
- The S3 URI (bucket and prefix) where you want to store training artifacts, models, and any exported model card PDFs. This S3 bucket should be in the same Region as your notebook instance, training, and hosting configurations. The following example uses the default SageMaker S3 bucket and creates it if it does not already exist.
- The S3 client used to manage interactions with Amazon S3 storage.

[ ]:

! pip install --upgrade sagemaker

[ ]:

import boto3
from sagemaker.session import Session
from sagemaker import get_execution_role

role = get_execution_role()

sagemaker_session = Session()

bucket = sagemaker_session.default_bucket()
prefix = "model-card-sample-notebook"

region = sagemaker_session.boto_region_name
s3 = boto3.client("s3", region_name=region)

Next, import the necessary Python libraries.

[ ]:

import io
import os
import numpy as np
from urllib.parse import urlparse
from pprint import pprint
import boto3
import sagemaker
from sagemaker.image_uris import retrieve
import sagemaker.amazon.common as smac
from sagemaker.model_card import (
    ModelCard,
    ModelOverview,
    ObjectiveFunction,
    Function,
    TrainingDetails,
    IntendedUses,
    BusinessDetails,
    EvaluationJob,
    AdditionalInformation,
    Metric,
    MetricGroup,
    ModelCardStatusEnum,
    ObjectiveFunctionEnum,
    FacetEnum,
    RiskRatingEnum,
    MetricTypeEnum,
    EvaluationMetricTypeEnum,
)

Prepare a Model

The following code creates an example binary classification model trained on a synthetic dataset. Each tuple holds a feature value followed by the target variable (0 or 1).

1. Prepare the training data

The code will upload example data to your S3 bucket.

[ ]:

# synthetic data
raw_data = (
    (0.5, 0),
    (0.75, 0),
    (1.0, 0),
    (1.25, 0),
    (1.50, 0),
    (1.75, 0),
    (2.0, 0),
    (2.25, 1),
    (2.5, 0),
    (2.75, 1),
    (3.0, 0),
    (3.25, 1),
    (3.5, 0),
    (4.0, 1),
    (4.25, 1),
    (4.5, 1),
    (4.75, 1),
    (5.0, 1),
    (5.5, 1),
)
training_data = np.array(raw_data).astype("float32")
labels = training_data[:, 1]

# upload data to S3 bucket
buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, training_data, labels)
buf.seek(0)
# S3 keys always use "/" as the separator, so build the key without os.path.join
boto3.resource("s3").Bucket(bucket).Object(f"{prefix}/train").upload_fileobj(buf)
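
Before training, it can help to confirm how the two classes are distributed. The following is a minimal sketch that uses only the standard library. `sample` is a copy of the synthetic tuples above, and `split_features_labels` is a hypothetical helper that is not part of the SageMaker SDK.

```python
from collections import Counter

# Copy of the synthetic (feature, label) tuples used above.
sample = (
    (0.5, 0), (0.75, 0), (1.0, 0), (1.25, 0), (1.5, 0), (1.75, 0), (2.0, 0),
    (2.25, 1), (2.5, 0), (2.75, 1), (3.0, 0), (3.25, 1), (3.5, 0), (4.0, 1),
    (4.25, 1), (4.5, 1), (4.75, 1), (5.0, 1), (5.5, 1),
)


def split_features_labels(rows):
    """Hypothetical helper: split (feature, label) tuples into two lists."""
    features = [x for x, _ in rows]
    labels = [y for _, y in rows]
    return features, labels


features, labels = split_features_labels(sample)
class_counts = Counter(labels)  # number of rows per class label
print(f"{len(sample)} rows, class counts: {dict(class_counts)}")
```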

2. Train a model

Train a binary classification model with the training data from the previous step.

[ ]:

s3_train_data = f"s3://{bucket}/{prefix}/train"
output_location = f"s3://{bucket}/{prefix}/output"
container = retrieve("linear-learner", sagemaker_session.boto_session.region_name)
estimator = sagemaker.estimator.Estimator(
    container,
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path=output_location,
    sagemaker_session=sagemaker_session,
)
estimator.set_hyperparameters(feature_dim=2, mini_batch_size=10, predictor_type="binary_classifier")
estimator.fit({"train": s3_train_data})
print(f"Training job name: {estimator.latest_training_job.name}")

3. Create a model

[ ]:

model_name = "model-card-test-model"
model = estimator.create_model(name=model_name)
container_def = model.prepare_container_def()
sagemaker_session.create_model(model_name, role, container_def)
print(f"Model name: {model_name}")

Create Model Card

Document your binary classification model details in an Amazon SageMaker Model Card using the SageMaker Python SDK.

1. Auto-collect model details

Automatically collect basic model information such as the model ID, inference environment, and model artifact S3 URI. Then add further model information such as a description, problem type, algorithm type, model creator, and model owner.

[ ]:

model_overview = ModelOverview.from_model_name(
    model_name=model_name,
    sagemaker_session=sagemaker_session,
    model_description="This is an example binary classification model used for a Python SDK demo of Amazon SageMaker Model Cards.",
    problem_type="Binary Classification",
    algorithm_type="Logistic Regression",
    model_creator="DEMO-ModelCard",
    model_owner="DEMO-ModelCard",
)
print(f"Model id: {model_overview.model_id}")
print(f"Model inference images: {model_overview.inference_environment.container_image}")
print(f"Model artifact: {model_overview.model_artifact}")

2. Auto-collect training details

Automatically collect basic training information like training ID, training environment, and training metrics. Add additional training information such as objective function details and training observations.

[ ]:

objective_function = ObjectiveFunction(
    function=Function(
        function=ObjectiveFunctionEnum.MINIMIZE,
        facet=FacetEnum.LOSS,
    ),
    notes="This is an example objective function.",
)
training_details = TrainingDetails.from_model_overview(
    model_overview=model_overview,
    sagemaker_session=sagemaker_session,
    objective_function=objective_function,
    training_observations="Add model training observations here.",
)
print(f"Training job ARN: {training_details.training_job_details.training_arn}")
print(
    f"Training image: {training_details.training_job_details.training_environment.container_image}"
)
print("Training Metrics: ")
pprint(
    [
        {"name": i.name, "value": i.value}
        for i in training_details.training_job_details.training_metrics
    ]
)

3. Collect evaluation details

Add evaluation observations, datasets, and metrics.

[ ]:

manual_metric_group = MetricGroup(
    name="binary classification metrics",
    metric_data=[Metric(name="accuracy", type=MetricTypeEnum.NUMBER, value=0.5)],
)
example_evaluation_job = EvaluationJob(
    name="Example evaluation job",
    evaluation_observation="Evaluation observations.",
    datasets=["s3://path/to/evaluation/data"],
    metric_groups=[manual_metric_group],
)
evaluation_details = [example_evaluation_job]
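
The accuracy value of 0.5 above is a placeholder. As a rough sketch of where such a number could come from, the following computes accuracy for a hypothetical threshold rule on the synthetic data. The `predict` function and its threshold of 2.25 are assumptions made for illustration and do not reflect what the trained model actually predicts.

```python
# Copy of the synthetic (feature, label) tuples from the data preparation step.
rows = (
    (0.5, 0), (0.75, 0), (1.0, 0), (1.25, 0), (1.5, 0), (1.75, 0), (2.0, 0),
    (2.25, 1), (2.5, 0), (2.75, 1), (3.0, 0), (3.25, 1), (3.5, 0), (4.0, 1),
    (4.25, 1), (4.5, 1), (4.75, 1), (5.0, 1), (5.5, 1),
)

THRESHOLD = 2.25  # assumed cut-off, chosen for illustration only


def predict(x, threshold=THRESHOLD):
    """Hypothetical stand-in for a model: predict 1 at or above the threshold."""
    return 1 if x >= threshold else 0


def accuracy(data):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in data if predict(x) == y)
    return correct / len(data)


acc = accuracy(rows)
print(f"accuracy: {acc:.3f}")
```

A value computed this way could then be passed as `value` when constructing a `Metric` of type `MetricTypeEnum.NUMBER`.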

(Optional) 3.1 Parse metrics from existing evaluation report

If you have existing evaluation reports generated by SageMaker Clarify or SageMaker Model Monitor, upload them to S3 and provide an S3 URI to automatically parse evaluation metrics. To add your own generic model card evaluation report, provide a report in the evaluation results JSON format. See the example JSON files in the ./example_metrics folder for reference.

Collect metrics from a JSON format evaluation report

[ ]:

report_type = "clarify_bias.json"
example_evaluation_job.add_metric_group_from_json(
    f"example_metrics/{report_type}", EvaluationMetricTypeEnum.CLARIFY_BIAS
)

Collect metrics from S3

[ ]:

# upload metric file to s3
with open(f"example_metrics/{report_type}", "rb") as metrics:
    s3.upload_fileobj(
        metrics,
        Bucket=bucket,
        Key=f"{prefix}/{report_type}",
        ExtraArgs={"ContentType": "application/json"},
    )

metric_s3_url = f"s3://{bucket}/{prefix}/{report_type}"
example_evaluation_job.add_metric_group_from_s3(
    session=sagemaker_session.boto_session,
    s3_url=metric_s3_url,
    metric_type=EvaluationMetricTypeEnum.CLARIFY_BIAS,
)

4. Collect additional details

Add the intended uses of your model, business details, and any additional information that you want to include in your model card. For more information on intended uses and business details, see Model Cards in the Amazon SageMaker Developer Guide.

[ ]:

intended_uses = IntendedUses(
    purpose_of_model="Test model card.",
    intended_uses="Not used except this test.",
    factors_affecting_model_efficiency="No.",
    risk_rating=RiskRatingEnum.LOW,
    explanations_for_risk_rating="Just an example.",
)
business_details = BusinessDetails(
    business_problem="The business problem that your model is used to solve.",
    business_stakeholders="The stakeholders who have the interest in the business that your model is used for.",
    line_of_business="Services that the business is offering.",
)
additional_information = AdditionalInformation(
    ethical_considerations="Your model's ethical considerations.",
    caveats_and_recommendations="Your model's caveats and recommendations.",
    custom_details={"custom details1": "details value"},
)

5. Initialize a model card

Initialize a model card with the information collected in the previous steps.

[ ]:

model_card_name = "sample-notebook-model-card"
my_card = ModelCard(
    name=model_card_name,
    status=ModelCardStatusEnum.DRAFT,
    model_overview=model_overview,
    training_details=training_details,
    intended_uses=intended_uses,
    business_details=business_details,
    evaluation_details=evaluation_details,
    additional_information=additional_information,
    sagemaker_session=sagemaker_session,
)
my_card.create()
print(f"Model card {my_card.name} was created successfully with ARN {my_card.arn}")

Update Model Card

After creating a model card, you can update the model card information. Updating a model card creates a new model card version.

[ ]:

my_card.model_overview.model_description = "the model is updated."
my_card.update()

Load Model Card

Load an existing model card with the model card name.

[ ]:

my_card2 = ModelCard.load(
    name=model_card_name,
    sagemaker_session=sagemaker_session,
)
print(f"Model card ARN: {my_card2.arn}")
# Read the description from the loaded card (my_card2), not the original object
print(f"Model description: {my_card2.model_overview.model_description}")

List Model Card History

Track the model card history by listing historical versions.

[ ]:

history = my_card.get_version_history()
assert len(history) == 2  # one for creation and one for update

Export Model Card

Share the model card by exporting it to a PDF file.

1. Create an export job

[ ]:

s3_output_path = f"s3://{bucket}/{prefix}/export"
pdf_s3_url = my_card.export_pdf(s3_output_path=s3_output_path)

(Optional) List export jobs

Check all the export jobs for this model card.

[ ]:

my_card.list_export_jobs()

2. Download the exported model card PDF

The downloaded PDF is stored in the same directory as this notebook by default.

Parse the bucket and key of the exported PDF

[ ]:

parsed_url = urlparse(pdf_s3_url)
pdf_bucket = parsed_url.netloc
pdf_key = parsed_url.path.lstrip("/")
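
The same parsing can be wrapped in a small function that also rejects anything that is not an S3 URI. This is a sketch only: `split_s3_uri` is a hypothetical helper, and the URI in the usage line is a made-up example, not the actual export location.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Hypothetical helper: split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path keeps its leading "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


# Made-up URI used purely to exercise the helper.
example_bucket, example_key = split_s3_uri("s3://example-bucket/prefix/export/card.pdf")
print(example_bucket, example_key)
```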

Download

[ ]:

file_name = parsed_url.path.split("/")[-1]
s3.download_file(Filename=file_name, Bucket=pdf_bucket, Key=pdf_key)
print(f"{file_name} is downloaded to \n{os.path.join(os.getcwd(), file_name)}")

Cleanup

Delete the following resources:
1. The model card
2. The exported model card PDF
3. The example binary classification model

[ ]:

my_card.delete()

s3.delete_object(Bucket=pdf_bucket, Key=pdf_key)

sagemaker_session.delete_model(model_name)
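
If one of the delete calls above raises an exception, the calls after it in the cell do not run. One possible way to make cleanup more forgiving is sketched below. `run_cleanup` is a hypothetical helper, and the steps here are stand-in callables rather than the real delete calls.

```python
def run_cleanup(steps):
    """Hypothetical helper: run every (label, callable) step, collecting failures."""
    failures = []
    for label, action in steps:
        try:
            action()
        except Exception as exc:  # keep going so later resources still get deleted
            failures.append((label, exc))
    return failures


def _failing_step():
    # Stand-in for a delete call that fails, for example on a missing resource.
    raise RuntimeError("already deleted")


done = []
failures = run_cleanup(
    [
        ("model card", lambda: done.append("model card")),
        ("exported PDF", _failing_step),
        ("model", lambda: done.append("model")),
    ]
)
print(f"completed: {done}, failed: {[label for label, _ in failures]}")
```

In this notebook, the steps would wrap `my_card.delete()`, `s3.delete_object(Bucket=pdf_bucket, Key=pdf_key)`, and `sagemaker_session.delete_model(model_name)`.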
