California Housing Regression Experiment

This demo shows how you can use SageMaker Experiments Management Python SDK to organize, track, compare, and evaluate your machine learning (ML) model training experiments.

You can track artifacts for experiments, including datasets, algorithms, hyperparameters, and metrics. Experiments executed on SageMaker, such as SageMaker Autopilot jobs and training jobs, are tracked automatically. You can also track artifacts for additional steps within an ML workflow that come before or after model training, e.g. data pre-processing or post-training model evaluation.

The APIs also let you search and browse your current and past experiments, compare experiments, and identify the best-performing models.

Now we will demonstrate these capabilities through a California Housing regression example. The experiment will be organized as follows:

  1. Download and prepare the California Housing dataset.

  2. Train an Artificial Neural Network (ANN) model. Tune the hyperparameters that configure the number of epochs and the learning rate. Track the parameter configurations and the resulting model validation loss using the SageMaker Experiments Python SDK.

  3. Finally, use the search and analytics capabilities of the Python SDK to search, compare, evaluate, and visualize the performance of all model versions generated by the tuning in Step 2.

Make sure you have selected the Python 3 (TensorFlow 2.3 Python 3.7 CPU Optimized) kernel.

Install Python Packages

[ ]:
import sys

!{sys.executable} -m pip install sagemaker-experiments==0.1.31 matplotlib


[ ]:
import os
import time
import boto3
import itertools
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sagemaker.tensorflow import TensorFlow
import sagemaker
from sagemaker import get_execution_role
[ ]:
sess = boto3.Session()
sm = sess.client("sagemaker")
role = get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)
bucket = sagemaker_session.default_bucket()
prefix = "tf2-california-housing-experiment"

Download California Housing dataset and upload to Amazon S3

[ ]:
data_dir = os.path.join(os.getcwd(), "data")
os.makedirs(data_dir, exist_ok=True)

train_dir = os.path.join(os.getcwd(), "data/train")
os.makedirs(train_dir, exist_ok=True)

test_dir = os.path.join(os.getcwd(), "data/test")
os.makedirs(test_dir, exist_ok=True)

data_set = fetch_california_housing()

X = pd.DataFrame(data_set.data, columns=data_set.feature_names)
Y = pd.DataFrame(data_set.target)

# We partition the dataset into 2/3 training and 1/3 test set.
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.33)

scaler = StandardScaler()
scaler.fit(x_train)
x_train = scaler.transform(x_train)
x_test = scaler.transform(x_test)

np.save(os.path.join(train_dir, "x_train.npy"), x_train)
np.save(os.path.join(test_dir, "x_test.npy"), x_test)
np.save(os.path.join(train_dir, "y_train.npy"), y_train)
np.save(os.path.join(test_dir, "y_test.npy"), y_test)
[ ]:
s3_inputs_train = sagemaker.Session().upload_data(
    path="data/train", bucket=bucket, key_prefix=prefix + "/train"
)
s3_inputs_test = sagemaker.Session().upload_data(
    path="data/test", bucket=bucket, key_prefix=prefix + "/test"
)
inputs = {"train": s3_inputs_train, "test": s3_inputs_test}

Step 1 - Set up the Experiment

Create an experiment to track all the model training iterations. Experiments are a great way to organize your data science work. You can create experiments to organize all your model development work for: [1] a business use case you are addressing (e.g. an experiment named “customer churn prediction”), [2] a data science team that owns the experiment (e.g. an experiment named “marketing analytics experiment”), or [3] a specific data science and ML project. Think of an experiment as a “folder” for organizing your “files”.

Create an Experiment

[ ]:
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial
from smexperiments.trial_component import TrialComponent
from smexperiments.tracker import Tracker
[ ]:
california_housing_experiment = Experiment.create(
    experiment_name=f"tf2-california-housing-{int(time.time())}",
    description="Training on california housing dataset",
    sagemaker_boto_client=sm,
)

Step 2 - Track Experiment

Now create a Trial for each training run to track its inputs, parameters, and metrics.

While training the ANN model on SageMaker, you will experiment with several values for the learning rate and the number of epochs. You will create a Trial to track each training job run. Because each job is launched with an experiment config, SageMaker creates a TrialComponent for it and associates it with the Trial, so the hyperparameter configurations and metrics can be compared later.

[ ]:
hyperparam_options = {"learning_rate": [0.1, 0.5, 0.9], "epochs": [100, 200]}

hypnames, hypvalues = zip(*hyperparam_options.items())
trial_hyperparameter_set = [dict(zip(hypnames, h)) for h in itertools.product(*hypvalues)]
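As a sanity check, the grid expansion above can be reproduced in plain Python without any SageMaker dependencies; with three learning rates and two epoch settings it yields six trial configurations:

```python
import itertools

# Same hyperparameter grid as defined above
hyperparam_options = {"learning_rate": [0.1, 0.5, 0.9], "epochs": [100, 200]}

# zip(*dict.items()) splits the dict into a tuple of names and a tuple of value lists
hypnames, hypvalues = zip(*hyperparam_options.items())

# itertools.product enumerates the full Cartesian grid of value combinations
trial_hyperparameter_set = [dict(zip(hypnames, h)) for h in itertools.product(*hypvalues)]

print(len(trial_hyperparameter_set))  # 3 learning rates x 2 epoch settings = 6 trials
print(trial_hyperparameter_set[0])    # {'learning_rate': 0.1, 'epochs': 100}
```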

If you want to run the following training jobs asynchronously, you may need to increase your resource limit. Otherwise, you can run them sequentially.

Note the execution of the following code takes around half an hour.

[ ]:
from sagemaker.tensorflow import TensorFlow

run_number = 1
for trial_hyp in trial_hyperparameter_set:
    # Combine static hyperparameters and trial specific hyperparameters
    hyperparams = trial_hyp

    # Create unique job name with hyperparameter and time
    time_append = int(time.time())
    hyp_append = "-".join([str(elm).replace(".", "-") for elm in trial_hyp.values()])
    training_job_name = f"tf2-california-housing-training-{hyp_append}-{time_append}"
    trial_name = f"trial-tf2-california-housing-training-{hyp_append}-{time_append}"
    trial_desc = f"my-tensorflow2-california-housing-run-{run_number}"

    # Create a new Trial and associate it with the Experiment
    tf2_california_housing_trial = Trial.create(
        trial_name=trial_name,
        experiment_name=california_housing_experiment.experiment_name,
        sagemaker_boto_client=sm,
        tags=[{"Key": "trial-desc", "Value": trial_desc}],
    )

    # Create an experiment config that associates training job to the Trial
    experiment_config = {
        "ExperimentName": california_housing_experiment.experiment_name,
        "TrialName": tf2_california_housing_trial.trial_name,
        "TrialComponentDisplayName": training_job_name,
    }

    # Regexes used to scrape the Keras metrics from the training logs
    metric_definitions = [
        {"Name": "loss", "Regex": "loss: ([0-9\\.]+)"},
        {"Name": "accuracy", "Regex": "accuracy: ([0-9\\.]+)"},
        {"Name": "val_loss", "Regex": "val_loss: ([0-9\\.]+)"},
        {"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"},
    ]

    # Create a TensorFlow Estimator with the Trial specific hyperparameters.
    # The entry_point script name below is an assumption; point it at your
    # own training script.
    tf2_california_housing_estimator = TensorFlow(
        entry_point="california_housing_tf2.py",
        role=role,
        instance_count=1,
        instance_type="ml.m5.large",
        framework_version="2.3.1",
        py_version="py37",
        hyperparameters=hyperparams,
        metric_definitions=metric_definitions,
        tags=[{"Key": "trial-desc", "Value": trial_desc}],
    )

    # Launch a training job
    tf2_california_housing_estimator.fit(
        inputs, job_name=training_job_name, experiment_config=experiment_config
    )

    # give it a while before dispatching the next training job
    time.sleep(2)
    run_number = run_number + 1
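The metric definitions in the loop above tell SageMaker how to scrape metric values out of the training log stream using regular expressions. A standalone sketch of how such a regex behaves, using a made-up Keras-style log line:

```python
import re

# A sample log line in the format Keras prints during training (the values are invented)
log_line = "353/353 - 1s - loss: 0.3342 - accuracy: 0.5210 - val_loss: 0.3123 - val_accuracy: 0.5548"

# The same pattern used for the "val_loss" metric definition above;
# group(1) captures the numeric value following the metric name
match = re.search(r"val_loss: ([0-9\.]+)", log_line)
print(match.group(1))  # "0.3123"
```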

Compare the model training runs for an experiment

Now you will use the analytics capabilities of Python SDK to query and compare the training runs for identifying the best model produced by our experiment. You can retrieve trial components by using a search expression.

[ ]:
from sagemaker.analytics import ExperimentAnalytics

experiment_name = california_housing_experiment.experiment_name

trial_component_analytics = ExperimentAnalytics(
    sagemaker_session=sagemaker_session, experiment_name=experiment_name
)
trial_comp_ds_jobs = trial_component_analytics.dataframe()

Let’s show the validation loss, epochs, and learning rate for each run, sorted by validation loss in descending order.

[ ]:
trial_comp_ds_jobs = trial_comp_ds_jobs.sort_values("val_loss - Last", ascending=False)
trial_comp_ds_jobs[["TrialComponentName", "val_loss - Last", "epochs", "learning_rate"]]
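The same sort-and-select pattern can be tried on a toy DataFrame. The column names below mimic the real `dataframe()` output, but the rows and values are invented for illustration:

```python
import pandas as pd

# Toy stand-in for the analytics DataFrame (values made up)
df = pd.DataFrame(
    {
        "TrialComponentName": ["run-a", "run-b", "run-c"],
        "val_loss - Last": [0.42, 0.31, 0.55],
        "epochs": [100, 200, 100],
        "learning_rate": [0.1, 0.5, 0.9],
    }
)

# Same pattern as above: sort by final validation loss, descending
df = df.sort_values("val_loss - Last", ascending=False)
print(df[["TrialComponentName", "val_loss - Last"]])  # run-c (worst) first, run-b (best) last
```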

Visualize experiment

Now we visualize the validation loss for each epochs/learning_rate combination, in descending order.

[ ]:
import matplotlib.pyplot as plt

trial_comp_ds_jobs["col_names"] = (
    trial_comp_ds_jobs["epochs"].astype("Int64").astype("str")
    + "-0."
    + (trial_comp_ds_jobs["learning_rate"] * 10).astype("Int64").astype("str")
)

fig = plt.figure()
fig.set_size_inches([15, 10])
trial_comp_ds_jobs.plot.bar("col_names", "val_loss - Last", ax=plt.gca())

Compare Experiments, Trials, and Trial Components in Amazon SageMaker Studio

You can compare experiments, trials, and trial components by selecting the entities and opening them in the trial components list. The trial components list is referred to as the Studio Leaderboard. In the Leaderboard you can do the following:

  - View detailed information about the entities
  - Compare entities
  - Stop a training job
  - Deploy a model

To compare experiments, trials, and trial components:

  1. In the left sidebar of SageMaker Studio, choose the SageMaker Experiment List icon.

  2. In the Experiments browser, choose either the experiment or trial list.

  3. Choose the experiments or trials that you want to compare, right-click the selection, and then choose Open in trial component list. The Leaderboard opens and lists the associated Experiments entities.



Run the following cell to clean up the sample experiment. If you are working on your own experiment, please skip this step.

[ ]:
def cleanup(experiment):
    for trial_summary in experiment.list_trials():
        trial = Trial.load(sagemaker_boto_client=sm, trial_name=trial_summary.trial_name)
        for trial_component_summary in trial.list_trial_components():
            tc = TrialComponent.load(
                sagemaker_boto_client=sm,
                trial_component_name=trial_component_summary.trial_component_name,
            )
            trial.remove_trial_component(tc)
            try:
                tc.delete()  # comment out to keep trial components
            except Exception:
                continue  # tc is associated with another trial
            time.sleep(0.5)  # to prevent throttling
        trial.delete()
    experiment.delete()
[ ]:
cleanup(california_housing_experiment)