Track an experiment while training a Keras model locally

This notebook shows how to use the SageMaker SDK to track a machine learning experiment.

We introduce two concepts in this notebook:

Experiment: An experiment is a collection of runs. When you initialize a run in your training loop, you include the name of the experiment that the run belongs to. Experiment names must be unique within your AWS account.
Run: A run consists of all the inputs, parameters, configurations, and results for one iteration of model training. Initialize an experiment run for tracking a training job with Run().

In this notebook we train a Keras model using the MNIST dataset. We use a Keras callback to log metrics to an Experiment.

Note: It is recommended to use the TensorFlow 2.6 CPU Optimized image to run this notebook.

[ ]:

import sys

[ ]:

# Upgrade pip, boto3, sagemaker, and tensorflow to their latest versions, and pin protobuf to 3.20.3
!{sys.executable} -m pip install --upgrade pip
!{sys.executable} -m pip install --upgrade boto3
!{sys.executable} -m pip install --upgrade sagemaker
!{sys.executable} -m pip install --upgrade tensorflow
!{sys.executable} -m pip install protobuf==3.20.3

[ ]:

import json
import boto3
import sagemaker
from sagemaker.session import Session
from sagemaker import get_execution_role
from sagemaker.experiments.run import Run

[ ]:

sagemaker_session = Session()
boto_sess = boto3.Session()

role = get_execution_role()
default_bucket = sagemaker_session.default_bucket()

sm = boto_sess.client("sagemaker")
region = boto_sess.region_name

Prepare the data used for training the model

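Here we use the MNIST dataset. The train and test splits are downloaded as NumPy files from the SageMaker example files S3 bucket for the current region.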

[ ]:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import pandas as pd

[ ]:

!mkdir -p datasets

[ ]:

# Model / data parameters
num_classes = 10
input_shape = (28, 28, 1)

# Here we download the data from S3

s3 = boto3.client("s3")

train_path = "datasets/input_train.npy"
test_path = "datasets/input_test.npy"
train_labels_path = "datasets/input_train_labels.npy"
test_labels_path = "datasets/input_test_labels.npy"

# Download the pre-split train and test images and labels
s3.download_file(
    f"sagemaker-example-files-prod-{region}",
    "datasets/image/MNIST/numpy/input_train.npy",
    train_path,
)
s3.download_file(
    f"sagemaker-example-files-prod-{region}",
    "datasets/image/MNIST/numpy/input_test.npy",
    test_path,
)
s3.download_file(
    f"sagemaker-example-files-prod-{region}",
    "datasets/image/MNIST/numpy/input_train_labels.npy",
    train_labels_path,
)
s3.download_file(
    f"sagemaker-example-files-prod-{region}",
    "datasets/image/MNIST/numpy/input_test_labels.npy",
    test_labels_path,
)

[ ]:

x_train = np.load(train_path)
x_test = np.load(test_path)
y_train = np.load(train_labels_path)
y_test = np.load(test_labels_path)

# Reshape the arrays

x_train = np.reshape(x_train, (60000, 28, 28))
x_test = np.reshape(x_test, (10000, 28, 28))
y_train = np.reshape(y_train, (60000,))
y_test = np.reshape(y_test, (10000,))

# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)

print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
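
To make the preprocessing steps above concrete, here is a minimal sketch that applies them to a tiny synthetic array instead of MNIST. The `one_hot` helper is a hypothetical stand-in written for illustration: it approximates what `keras.utils.to_categorical` does for integer labels and is not the Keras implementation. The array contents and the `demo_` names are made up.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype="int64")
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


# Three flat "images" of 4 pixels each, standing in for the flat MNIST arrays
demo_raw = np.array([[0, 255, 128, 64], [255, 255, 0, 0], [10, 20, 30, 40]], dtype="uint8")

demo_images = np.reshape(demo_raw, (3, 2, 2))          # restore the 2D image layout
demo_images = demo_images.astype("float32") / 255      # scale pixel values to [0, 1]
demo_images = np.expand_dims(demo_images, -1)          # add a trailing channel axis

demo_labels = one_hot([0, 2, 1], num_classes=3)

print(demo_images.shape)
print(demo_labels)
```

The trailing channel axis is what lets the `Conv2D` layers treat each grayscale image as a single-channel input.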

Construct the model

[ ]:

def get_model(dropout=0.5):
    """Build and compile a small CNN classifier; `dropout` is the rate applied before the output layer."""
    model = keras.Sequential(
        [
            keras.Input(shape=input_shape),
            layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Flatten(),
            layers.Dropout(dropout),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

    return model
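
As a sanity check on the architecture, the arithmetic below traces the feature-map sizes and parameter counts by hand. It is a sketch that assumes the Keras defaults used in `get_model` (stride-1 convolutions with "valid" padding, and pooling with a stride equal to the pool size). The helper names `conv_out` and `pool_out` are illustrative and not part of Keras.

```python
def conv_out(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (remainder is dropped)."""
    return size // pool


side = 28
side = pool_out(conv_out(side, 3), 2)  # 28 -> 26 -> 13
side = pool_out(conv_out(side, 3), 2)  # 13 -> 11 -> 5
flat_features = side * side * 64       # 5 * 5 * 64 = 1600

# weights + biases for each trainable layer
conv1_params = 3 * 3 * 1 * 32 + 32     # 320
conv2_params = 3 * 3 * 32 * 64 + 64    # 18496
dense_params = flat_features * 10 + 10  # 16010
total_params = conv1_params + conv2_params + dense_params

print(flat_features, total_params)  # 1600 34826
```

If the assumptions hold, the total should match the trainable parameter count reported by `get_model().summary()`.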

Define the Keras callback to log metrics to the run

The Keras Callback class provides an on_epoch_end method, which Keras calls at the end of each epoch with that epoch's metrics. Our callback logs every one of those metrics to the run passed to it.

[ ]:

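class ExperimentCallback(keras.callbacks.Callback):
    """Keras callback that logs each epoch's metrics to a SageMaker Experiments run."""

    def __init__(self, run, model, x_test, y_test):
        """Save params in constructor"""
        super().__init__()
        self.run = run
        # `model` is kept in the signature for the call site but is not stored here:
        # Keras attaches the model being trained to `self.model` when fit() starts,
        # and some Keras versions do not allow assigning `self.model` directly.
        self.x_test = x_test
        self.y_test = y_test

    def on_epoch_end(self, epoch, logs=None):
        """Log every metric reported for this epoch, using the epoch index as the step."""
        logs = logs or {}
        for key, value in logs.items():
            self.run.log_metric(name=key, value=value, step=epoch)
            print("{} -> {}".format(key, value))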
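
The forwarding logic in `on_epoch_end` can be tried without Keras or an AWS account. The sketch below uses `RecordingRun`, a hypothetical in-memory stand-in that only mimics the `log_metric(name, value, step)` call used in this notebook; it is not the SageMaker `Run` class. The metric values are made up.

```python
class RecordingRun:
    """Hypothetical stand-in for a run: stores log_metric calls in a list."""

    def __init__(self):
        self.metrics = []

    def log_metric(self, name, value, step=None):
        self.metrics.append((name, float(value), step))


def log_epoch_metrics(run, epoch, logs=None):
    """Same loop as on_epoch_end: forward every entry of `logs` to the run."""
    for key, value in (logs or {}).items():
        run.log_metric(name=key, value=value, step=epoch)


fake_run = RecordingRun()

# Keras passes a dict of this shape at the end of each epoch
log_epoch_metrics(fake_run, epoch=0, logs={"loss": 0.42, "accuracy": 0.88})
log_epoch_metrics(fake_run, epoch=1, logs={"loss": 0.31, "accuracy": 0.91})
log_epoch_metrics(fake_run, epoch=2, logs=None)  # an empty epoch logs nothing

for name, value, step in fake_run.metrics:
    print(step, name, value)
```

Using the epoch index as the step keeps each metric as one series per run, with one point per epoch.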

Train the model in the notebook and track it in an Experiment

Here we train the Keras model locally on the instance that this notebook is running on.

As part of the run, we log the hyperparameters and track each input dataset as an input artifact. Each artifact must exist as a file before it is logged; here the files were downloaded to the datasets directory earlier.

[ ]:

from sagemaker.experiments.run import Run

batch_size = 256
epochs = 5
dropout = 0.5

model = get_model(dropout)

experiment_name = "local-keras-experiment"
with Run(experiment_name=experiment_name, sagemaker_session=sagemaker_session) as run:
    run.log_parameter("batch_size", batch_size)
    run.log_parameter("epochs", epochs)
    run.log_parameter("dropout", dropout)

    run.log_file("datasets/input_train.npy", is_output=False)
    run.log_file("datasets/input_test.npy", is_output=False)
    run.log_file("datasets/input_train_labels.npy", is_output=False)
    run.log_file("datasets/input_test_labels.npy", is_output=False)

    # Train locally
    model.fit(
        x_train,
        y_train,
        batch_size=batch_size,
        epochs=epochs,
        validation_split=0.1,
        callbacks=[ExperimentCallback(run, model, x_test, y_test)],
    )

    score = model.evaluate(x_test, y_test, verbose=0)
    print("Test loss:", score[0])
    print("Test accuracy:", score[1])

    run.log_metric(name="Final Test Loss", value=score[0])
    run.log_metric(name="Final Test Accuracy", value=score[1])
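
Because `validation_split=0.1` is passed to `fit`, not all 60,000 training images are used for gradient updates. The sketch below works out the resulting sample counts, steps per epoch, and the number of metric points the callback would log. It assumes Keras holds out the last fraction of the samples for validation, and that the per-epoch logs contain the four keys listed in `expected_keys`; both are assumptions worth checking against the printed output of the training cell.

```python
import math

num_samples = 60000
validation_split = 0.1
batch_size = 256
epochs = 5

# Samples kept for training and held out for validation
train_count = math.floor(num_samples * (1 - validation_split))
val_count = num_samples - train_count

# The last batch of each epoch is smaller than batch_size
steps_per_epoch = math.ceil(train_count / batch_size)

# Assumed keys in the `logs` dict when one "accuracy" metric and validation data are used
expected_keys = ["loss", "accuracy", "val_loss", "val_accuracy"]
callback_points = len(expected_keys) * epochs

print(train_count, val_count)  # 54000 6000
print(steps_per_epoch)         # 211
print(callback_points)         # 20
```

The two final test metrics are logged once each, outside the callback, so they appear as single values rather than per-epoch series.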
