Scale interactive experimentation to scheduled jobs on SageMaker Studio Notebooks without changing code.




In addition to the interactive ML experience that Amazon SageMaker Studio provides, data workers also seek ways to execute notebooks as batch jobs without refactoring code into Python modules and without having to learn DevOps tools and best practices to automate and manage their deployment infrastructure. Some common use cases include:

  • Executing model inference regularly to generate reports

  • Scaling up a feature engineering step after having tested in Studio against a small subset of data on a small instance

  • Re-training and deploying models when new data arrives

SageMaker Studio provides a built-in extension that lets you run your notebooks as-is or in a parameterized fashion. You can run these notebooks on a schedule or immediately with the run-now capability. There’s no need for the end user to modify their existing notebook code.

[Figure: architecture diagram]

Prerequisites

To use the SageMaker Job Scheduling extension, you need JupyterLab 3 enabled for your Studio IDE. More information on this process can be found in the documentation.

Example Background

Customer Churn Use Case

Losing customers is costly for any business. For our scheduled notebook workflow, we’re going to use a familiar example of customer churn: leaving a mobile phone operator. If a provider knows that a customer is thinking of leaving, it can offer timely incentives and the customer may stick around. Incentives are often much more cost-effective than losing and reacquiring a customer.

Solution Overview

We’re going to build an end-to-end workflow that trains a refreshed machine learning model on recent data and then generates a daily report. The process is as follows:

  1. Gather “the last two weeks” of carrier data. In production, this may be from a data lake or feature store.

  2. Process that data in order to train a refreshed machine learning model.

  3. Train a scikit-learn RandomForest model on the previous data.

  4. Plot the confusion matrix and F1 score of our refreshed model so that we can understand its effectiveness on known data.

  5. Run inference using our refreshed model on “today’s” recent data to determine which accounts are at risk of leaving the carrier.

  6. Schedule this notebook to execute every day.

Install or Upgrade Packages

[ ]:
# Upgrade the SageMaker Python SDK
! pip install --upgrade sagemaker

Parameterize Notebook

Using notebook cell metadata, we can mark a cell with the tag “parameters” if we want to modify any variables during automated execution. For this example, we’ll specify default hyperparameter values which can be modified for our notebook job. To mark a cell as parameter inputs, select the wheel icon on the right panel and add a “parameters” tag:

[Screenshot: adding the “parameters” tag to a cell]

[ ]:
# These parameters can be modified when executing this notebook as a job
number_rf_estimators = 100
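
The tag can also be inspected or set outside the UI, because a notebook file is JSON and cell tags are stored under each cell's `metadata.tags` list. The following is a minimal sketch using only the standard library; the `tag_cell` helper and the tiny in-memory notebook are hypothetical and only illustrate where the tag lives.

```python
import json


def tag_cell(notebook, cell_index, tag="parameters"):
    """Add `tag` to a cell's metadata.tags list if it is not already present."""
    metadata = notebook["cells"][cell_index].setdefault("metadata", {})
    tags = metadata.setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return notebook


# Hypothetical one-cell notebook, shaped like the JSON inside an .ipynb file
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["number_rf_estimators = 100"],
            "outputs": [],
            "execution_count": None,
        }
    ],
}

tag_cell(notebook, 0)
print(json.dumps(notebook["cells"][0]["metadata"]))
```

For a real file, you would load it with `json.load`, call the helper on the cell that holds your defaults, and write it back with `json.dump`.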

Prepare Data

In this example, we’ll use a synthetic dataset from a telecommunications company to predict whether a particular customer will leave (churn). For a more in-depth exploratory data analysis, view this example.

[ ]:
# You can download additional assets for your use in your notebook jobs
import boto3

s3 = boto3.client("s3")
s3.download_file(
    f"sagemaker-example-files-prod-{boto3.session.Session().region_name}",
    "datasets/scripts/sagemaker-studio-scheduling/synthetic_data.py",
    "synthetic_data.py",
)
[ ]:
import pandas as pd
from synthetic_data import generate_data


previous_two_weeks_data = generate_data(5000, label_known=True)
todays_data = generate_data(300, label_known=False)

previous_two_weeks_data.head()
[ ]:
def process_data(df, label_known):
    """
    This function represents typical data munging that needs to be performed for machine learning workflows.
    We'll cast data types, drop columns that are irrelevant as features to the model, and convert categorical
    features into indicator variables. If the churn label is known, we'll move it to the first column and
    rename it to "Churn".
    """
    df = df.drop("Phone", axis=1)
    df["Area Code"] = df["Area Code"].astype(object)
    df = df.drop(["Day Charge", "Eve Charge", "Night Charge", "Intl Charge"], axis=1)
    model_data = pd.get_dummies(df)

    if label_known:
        model_data = pd.concat(
            [
                model_data["Churn?_True."],
                model_data.drop(["Churn?_False.", "Churn?_True."], axis=1),
            ],
            axis=1,
        )
        model_data = model_data.rename(columns={"Churn?_True.": "Churn"})

    return model_data
[ ]:
processed_prior_data = process_data(previous_two_weeks_data, label_known=True)
processed_todays_data = process_data(todays_data, label_known=False)
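
Because `pd.get_dummies` only creates indicator columns for categories that actually occur, a small daily batch can end up with fewer (or different) columns than the training data, and scikit-learn will reject features that do not match what the model was fitted on. A minimal sketch of one way to guard against this, assuming missing indicators should be zero and unseen ones dropped; the `align_to_training_columns` helper and the toy frames are hypothetical:

```python
import pandas as pd


def align_to_training_columns(inference_df, training_columns):
    """Reorder inference features to match training; add missing columns as 0, drop extras."""
    return inference_df.reindex(columns=training_columns, fill_value=0)


# Toy data: "CA" and "OR" never appear today, and "NV" was never seen in training
train = pd.get_dummies(pd.DataFrame({"State": ["WA", "OR", "CA"], "Mins": [10.0, 20.0, 30.0]}))
today = pd.get_dummies(pd.DataFrame({"State": ["WA", "NV"], "Mins": [5.0, 7.0]}))

aligned = align_to_training_columns(today, train.columns)
print(list(aligned.columns))
```

In this notebook the training feature columns would be `processed_prior_data.drop(["Churn"], axis=1).columns`.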

Train a refreshed model

In our example, we’re going to train a refreshed model on the previous two weeks of data. In production, you may wish to train a refreshed model or even fine-tune an existing one.

[ ]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, confusion_matrix, ConfusionMatrixDisplay

y = np.ravel(processed_prior_data[["Churn"]])
x = processed_prior_data.drop(["Churn"], axis=1)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)

clf = RandomForestClassifier(n_estimators=int(number_rf_estimators), criterion="gini")
clf.fit(x_train, y_train)
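
The split above is random on every run, so a scheduled job will report slightly different scores each day even on identical data. If day-to-day comparability matters, one option is to fix the seed and stratify on the label so the churn rate is similar in both splits. This is a sketch on made-up data, not a change the original notebook makes; the seed value and class balance are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 200 rows, 15% positive (churn) labels
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
labels = np.array([1] * 30 + [0] * 170)

# random_state makes the split repeatable; stratify keeps the class ratio in both parts
x_tr, x_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.25, random_state=42, stratify=labels
)

print(len(y_tr), len(y_te))
print(round(y_tr.mean(), 2), round(y_te.mean(), 2))
```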

Generate predictions from the refreshed model on our validation dataset

[ ]:
# Run Inference on test set
from IPython.display import Markdown as md

y_pred = clf.predict(x_test)
f1 = f1_score(y_test, y_pred)
md("## Refreshed Model's F1 Score on validation data: {:.2f}".format(f1))
[ ]:
# Plot test set results
cm = confusion_matrix(y_test, y_pred)
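# confusion_matrix orders classes by sorted label value, so the negative class (no churn) comes first
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["Didn't Churn", "Churn"])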
disp.plot()
plt.show()
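
To see how the two outputs relate, the F1 score can be recomputed from the confusion matrix counts. scikit-learn lays the matrix out with true classes as rows and predicted classes as columns, ordered by sorted label, so for a binary 0/1 (or False/True) label the first row is the negative class. A small sketch with made-up counts:

```python
def f1_from_confusion_matrix(cm):
    """Compute binary F1 from a 2x2 matrix laid out as [[tn, fp], [fn, tp]]."""
    tn, fp = cm[0]
    fn, tp = cm[1]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 90 true negatives, 10 false positives, 5 false negatives, 45 true positives
example_cm = [[90, 10], [5, 45]]
print(round(f1_from_confusion_matrix(example_cm), 3))
```

Note that true negatives do not enter the formula, which is why F1 is a common choice when the positive class (churn) is the minority.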

Run inference on today’s data

[ ]:
todays_predictions = clf.predict(processed_todays_data)
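# Reuse the feature index so each prediction lines up with its account row
predictions = pd.DataFrame(
    todays_predictions, columns=["Predicted Churn"], index=processed_todays_data.index
)
accounts_likely_to_churn = pd.concat([predictions, processed_todays_data], axis=1)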
accounts_likely_to_churn = accounts_likely_to_churn[
    accounts_likely_to_churn["Predicted Churn"] == 1
]

accounts_likely_to_churn.head()

Store predictions

In a production setting, you may want to store these predictions in a database such as DynamoDB or trigger a corresponding action based on the account IDs. Alternatively, you may want to register this particular model in SageMaker Model Registry.
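
As a rough illustration of the DynamoDB option, the sketch below converts prediction rows into items and batch-writes them with boto3. Everything specific here is an assumption for illustration: the `account_id` and `run_date` key attributes, the use of the row position as an ID (the processed data no longer carries a phone number), and the table name you would pass in. The table is assumed to exist already with a matching key schema.

```python
from decimal import Decimal


def to_dynamodb_items(records, run_date):
    """Convert prediction rows (a list of dicts) into DynamoDB-friendly items.

    account_id and run_date are hypothetical key attributes for this sketch.
    """
    items = []
    for position, record in enumerate(records):
        item = {"account_id": str(position), "run_date": run_date}
        for name, value in record.items():
            # Unwrap NumPy scalars (they expose .item()) into plain Python values
            if hasattr(value, "item"):
                value = value.item()
            if isinstance(value, bool):
                item[name] = value
            elif isinstance(value, (int, float)):
                # boto3 rejects Python floats, so numbers are stored as Decimal
                item[name] = Decimal(str(value))
            else:
                item[name] = str(value)
        items.append(item)
    return items


def write_items(table_name, items):
    """Batch-write items to an existing table (requires AWS credentials)."""
    import boto3  # imported lazily so the conversion above works offline

    table = boto3.resource("dynamodb").Table(table_name)
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)


# Made-up row standing in for accounts_likely_to_churn.to_dict("records")
rows = [{"Predicted Churn": 1, "Day Mins": 265.1, "Int'l Plan_yes": True}]
items = to_dynamodb_items(rows, run_date="2024-01-01")
print(items[0]["Day Mins"])
```

Only `to_dynamodb_items` runs here; calling `write_items` needs a real table and permissions on the notebook job's execution role.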

Run the Notebook as a job to visualize how the daily report will look

  1. Right-click your notebook and select “Create Notebook Job”.

[Screenshot: create job]

     Alternatively, you can select the “Notebook job” icon on your notebook toolbar.

[Screenshot: job icon]

  2. Choose the right instance type for your scheduled job based on your workload: standard instances, compute optimized instances, or accelerated computing instances that contain GPUs. You can choose any of the instances available for SageMaker training jobs. The complete list of available instances is at: https://aws.amazon.com/sagemaker/pricing/

  3. You can expand the “Additional Options” to modify any other settings. SageMaker Studio will automatically detect the Image/Kernel you are using in your notebook and will pre-select it for you. Ensure that you have validated this selection.

  4. You can enter a new value for number_rf_estimators if you’d like.

[Screenshot: notebook job options]

Schedule Notebook as a Job for Daily Reports

To run this self-contained notebook as a daily scheduled job, we can use the built-in functionality available in SageMaker Studio. You may need to add permissions to your SageMaker execution role. More information can be found in the documentation.

  1. Simply click the “Run on a schedule” button and set your desired schedule

[Screenshot: schedule settings]
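
If you pick a custom schedule rather than a preset interval, the schedule is typically written as a cron expression. The sketch below builds a daily expression in the six-field Amazon EventBridge format (minutes, hours, day-of-month, month, day-of-week, year, evaluated in UTC). It assumes the Studio schedule form accepts this format; whether it expects the bare fields or a `cron(...)` wrapper is not stated here, so check the hint shown in the UI. The `daily_cron` helper is hypothetical.

```python
def daily_cron(hour_utc, minute=0):
    """Return EventBridge-style cron fields for a job that runs once a day (UTC)."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour_utc must be 0-23 and minute must be 0-59")
    # "?" in day-of-week means "no specific value"; EventBridge requires it
    # in either the day-of-month or the day-of-week field
    return f"{minute} {hour_utc} * * ? *"


print(daily_cron(8))       # every day at 08:00 UTC
print(daily_cron(23, 30))  # every day at 23:30 UTC
```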

Clean Up

If you scheduled this notebook, be sure to delete your scheduled job definition once your experimentation is complete.

[Screenshot: notebook job definitions]
