Quick Start - Using @step Decorated Step with Classic TrainingStep

This notebook introduces a low-code experience that lets data scientists convert Machine Learning (ML) development code into repeatable, reusable workflow steps for Amazon SageMaker Pipelines. It is a quick introduction to this capability, using dummy Python functions wrapped as pipeline steps, and it demonstrates how the capability works alongside classic step types such as TrainingStep. Specifically, the pipeline in this notebook passes an output property of a classic TrainingStep (S3ModelArtifacts) to a dummy evaluate function decorated with @step.

Note: this notebook runs only on Python 3.8 or Python 3.10. On any other version, you will get an error message prompting you to provide an image_uri when defining a step.
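
The version constraint can be checked before running the remaining cells. The following is a minimal sketch, assuming the supported versions are exactly the two named in the note; the helper name is_supported_python is hypothetical and not part of the SageMaker SDK.

```python
import sys

# Minor versions named in the note above; adjust if the supported set changes.
SUPPORTED_VERSIONS = {(3, 8), (3, 10)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) version is in the supported set."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if not is_supported_python():
    # On other versions, expect to pass image_uri when defining a @step function.
    print(
        "Python %d.%d detected; an image_uri will likely be required."
        % (sys.version_info[0], sys.version_info[1])
    )
```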

Install the dependencies and set up the configuration file path

If you run this notebook from a local IDE outside of SageMaker, follow the “AWS CLI Prerequisites” section of Set Up Amazon SageMaker Prerequisites to set up AWS credentials.

[ ]:

!pip install -r ./requirements.txt

[ ]:

import os

# Tell the SageMaker SDK to look for its user config file in the current working directory
os.environ["SAGEMAKER_USER_CONFIG_OVERRIDE"] = os.getcwd()

Define pipeline steps

[ ]:

%%writefile dummy_train.py
import json
import os


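if __name__ == "__main__":
    # SageMaker packages everything under /opt/ml/model as the model artifact
    model_output_path = os.path.join("/opt/ml/model", "model.json")
    with open(model_output_path, "w") as f:
        # Write a placeholder metric instead of training a real model
        json.dump({"rmse": 5.0}, f)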
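
SageMaker training archives the contents of /opt/ml/model into a model.tar.gz file, and that archive is what S3ModelArtifacts points to. The sketch below simulates this locally with the standard library only: it writes the same placeholder model.json, archives it, and reads the metric back. The helper names and temporary paths are illustrative assumptions, not part of the notebook.

```python
import json
import tarfile
import tempfile
from pathlib import Path


def write_dummy_model(model_dir: Path) -> Path:
    """Write the same placeholder artifact that dummy_train.py produces."""
    model_file = model_dir / "model.json"
    model_file.write_text(json.dumps({"rmse": 5.0}))
    return model_file


def read_metrics_from_archive(archive_path: Path, member: str = "model.json") -> dict:
    """Load a JSON member from a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return json.load(tar.extractfile(member))


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for /opt/ml/model inside the training container
    model_dir = Path(tmp) / "model"
    model_dir.mkdir()
    model_file = write_dummy_model(model_dir)

    # Stand-in for the model.tar.gz that training uploads to S3
    archive_path = Path(tmp) / "model.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(model_file, arcname="model.json")

    metrics = read_metrics_from_archive(archive_path)

print(metrics)
```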

[ ]:

import sagemaker
from sagemaker.sklearn import SKLearn
from sagemaker.workflow.steps import TrainingStep

# Note: sagemaker.get_execution_role() only works inside SageMaker.
# When running elsewhere, set role to an IAM role ARN instead.
role = sagemaker.get_execution_role()
sklearn_train = SKLearn(
    framework_version="1.2-1",
    entry_point="dummy_train.py",
    instance_type="ml.m5.large",
    keep_alive_period_in_seconds=600,  # keep the instance warm for 10 minutes after the job
    role=role,
)

step_train = TrainingStep(
    name="my-train",
    display_name="TrainingStep",
    description="description for Training step",
    estimator=sklearn_train,
)

[ ]:

from sagemaker.workflow.function_step import step

evaluate_func_step_name = "my-evaluate"


@step(name=evaluate_func_step_name)
def evaluate(model_path: str):
    """Dummy evaluation: print the model artifact location and return it."""
    print("model_path: ", model_path)
    return model_path
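
When the pipeline runs, evaluate receives model_path as a plain S3 URI string. A real evaluation function would typically split that URI into a bucket and a key before downloading the artifact. The following is a minimal sketch of that split using only the standard library; the function name split_s3_uri and the example URI are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical URI, shaped like a training job's model artifact location
bucket, key = split_s3_uri("s3://example-bucket/my-train/output/model.tar.gz")
print(bucket, key)
```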

[ ]:

from sagemaker.workflow.pipeline import Pipeline

# Referencing the TrainingStep property makes the evaluate step depend on step_train
evaluation_result = evaluate(step_train.properties.ModelArtifacts.S3ModelArtifacts)

pipeline_name = "ClassicTraining-StepDecorator"
pipeline = Pipeline(
    name=pipeline_name,
    steps=[evaluation_result],
)

Create the pipeline and start an execution

[ ]:

pipeline.upsert(role_arn=role)

[ ]:

execution = pipeline.start(parallelism_config=dict(MaxParallelExecutionSteps=10))

[ ]:

execution.wait()

[ ]:

execution.list_steps()
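
execution.list_steps() returns a list of dictionaries, one per step. The sketch below condenses such a list into a status summary, assuming each entry carries StepName and StepStatus keys as in the ListPipelineExecutionSteps API response. The sample records are made up for illustration and are not results from this notebook.

```python
from collections import Counter

# Hypothetical sample shaped like list_steps() output; not real results.
sample_steps = [
    {"StepName": "my-evaluate", "StepStatus": "Succeeded"},
    {"StepName": "my-train", "StepStatus": "Succeeded"},
]


def summarize_steps(steps):
    """Return ({step name: status}, {status: count}) for a list of step records."""
    status_by_step = {step["StepName"]: step["StepStatus"] for step in steps}
    counts = dict(Counter(status_by_step.values()))
    return status_by_step, counts


status_by_step, counts = summarize_steps(sample_steps)
for name, status in sorted(status_by_step.items()):
    print(f"{name}: {status}")
```

In the notebook, the same helper could be called as summarize_steps(execution.list_steps()).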

[ ]:

execution.result(step_name=evaluate_func_step_name)

Clean up resources

[ ]:

pipeline.delete()
