Compare built-in SageMaker classification algorithms for a binary classification problem using the Iris dataset

In this notebook tutorial, we build 3 classification models using hyperparameter optimization (HPO) and then compare the AUC of the 3 deployed models on the test dataset.

IRIS is perhaps the best known database to be found in the pattern recognition literature. Fisher’s paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. The dataset ships with R and can also be downloaded from https://archive.ics.uci.edu/ml/datasets/iris

The iris dataset, besides its historical importance, is also a fun dataset to play with since it can educate us about various ML techniques such as clustering, classification and regression, all in one dataset.

The dataset is built into any base R installation, so no download is required.

Attribute Information:

  1. sepal length in cm

  2. sepal width in cm

  3. petal length in cm

  4. petal width in cm

  5. Species of flowers: Iris setosa, Iris versicolor, Iris virginica

The prediction we will perform is Species ~ f(sepal.length,sepal.width,petal.width,petal.length)

Predicted attribute: Species of iris plant.

Load required libraries and initialize variables.

[ ]:
rm(list=ls())
library(reticulate) # do not reinstall reticulate; reinstalling it can cause problems.
library(tidyverse)
library(pROC)
set.seed(1324)

SageMaker needs to be imported using the reticulate library. On a local computer, we would have to make sure that Python and the appropriate SageMaker libraries are installed, but in a SageMaker notebook with an R kernel these are all pre-loaded, and the R user does not have to worry about installing reticulate or Python.

The session is the SageMaker Session object that manages the interactions with SageMaker and the other AWS services used here. It remains the same throughout the execution of the program.

The bucket is the Amazon S3 bucket, used together with a key prefix, where we store the training data and the model output. It should be in the same region as the notebook instance, training, and hosting.

The role is the IAM role ARN that was assigned to the SageMaker notebook when it was created. It gives training and hosting access to your data. See the documentation for how to create these roles. If more than one role is required for notebook instances, training, and/or hosting, replace the get_execution_role() call with the appropriate full IAM role ARN string(s).

[ ]:
sagemaker <- import('sagemaker')
session <- sagemaker$Session()
bucket <- session$default_bucket() # you may replace with name of your personal S3 bucket
role_arn <- sagemaker$get_execution_role()

Input the data and basic pre-processing

[ ]:
head(iris)
[ ]:
summary(iris)

In the summary above, we see that there are 50 flowers of the setosa species, 50 flowers of the versicolor species, and 50 flowers of the virginica species.

In this case, the target variable is Species. We are trying to predict the species of a flower from its numerical measurements of sepal length, sepal width, petal length, and petal width. Since we are doing binary classification, we keep only the species setosa and versicolor for simplicity. We also encode the categorical variable Species as a numeric label (0 for setosa, 1 for versicolor).

[ ]:
iris1 <- iris %>%
    dplyr::select(Species,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width) %>% # reorder columns so that the label column comes first.
    dplyr::filter(Species %in% c("setosa","versicolor")) %>%                     # keep only two species for binary classification.
    dplyr::mutate(Species = as.numeric(Species) -1)                              # label encoding: 0 is setosa and 1 is versicolor.
[ ]:
head(iris1)
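
The label encoding relies on how R stores factors: each level has a 1-based integer code, in level order. The following base-R sketch (it does not use the tidyverse pipeline from the cell above, and the variable names are illustrative) shows the resulting mapping for the two species we keep.

```r
# iris ships with base R (package 'datasets'); Species is a factor with 3 levels.
two_species <- iris[iris$Species %in% c("setosa", "versicolor"), ]

# Factor codes are 1-based (setosa = 1, versicolor = 2), so subtract 1.
label <- as.numeric(two_species$Species) - 1

# One row per species showing which numeric label it received.
mapping <- unique(data.frame(Species = two_species$Species, label = label))
print(mapping)
```

Note that this is an integer label encoding rather than a set of indicator columns, which is what the built-in algorithms expect for the label column.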

We now obtain some basic descriptive statistics of the features.

[ ]:
iris1 %>% group_by(Species) %>% summarize(mean_sepal_length = mean(Sepal.Length),
                                         mean_petal_length = mean(Petal.Length),
                                         mean_sepal_width = mean(Sepal.Width),
                                         mean_petal_width = mean(Petal.Width),
                                         )

In the summary statistics, we observe that mean sepal length is longer than mean petal length for both flowers.

Prepare for modelling

We split the data into train, test, and validation sets of 70%, 15%, and 15%, using random sampling.

[ ]:
iris_train <- iris1 %>%
                    sample_frac(size = 0.7)
iris_test <- anti_join(iris1, iris_train) %>%
                  sample_frac(size = 0.5)
iris_validate <- anti_join(iris1, iris_train) %>%
                        anti_join(., iris_test)
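
Because anti_join matches on column values, rows with identical measurements would be removed together. That is not a problem for this two-species subset, but it is worth keeping in mind for other data. As an alternative, here is a minimal base-R sketch of an index-based 70/15/15 split; the variable names are illustrative and this is not the split used in the rest of the notebook.

```r
set.seed(1324)

# Two-species subset of iris (100 rows), built with base R only.
dat <- iris[iris$Species != "virginica", ]
n <- nrow(dat)

# Shuffle the row indices once, then cut the permutation into three parts.
idx <- sample(n)
n_train <- round(0.70 * n)
n_test  <- round(0.15 * n)

train_idx <- idx[seq_len(n_train)]
test_idx  <- idx[(n_train + 1):(n_train + n_test)]
valid_idx <- idx[(n_train + n_test + 1):n]

train <- dat[train_idx, ]
test  <- dat[test_idx, ]
valid <- dat[valid_idx, ]

# Sizes of the three partitions.
c(train = nrow(train), test = nrow(test), valid = nrow(valid))
```

Splitting by index guarantees that the three sets are disjoint and together cover every row exactly once, regardless of duplicated values.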

We do a check of the summary statistics to make sure train, test, validate datasets are appropriately split and have proper class balance.

[ ]:
table(iris_train$Species)
nrow(iris_train)

We see that the class balance between 0 and 1 is almost 50% each for the binary classification. We also see that there are 70 rows in the train dataset.

[ ]:
table(iris_validate$Species)
nrow(iris_validate)

We see that the class balance in validation dataset between 0 and 1 is almost 50% each for the binary classification. We also see that there are 15 rows in the validation dataset.

[ ]:
table(iris_test$Species)
nrow(iris_test)

We see that the class balance in test dataset between 0 and 1 is almost 50% each for the binary classification. We also see that there are 15 rows in the test dataset.

Write the data to Amazon S3

Different algorithms in SageMaker require different data formats for training and for testing. These formats are designed to make model production easier. CSV is the best known of these formats and is used here as the input for all algorithms for consistency.

SageMaker algorithms read data from an Amazon S3 object and write their output to an Amazon S3 object, so the data has to be stored in Amazon S3 as CSV, JSON, protobuf, or any other format that is supported by the algorithm you are going to use.
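
The CSV files in this notebook are written without a header row and with the label in the first column. The following sketch uses a small made-up data frame and base R's write.table (rather than readr's write_csv) to write such a file to a temporary path and inspect the layout.

```r
# Made-up rows in the same column order as iris1: label first, then features.
toy <- data.frame(Species      = c(0, 1),
                  Sepal.Length = c(5.1, 7.0),
                  Sepal.Width  = c(3.5, 3.2),
                  Petal.Length = c(1.4, 4.7),
                  Petal.Width  = c(0.2, 1.4))

# Write without header and without row names, to a temporary file.
path <- tempfile(fileext = ".csv")
write.table(toy, path, sep = ",", row.names = FALSE, col.names = FALSE)

# Read the raw lines back and split them into fields.
csv_lines <- readLines(path)
fields <- strsplit(csv_lines, ",")
first_column <- vapply(fields, function(f) f[1], character(1))

print(csv_lines)
print(first_column)   # the label values
```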

[ ]:
write_csv(iris_train, 'iris_train.csv', col_names = FALSE)
write_csv(iris_validate, 'iris_valid.csv', col_names = FALSE)
write_csv(iris_test, 'iris_test.csv', col_names = FALSE)
[ ]:
s3_train <- session$upload_data(path = 'iris_train.csv',
                                bucket = bucket,
                                key_prefix = 'data')
s3_valid <- session$upload_data(path = 'iris_valid.csv',
                                bucket = bucket,
                                key_prefix = 'data')

s3_test <- session$upload_data(path = 'iris_test.csv',
                                bucket = bucket,
                                key_prefix = 'data')
[ ]:
s3_train_input <- sagemaker$inputs$TrainingInput(s3_data = s3_train,
                                     content_type = 'text/csv')
s3_valid_input <- sagemaker$inputs$TrainingInput(s3_data = s3_valid,
                                     content_type = 'text/csv')
s3_test_input <- sagemaker$inputs$TrainingInput(s3_data = s3_test,
                                     content_type = 'text/csv')

To perform binary classification on tabular data, SageMaker provides the following built-in algorithms:

  • XGBoost Algorithm

  • Linear Learner Algorithm

  • K-Nearest Neighbors (k-NN) Algorithm

Create model 1: XGBoost model in SageMaker

Use the XGBoost built-in algorithm to build an XGBoost training container as shown in the following code example. You can automatically look up the XGBoost built-in algorithm image URI using the SageMaker image_uris.retrieve API (or the get_image_uri API if using Amazon SageMaker Python SDK version 1). To verify that the image_uris.retrieve API finds the correct URI, see Common parameters for built-in algorithms and look up XGBoost in the full list of built-in algorithm image URIs and available regions.

After specifying the XGBoost image URI, you can use the XGBoost container to construct an estimator using the SageMaker Estimator API and initiate a training job. This XGBoost built-in algorithm mode does not incorporate your own XGBoost training script and runs directly on the input datasets.

See https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html for more information.

[ ]:
container <- sagemaker$image_uris$retrieve(framework='xgboost', region= session$boto_region_name, version='latest')
cat('XGBoost Container Image URL: ', container)
[ ]:
s3_output <- paste0('s3://', bucket, '/output_xgboost')
estimator1 <- sagemaker$estimator$Estimator(image_uri = container,
                                           role = role_arn,
                                           instance_count = 1L,
                                           instance_type = 'ml.m5.4xlarge',
                                           input_mode = 'File',
                                           output_path = s3_output,
                                           output_kms_key = NULL,
                                           base_job_name = NULL,
                                           sagemaker_session = NULL)

How would an untuned model perform compared to a tuned model? Is it worth the effort? Before going deeper into XGBoost model tuning, let’s highlight why you would tune your model. The main reason to perform hyperparameter tuning is to improve the predictive performance of our models by choosing hyperparameters in a principled way. Three common ways to perform hyperparameter tuning are grid search, random search, and Bayesian search. Popular packages like scikit-learn offer grid search and random search. SageMaker uses Bayesian search by default.
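
To make the difference between grid search and random search concrete, here is a small base-R sketch on a made-up scoring function. The function toy_score is a hypothetical stand-in for a validation metric and has nothing to do with XGBoost or SageMaker. Bayesian search is not shown, because it needs a surrogate model that is beyond a short example.

```r
set.seed(1324)

# Hypothetical stand-in for a validation score; it peaks at eta = 0.3, alpha = 1.
toy_score <- function(eta, alpha) 1 - (eta - 0.3)^2 - 0.1 * (alpha - 1)^2

# Grid search: evaluate every combination of a fixed set of values.
grid <- expand.grid(eta = seq(0, 1, by = 0.25), alpha = seq(0, 2, by = 0.5))
grid$score <- toy_score(grid$eta, grid$alpha)
best_grid <- grid[which.max(grid$score), ]

# Random search: evaluate the same number of uniformly sampled points.
n_trials <- nrow(grid)
rand <- data.frame(eta = runif(n_trials, 0, 1), alpha = runif(n_trials, 0, 2))
rand$score <- toy_score(rand$eta, rand$alpha)
best_rand <- rand[which.max(rand$score), ]

print(best_grid)
print(best_rand)
```

Grid search can only return points that lie on the grid, whereas random search samples the whole range and so can land closer to (or further from) the optimum for the same budget.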

We need to choose

  • a learning objective function to optimize during model training

  • an eval_metric to use to evaluate model performance during validation

  • a set of hyperparameters and a range of values for each to use when tuning the model automatically

SageMaker XGBoost model can be tuned with many hyperparameters. The hyperparameters that have the greatest effect on optimizing the XGBoost evaluation metrics are:

  • alpha

  • min_child_weight

  • subsample

  • eta

  • num_round

The required hyperparameters are num_class (the number of classes, for multi-class classification problems) and num_round (the number of rounds to run the training for). All other hyperparameters are optional and are set to their default values if the user does not specify them.

[ ]:
# Check the documentation for which hyperparameters are required and which are optional
estimator1$set_hyperparameters(eval_metric='auc',
                              objective='binary:logistic',
                              num_round = 6L
                              )

# Set hyperparameter ranges; check which parameters are integer and which are continuous.
hyperparameter_ranges = list('eta' = sagemaker$parameter$ContinuousParameter(0,1),
                        'min_child_weight'= sagemaker$parameter$ContinuousParameter(0,10),
                        'alpha'= sagemaker$parameter$ContinuousParameter(0,2),
                        'max_depth'= sagemaker$parameter$IntegerParameter(0L,10L))

The evaluation metric that we will use for our binary classification is validation:auc, but you could use any other metric that is right for your problem. Be careful to set objective_type to the right direction, Maximize or Minimize, for the objective metric you have chosen.

[ ]:
# Create a hyperparameter tuner
objective_metric_name = 'validation:auc'
tuner1 <- sagemaker$tuner$HyperparameterTuner(estimator1,
                                             objective_metric_name,
                                             hyperparameter_ranges,
                                             objective_type='Maximize',
                                             max_jobs=4L,
                                             max_parallel_jobs=2L)

# Define the data channels for train and validation datasets
input_data <- list('train' = s3_train_input,
                   'validation' = s3_valid_input)

# train the tuner
tuner1$fit(inputs = input_data,
           job_name = paste('tune-xgb', format(Sys.time(), '%Y%m%d-%H-%M-%S'), sep = '-'),
           wait=TRUE)

The output of the tuning job can be checked in SageMaker if needed.

Calculate AUC for the test data on model 1

SageMaker will automatically recognize the training job with the best evaluation metric and load the hyperparameters associated with that training job when we deploy the model. One of the benefits of SageMaker is that we can easily deploy models on a different instance than the one on which the notebook is running, so we can deploy to a more powerful or a less powerful instance.

[ ]:
model_endpoint1 <- tuner1$deploy(initial_instance_count = 1L,
                                   instance_type = 'ml.t2.medium')


The serializer tells SageMaker what format the model expects data to be input in.

[ ]:
model_endpoint1$serializer <- sagemaker$serializers$CSVSerializer(content_type='text/csv')

We input the iris_test dataset without the labels into the model using the predict function and check its AUC value.

[ ]:
# Prepare the test sample for input into the model
test_sample <- as.matrix(iris_test[-1])
dimnames(test_sample)[[2]] <- NULL

# Predict using the deployed model
predictions_ep <- model_endpoint1$predict(test_sample)
predictions_ep <- stringr::str_split(predictions_ep, pattern = ',', simplify = TRUE)
predictions_ep <- as.numeric(predictions_ep > 0.5)

# Add the predictions to the test dataset.
iris_predictions_ep1 <- dplyr::bind_cols(predicted_flower = predictions_ep,
                      iris_test)
iris_predictions_ep1

# Get the AUC (pROC::roc expects the true labels first, then the predictions)
auc(roc(response = iris_test$Species, predictor = iris_predictions_ep1$predicted_flower))
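
The AUC can also be computed without pROC, which helps to see what it measures: the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half. The sketch below is a base-R illustration on made-up scores; auc_rank is a hypothetical helper, not part of pROC or SageMaker.

```r
# AUC from ranks (equivalent to the Mann-Whitney U statistic).
auc_rank <- function(labels, scores) {
  r <- rank(scores)                  # average ranks, so ties are handled
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

labels <- c(0, 0, 0, 1, 1, 1)
scores <- c(0.10, 0.40, 0.55, 0.45, 0.80, 0.90)  # made-up probabilities

auc_from_scores <- auc_rank(labels, scores)                      # raw scores
auc_from_labels <- auc_rank(labels, as.numeric(scores > 0.5))    # after thresholding

print(c(scores = auc_from_scores, thresholded = auc_from_labels))
```

Thresholding the scores at 0.5 before computing the AUC, as the cell above does, discards the ranking information within each predicted class, so the two values can differ.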

Create model 2: Linear Learner in SageMaker

Linear models are supervised learning algorithms used for solving either classification or regression problems. For input, you give the model labeled examples (x, y). x is a high-dimensional vector and y is a numeric label. For binary classification problems, the label must be either 0 or 1.

The linear learner algorithm requires a data matrix, with rows representing the observations, and columns representing the dimensions of the features. It also requires an additional column that contains the labels that match the data points. At a minimum, Amazon SageMaker linear learner requires you to specify input and output data locations, and objective type (classification or regression) as arguments. The feature dimension is also required. You can specify additional parameters in the HyperParameters string map of the request body. These parameters control the optimization procedure, or specifics of the objective function that you train on. For example, the number of epochs, regularization, and loss type.

See https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html for more information.

[ ]:
container <- sagemaker$image_uris$retrieve(framework='linear-learner', region= session$boto_region_name, version='latest')
cat('Linear Learner Container Image URL: ', container)
[ ]:
s3_output <- paste0('s3://', bucket, '/output_glm')
estimator2 <- sagemaker$estimator$Estimator(image_uri = container,
                                           role = role_arn,
                                           instance_count = 1L,
                                           instance_type = 'ml.m5.4xlarge',
                                           input_mode = 'File',
                                           output_path = s3_output,
                                           output_kms_key = NULL,
                                           base_job_name = NULL,
                                           sagemaker_session = NULL)

For the text/csv input type, the first column is assumed to be the label, which is the target variable for prediction.

predictor_type is the only hyperparameter that is required to be pre-defined for tuning. The rest are optional.

Normalization, or feature scaling, is an important preprocessing step for certain loss functions that ensures the model being trained on a dataset does not become dominated by the weight of a single feature. Decision trees do not require normalization of their inputs; and since XGBoost is essentially an ensemble algorithm comprised of decision trees, it does not require normalization for the inputs either.

However, Generalized Linear Models require a normalization of their input. The Amazon SageMaker Linear Learner algorithm has a normalization option to assist with this preprocessing step. If normalization is turned on, the algorithm first goes over a small sample of the data to learn the mean value and standard deviation for each feature and for the label. Each of the features in the full dataset is then shifted to have mean of zero and scaled to have a unit standard deviation.
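
The shift-and-scale step described above can be sketched in base R as follows. This is only an illustration of the arithmetic on the full two-species subset; Linear Learner itself estimates the mean and standard deviation from a sample of the data.

```r
# Feature matrix for the two species used in this notebook.
x <- as.matrix(iris[iris$Species != "virginica", 1:4])

# Per-feature mean and standard deviation.
mu    <- colMeans(x)
sigma <- apply(x, 2, sd)

# Shift each column to mean zero, then scale it to unit standard deviation.
x_norm <- sweep(sweep(x, 2, mu, "-"), 2, sigma, "/")

print(round(colMeans(x_norm), 10))
print(apply(x_norm, 2, sd))
```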

To make our job easier, we do not have to go back to our previous steps to do normalization. Normalization is built in as a hyper-parameter in SageMaker Linear learner algorithm. So no need to worry about normalization for the training portions.

[ ]:
estimator2$set_hyperparameters(predictor_type="binary_classifier",
                               normalize_data = TRUE)

The tunable hyperparameters for linear learner are:

  • wd

  • l1

  • learning_rate

  • mini_batch_size

  • use_bias

  • positive_example_weight_mult

Be careful to check which parameters are integers and which parameters are continuous because that is one of the common sources of errors. Also be careful to give a proper range for hyperparameters that makes sense for your problem. Training jobs can sometimes fail if the mini-batch size is too big compared to the training data available.

[ ]:
# Set Hyperparameter Ranges
hyperparameter_ranges = list('wd'  = sagemaker$parameter$ContinuousParameter(0.00001,1),
                             'l1'  = sagemaker$parameter$ContinuousParameter(0.00001,1),
                             'learning_rate'  = sagemaker$parameter$ContinuousParameter(0.00001,1),
                             'mini_batch_size'  = sagemaker$parameter$IntegerParameter(10L, 50L)
                            )

The evaluation metric we will be using in our case to compare the models will be the objective loss and is based on the validation dataset.

[ ]:
# Create a hyperparameter tuner
objective_metric_name = 'validation:objective_loss'
tuner2 <- sagemaker$tuner$HyperparameterTuner(estimator2,
                                             objective_metric_name,
                                             hyperparameter_ranges,
                                             objective_type='Minimize',
                                             max_jobs=4L,
                                             max_parallel_jobs=2L)
[ ]:
# Create a tuning job name
job_name <- paste('tune-linear', format(Sys.time(), '%Y%m%d-%H-%M-%S'), sep = '-')

# Define the data channels for train and validation datasets
input_data <- list('train' = s3_train_input,
                   'validation' = s3_valid_input)

# Train the tuner
tuner2$fit(inputs = input_data, job_name = job_name, wait=TRUE, content_type='csv') # since we are using csv files as input into the model, we need to specify content type as csv.

Calculate AUC for the test data on model 2

[ ]:
# Deploy the model into an instance of your choosing.
model_endpoint2 <- tuner2$deploy(initial_instance_count = 1L,
                                   instance_type = 'ml.t2.medium')

For inference, the linear learner algorithm supports the application/json, application/x-recordio-protobuf, and text/csv formats. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/LL-in-formats.html

[ ]:
# Specify what data formats you want the input and output of your model to look like.
model_endpoint2$serializer <- sagemaker$serializers$CSVSerializer(content_type='text/csv')
model_endpoint2$deserializer <- sagemaker$deserializers$JSONDeserializer()

In Linear Learner, the output inference files are in JSON or RecordIO formats. See https://docs.aws.amazon.com/sagemaker/latest/dg/LL-in-formats.html

When you make predictions on new data, the contents of the response data depend on the type of model you choose within Linear Learner. For regression (predictor_type='regressor'), the score is the prediction produced by the model. For classification (predictor_type='binary_classifier' or predictor_type='multiclass_classifier'), the model returns a score and also a predicted_label. The predicted_label is the class predicted by the model and the score measures the strength of that prediction. So, for binary classification, predicted_label is 0 or 1, and score is a single floating point number that indicates how strongly the algorithm believes that the label should be 1.

To interpret the score in classification problems, you have to consider the loss function used. If the loss hyperparameter value is logistic for binary classification or softmax_loss for multiclass classification, then the score can be interpreted as the probability of the corresponding class. These are the loss values used by the linear learner when the loss hyperparameter is left at its default value of auto. But if the loss is set to hinge_loss, then the score cannot be interpreted as a probability. This is because hinge loss corresponds to a Support Vector Classifier, which does not produce probability estimates. In the current example, we leave loss at its default of auto, which uses logistic loss for binary classification, so we can interpret the score as the probability of the corresponding class.

[ ]:
# Prepare the test data for input into the model
test_sample <- as.matrix(iris_test[-1])
dimnames(test_sample)[[2]] <- NULL

# Predict using the test data on the deployed model
predictions_ep <- model_endpoint2$predict(test_sample)

# Add the predictions to the test dataset.
df <- data.frame(matrix(unlist(predictions_ep$predictions), nrow=length(predictions_ep$predictions), byrow=TRUE))
df <- df %>% dplyr::rename(score = X1, predicted_label = X2)
iris_predictions_ep2 <- dplyr::bind_cols(predicted_flower = df$predicted_label,
                      iris_test)
iris_predictions_ep2

# Get the AUC (pROC::roc expects the true labels first, then the predictions)
auc(roc(response = iris_test$Species, predictor = iris_predictions_ep2$predicted_flower))
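
The cell above turns the response into a data frame by position, which assumes that score always comes before predicted_label. A slightly more defensive sketch extracts the fields by name. The response object below is a hand-written, hypothetical stand-in for the list returned by the JSON deserializer, not real endpoint output.

```r
# Hypothetical response, shaped like a deserialized Linear Learner JSON reply.
response <- list(predictions = list(
  list(score = 0.03, predicted_label = 0),
  list(score = 0.97, predicted_label = 1),
  list(predicted_label = 1, score = 0.88)   # field order may differ
))

# Pull each field out by name so that the order inside an entry does not matter.
parsed <- data.frame(
  score = vapply(response$predictions,
                 function(p) p[["score"]], numeric(1)),
  predicted_label = vapply(response$predictions,
                           function(p) p[["predicted_label"]], numeric(1))
)

print(parsed)
```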

Create model 3: KNN in SageMaker

Amazon SageMaker k-nearest neighbors (k-NN) algorithm is an index-based algorithm. It uses a non-parametric method for classification or regression. For classification problems, the algorithm queries the k points that are closest to the sample point and returns the most frequent label among them as the predicted label. For regression problems, the algorithm queries the k closest points to the sample point and returns the average of their label values as the predicted value.
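
The classification rule can be illustrated with a brute-force sketch in base R. The helper knn_predict is hypothetical and computes all distances directly; it does not build an index and is not how the SageMaker implementation works internally.

```r
# Brute-force k-NN classifier: majority vote among the k closest training rows.
knn_predict <- function(train_x, train_y, query, k = 3) {
  # Euclidean distance from the query point to every training row.
  d <- sqrt(rowSums(sweep(train_x, 2, query, "-")^2))
  nearest <- order(d)[seq_len(k)]
  votes <- table(train_y[nearest])
  as.numeric(names(votes)[which.max(votes)])
}

# Two-species subset of iris with 0/1 labels, as in this notebook.
dat <- iris[iris$Species != "virginica", ]
x <- as.matrix(dat[, 1:4])
y <- as.numeric(dat$Species) - 1

pred_setosa_like     <- knn_predict(x, y, c(5.0, 3.4, 1.5, 0.2), k = 5)
pred_versicolor_like <- knn_predict(x, y, c(6.0, 2.8, 4.5, 1.4), k = 5)

print(c(pred_setosa_like, pred_versicolor_like))
```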

Training with the k-NN algorithm has three steps: sampling, dimension reduction, and index building. Sampling reduces the size of the initial dataset so that it fits into memory. For dimension reduction, the algorithm decreases the feature dimension of the data to reduce the footprint of the k-NN model in memory and the inference latency. Two dimension reduction methods are provided: random projection and the fast Johnson-Lindenstrauss transform. Typically, you use dimension reduction for high-dimensional (d > 1000) datasets to avoid the “curse of dimensionality” that troubles the statistical analysis of data that becomes sparse as dimensionality increases. The main objective of k-NN’s training is to construct the index. The index enables efficient lookups of distances between points whose values or class labels have not yet been determined and the k nearest points to use for inference.
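
To give a feel for why random projection is usable before a nearest-neighbor search, the sketch below projects synthetic high-dimensional points onto a much smaller number of random directions and compares pairwise distances before and after. It uses a plain Gaussian projection matrix as an assumption for illustration; the data, sizes, and variable names are made up, and this is not SageMaker's implementation.

```r
set.seed(1324)

n <- 20         # number of points
d <- 2000       # original feature dimension
target <- 50    # reduced dimension

# Synthetic high-dimensional data.
x <- matrix(rnorm(n * d), nrow = n)

# Gaussian random projection, scaled so that squared distances are
# preserved in expectation.
proj  <- matrix(rnorm(d * target), nrow = d) / sqrt(target)
x_low <- x %*% proj

# Ratio of pairwise distances after and before the projection.
ratio <- as.vector(dist(x_low)) / as.vector(dist(x))
print(summary(ratio))
```

The ratios cluster around 1, which is why neighbors found in the reduced space are a reasonable approximation of neighbors in the original space.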

See https://docs.aws.amazon.com/sagemaker/latest/dg/k-nearest-neighbors.html for more information.

[ ]:
container <- sagemaker$image_uris$retrieve(framework='knn', region= session$boto_region_name, version='latest')
cat('KNN Container Image URL: ', container)
[ ]:
s3_output <- paste0('s3://', bucket, '/output_knn')
estimator3 <- sagemaker$estimator$Estimator(image_uri = container,
                                           role = role_arn,
                                           instance_count = 1L,
                                           instance_type = 'ml.m5.4xlarge',
                                           input_mode = 'File',
                                           output_path = s3_output,
                                           output_kms_key = NULL,
                                           base_job_name = NULL,
                                           sagemaker_session = NULL)

Hyperparameter dimension_reduction_target should not be set when dimension_reduction_type is set to its default value, which is None. If ‘dimension_reduction_target’ is set to a certain number without setting dimension_reduction_type, then SageMaker will ask us to remove ‘dimension_reduction_target’ from the specified hyperparameters and try again. In this tutorial, we are not performing dimensionality reduction, since we only have 4 features; so dimension_reduction_type is set to its default value of None.

[ ]:
estimator3$set_hyperparameters(
                              feature_dim = 4L,
                              sample_size = 10L,
                              predictor_type = "classifier"
                                )

Amazon SageMaker k-nearest neighbor model can be tuned with the following hyperparameters:

  • k

  • sample_size

[ ]:
# Set Hyperparameter Ranges
hyperparameter_ranges = list('k' = sagemaker$parameter$IntegerParameter(1L,10L)
                            )
[ ]:
# Create a hyperparameter tuner
objective_metric_name = 'test:accuracy'
tuner3 <- sagemaker$tuner$HyperparameterTuner(estimator3,
                                             objective_metric_name,
                                             hyperparameter_ranges,
                                             objective_type='Maximize',
                                             max_jobs=2L,
                                             max_parallel_jobs=2L)
[ ]:
# Create a tuning job name
job_name <- paste('tune-knn', format(Sys.time(), '%Y%m%d-%H-%M-%S'), sep = '-')

# Define the data channels for train and validation datasets
input_data <- list('train' = s3_train_input,
                   'test' = s3_valid_input # k-NN needs a 'test' channel; it does not work without it.
                    )

# train the tuner
tuner3$fit(inputs = input_data, job_name = job_name, wait=TRUE, content_type='text/csv;label_size=0')

Calculate AUC for the test data on model 3

[ ]:
# Deploy the model into an instance of your choosing.
model_endpoint3 <- tuner3$deploy(initial_instance_count = 1L,
                                   instance_type = 'ml.t2.medium')

[ ]:
# Specify what data formats you want the input and output of your model to look like.
model_endpoint3$serializer <- sagemaker$serializers$CSVSerializer(content_type='text/csv')
model_endpoint3$deserializer <- sagemaker$deserializers$JSONDeserializer()

In KNN, the input formats for inference are:

  • CSV

  • JSON

  • JSONLINES

  • RECORDIO

The output formats for inference are:

  • JSON

  • JSONLINES

  • Verbose JSON

  • Verbose RecordIO-ProtoBuf

Notice that there is no CSV output format for inference.

See https://docs.aws.amazon.com/sagemaker/latest/dg/kNN-inference-formats.html for more details.

When you make predictions on new data with a k-NN classifier (predictor_type='classifier'), each entry in the JSON response contains a predicted_label, which is the class predicted by the model. For binary classification, predicted_label is 0 or 1. Unlike Linear Learner, the default JSON response does not include a score.

[ ]:
# Prepare the test data for input into the model
test_sample <- as.matrix(iris_test[-1])
dimnames(test_sample)[[2]] <- NULL

# Predict using the test data on the deployed model
predictions_ep <- model_endpoint3$predict(test_sample)

We see that the output is of a deserialized JSON format.

[ ]:
predictions_ep
[ ]:
typeof(predictions_ep)
[ ]:
# Add the predictions to the test dataset.
df = data.frame(predicted_flower = unlist(predictions_ep$predictions))
iris_predictions_ep3 <- dplyr::bind_cols(predicted_flower = df$predicted_flower,
                      iris_test)
iris_predictions_ep3

# Get the AUC (pROC::roc expects the true labels first, then the predictions)
auc(roc(response = iris_test$Species, predictor = iris_predictions_ep3$predicted_flower))

Compare the AUC of 3 models for the test data

  • AUC of SageMaker XGBoost = 1

  • AUC of SageMaker Linear Learner = 0.83

  • AUC of SageMaker KNN = 1

Based on the AUC metric (the higher the better), XGBoost and KNN perform equally well and both are better than the Linear Learner. We can also explore the 3 models with other binary classification metrics such as accuracy, F1 score, and misclassification error. Comparing only the AUC, in this example, we could choose either the XGBoost model or the KNN model to move on to production and delete the other two. The deployed model of our choosing can be passed on to production to generate predictions of flower species when the user only has the sepal and petal measurements. The performance of the deployed model can also be tracked in Amazon CloudWatch.
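
The other metrics mentioned above can all be derived from the confusion matrix. Here is a small base-R sketch on made-up labels and predictions; the vectors are illustrative and are not results from the models in this notebook.

```r
# Made-up true labels and predicted labels.
actual    <- c(0, 0, 0, 0, 1, 1, 1, 1)
predicted <- c(0, 0, 0, 1, 1, 1, 1, 0)

# Confusion matrix counts.
tp <- sum(predicted == 1 & actual == 1)
tn <- sum(predicted == 0 & actual == 0)
fp <- sum(predicted == 1 & actual == 0)
fn <- sum(predicted == 0 & actual == 1)

accuracy          <- (tp + tn) / length(actual)
misclassification <- 1 - accuracy
precision         <- tp / (tp + fp)
recall            <- tp / (tp + fn)
f1                <- 2 * precision * recall / (precision + recall)

print(c(accuracy = accuracy, misclassification = misclassification, f1 = f1))
```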

Clean up

[ ]:
model_endpoint1$delete_model()
model_endpoint2$delete_model()
model_endpoint3$delete_model()

session$delete_endpoint(model_endpoint1$endpoint_name)
session$delete_endpoint(model_endpoint2$endpoint_name)
session$delete_endpoint(model_endpoint3$endpoint_name)