Reinforcement Learning

Get started with RL
- Mountain Car with Amazon SageMaker RL
- PatientMountainCar
- PatientContinuousMountainCar
- Imports
- Setup S3 bucket
- Define Variables
- Configure settings
- Create an IAM role
- Install docker for local mode
- Plot metrics for training job
- Visualize the rendered gifs
- Load checkpointed model
- Run the evaluation step
- Visualize the output
- Clean up endpoint
- Cart-pole Balancing Model with Amazon SageMaker and Ray
- Roboschool simulations of physical robotics with Amazon SageMaker
Cart pole
Cart-pole balancing resembles balancing a broom upright on your hand. The broom is the “pole”, and your hand is replaced by a “cart” that moves back and forth on a linear track. This simplified example works in two dimensions: the cart can only move back and forth along a line, and the pole can only fall forward or backward, not to the sides. These examples use PyTorch or TensorFlow with SageMaker RL to solve the cart-pole problem.
- Cart-pole Balancing Model with Amazon SageMaker and Coach library
- Training Batch Reinforcement Learning Policies with Amazon SageMaker RL and Coach library
- Cart-pole Balancing Model with Amazon SageMaker on SageMaker Managed Spot Training
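To make the cart-pole setup concrete, here is a minimal sketch of the two-dimensional dynamics in plain Python. It is not taken from the notebooks above: the constants, the failure thresholds, and the names `cartpole_step` and `run_episode` are assumptions chosen for illustration, following the usual textbook formulation with simple Euler integration.

```python
import math

# Assumed textbook constants (not taken from the SageMaker notebooks).
GRAVITY = 9.8            # m/s^2
CART_MASS = 1.0          # kg
POLE_MASS = 0.1          # kg
POLE_HALF_LENGTH = 0.5   # m
FORCE_MAG = 10.0         # N applied to the cart, left or right
DT = 0.02                # seconds per simulation step
MAX_ANGLE = math.radians(12)  # episode ends if the pole tilts past this
MAX_POSITION = 2.4            # episode ends if the cart leaves the track


def cartpole_step(state, push_right):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if push_right else -FORCE_MAG
    total_mass = CART_MASS + POLE_MASS
    pole_mass_length = POLE_MASS * POLE_HALF_LENGTH
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )


def run_episode(policy, max_steps=200, start=(0.0, 0.0, 0.05, 0.0)):
    """Return how many steps the policy keeps the pole up and the cart on track."""
    state = start
    for step in range(max_steps):
        state = cartpole_step(state, policy(state))
        x, _, theta, _ = state
        if abs(x) > MAX_POSITION or abs(theta) > MAX_ANGLE:
            return step + 1
    return max_steps


if __name__ == "__main__":
    # A naive policy that ignores the state and always pushes right.
    print("always push right:", run_episode(lambda s: True))
    # A hand-written heuristic: push toward the side the pole is falling to.
    print("follow the lean:", run_episode(lambda s: s[2] + s[3] > 0))
```

An RL agent replaces the hand-written policy with one learned from the reward signal, typically one unit of reward for each step the pole stays up.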
Contextual bandits
Explore a set of actions with contextual bandit algorithms in SageMaker.
- Contextual Bandits with Parametric Actions – Experimentation Mode
- What is Experimentation Mode?
- Imports
- Setup S3 bucket
- Configure where training happens
- Create an IAM role
- Simulation environment (from MovieLens data)
- Create a SageMaker model for inference
- 1. Batch Transform
- Generating test dataset for inference
- Download batch transform results
- 2. Real-time inference
- Clean Up endpoint
- Contextual Bandits with Amazon SageMaker RL
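As a rough illustration of what a contextual bandit does, the sketch below implements a tabular epsilon-greedy learner in plain Python. This is a deliberately simple stand-in, not the algorithm or API used in the notebooks above; the class name `EpsilonGreedyBandit`, the `simulate` helper, and the toy reward rule are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Keeps a running mean reward per (context, action) pair."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(lambda: [0] * n_actions)
        self.values = defaultdict(lambda: [0.0] * n_actions)

    def best_action(self, context):
        """Greedy action for this context (ties go to the lowest index)."""
        values = self.values[context]
        return max(range(self.n_actions), key=values.__getitem__)

    def choose(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.best_action(context)

    def update(self, context, action, reward):
        """Incrementally update the mean reward estimate."""
        self.counts[context][action] += 1
        n = self.counts[context][action]
        self.values[context][action] += (reward - self.values[context][action]) / n


def simulate(bandit, best_action_for, rounds=1000):
    """Toy environment: reward 1.0 when the action matches the context's best action."""
    contexts = list(best_action_for)
    total = 0.0
    for _ in range(rounds):
        context = bandit.rng.choice(contexts)
        action = bandit.choose(context)
        reward = 1.0 if action == best_action_for[context] else 0.0
        bandit.update(context, action, reward)
        total += reward
    return total / rounds


if __name__ == "__main__":
    bandit = EpsilonGreedyBandit(n_actions=3, epsilon=0.2)
    print("average reward:", simulate(bandit, {"sci-fi fan": 2, "drama fan": 0}))
```

The loop of choosing an action, observing a reward, and updating the estimate is the same one a production system runs, with the table replaced by a learned model over richer context features.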
Roboschool
Roboschool is a physics simulator that is commonly used to train RL policies for robotic systems.
- Tune hyperparameters for your RL training job
- Training Roboschool agents using distributed RL training across multiple nodes with Amazon SageMaker
- Roboschool simulations training with stable baselines on AWS SageMaker RL
Use cases
Autoscaling
This example demonstrates how to use RL to scale a production service by adding and removing resources (for example, servers or EC2 instances) in response to a dynamic load.
- Autoscaling a service with Amazon SageMaker
- Problem Statement
- Using Amazon SageMaker for RL
- Pre-requisites
- Set up the environment
- Configure the presets for RL algorithm
- Write the Training Code
- Train the RL model using the Python SDK Script mode
- Store intermediate training output and model checkpoints
- Visualization
- Evaluation of RL models
- Hosting
Energy
Training an RL algorithm on a real HVAC system can take a long time to converge and can lead to hazardous settings as the agent explores its state space. This example uses the EnergyPlus simulator to show how you can train an HVAC optimization RL model with Amazon SageMaker.
- HVAC with Amazon SageMaker RL
Game play
Use RL to train an agent to play in a Unity3D environment.
Game server
A reinforcement learning-based system using SageMaker Autopilot and SageMaker RL that learns to allocate resources in response to player usage patterns.
Knapsack problem
Use SageMaker RL to address the knapsack problem, a canonical operations research problem.
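For reference, the 0/1 variant of the knapsack problem can be solved exactly with classic dynamic programming when capacities are small integers. The sketch below is that standard baseline, not the RL approach this example takes; the function name `knapsack_max_value` and the sample items are illustrative assumptions. An exact solver like this can serve as a yardstick for a learned policy on small instances.

```python
def knapsack_max_value(values, weights, capacity):
    """Best total value of items that fit in `capacity`, each item used at most once."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    # best[c] holds the best value achievable with total weight <= c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Iterate capacities downward so each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


if __name__ == "__main__":
    # Hypothetical items: the best choice is the second and third (100 + 120).
    print(knapsack_max_value([60, 100, 120], [10, 20, 30], 50))
```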
Object tracker
Use RL to train a TurtleBot object tracker using Amazon SageMaker Reinforcement Learning and AWS RoboMaker.
Network compression
Network-to-network compression via policy gradient reinforcement learning.
Portfolio management
Use SageMaker RL to manage a stock portfolio by continuously reallocating several stocks.
- Portfolio Management with Amazon SageMaker RL
- Problem Statement
- Dataset
- Using reinforcement learning on Amazon SageMaker RL
- Pre-requisites
- Set up the environment
- Configure the presets for RL algorithm
- Write the Training Code
- Train the RL model using the Python SDK Script mode
- Store intermediate training output and model checkpoints
- Visualization
- Load the checkpointed models for evaluation
- Risk Disclaimer (for live-trading)
Resource allocation
Solve resource allocation problems with SageMaker RL.
- Solving Bin Packing Problem with Amazon SageMaker RL
- Solving Multi-Period Newsvendor Problem with Amazon SageMaker RL
- Solving Vehicle Routing Problem with Amazon SageMaker RL
- NOTE: This notebook only works with older versions of Ray and TensorFlow. We are not planning to upgrade this notebook to use the latest Ray at the moment.
- Problem Statement
- Using Amazon SageMaker for RL
- Pre-requisites
- Set up the environment
- Write the training code
- Train the RL model using the Python SDK Script mode
- Visualization
- Training Results
Tic-tac-toe
Play global thermonuclear war with a computer.
Stock trading
Try stock trading with SageMaker RL.
- Stock Trading with Amazon SageMaker RL
- Problem Statement
- Dataset
- Using Amazon SageMaker for RL
- Pre-requisites
- Set up the environment
- Configure the presets for RL algorithm
- Write the Training Code
- Train the RL model using the Python SDK Script mode
- Store intermediate training output and model checkpoints
- Visualization
- Load the checkpointed models for evaluation
- Risk Disclaimer (for live-trading)
Traveling salesman problem
Use SageMaker RL to solve this classic problem with a twist: a restaurant delivery service on a 2D gridworld.
- Traveling Salesman Problem with Reinforcement Learning
- Description of Problem
- Why Reinforcement Learning?
- Easy Version of TSP
- Using AWS SageMaker for RL
- Medium version of TSP
- Using AWS SageMaker for RL
- Visualize, Compare with Baseline and Evaluate
- Vehicle Routing Problem with Reinforcement Learning
- Using AWS SageMaker RL
- Visualize, Compare with Baseline and Evaluate