Examples of how to use SageMaker Debugger.

Profiling System Bottlenecks and Framework Operators

Debugger provides the following profiling features:

  • Monitoring system bottlenecks – Monitor the utilization of system resources such as CPU, GPU, memory, network, and data I/O. This feature is framework- and model-agnostic and is available for any training job in SageMaker.

  • Profiling deep learning framework operations – Profile operations of the TensorFlow and PyTorch frameworks, collecting step durations, data loader activity, forward and backward operations, Python profiling metrics, and framework-specific metrics.
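
As a rough illustration of the system monitoring feature above, the sketch below builds the profiler fragment of a training job request using only the Python standard library. It is a minimal sketch under stated assumptions, not a definitive implementation: the field names S3OutputPath and ProfilingIntervalInMillis and the list of allowed intervals reflect the CreateTrainingJob ProfilerConfig structure as understood here and should be checked against the current API reference. The helper build_profiler_config and the bucket name are hypothetical.

```python
# Sketch: assemble the ProfilerConfig fragment of a training job request.
# ASSUMPTION: field names and allowed intervals mirror the SageMaker
# CreateTrainingJob ProfilerConfig structure; verify before relying on them.

# Sampling intervals (milliseconds) assumed to be accepted for system metrics.
ALLOWED_INTERVALS_MS = (100, 200, 500, 1000, 5000, 60000)


def build_profiler_config(s3_output_path, interval_ms=500):
    """Return a dict describing where and how often to record system metrics.

    build_profiler_config is a hypothetical helper, not part of any AWS SDK.
    """
    if not s3_output_path.startswith("s3://"):
        raise ValueError("s3_output_path must be an S3 URI")
    if interval_ms not in ALLOWED_INTERVALS_MS:
        raise ValueError(
            f"interval_ms must be one of {ALLOWED_INTERVALS_MS}, got {interval_ms}"
        )
    return {
        "S3OutputPath": s3_output_path,
        "ProfilingIntervalInMillis": interval_ms,
    }


# "example-bucket" is a placeholder, not a real bucket.
config = build_profiler_config("s3://example-bucket/profiler-output", interval_ms=500)
print(config)
```

The resulting dictionary would be passed as the profiler section of the training job request. Framework-level profiling for TensorFlow and PyTorch takes additional parameters that are not shown here.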