About this Learning Path

Who is this for?

This is an advanced topic for ML developers who want to reduce inference latency and memory bandwidth usage by exporting INT8-quantized models to the `.vgf` file format using the ExecuTorch Arm backend.

What will you learn?

Upon completion of this Learning Path, you will be able to:

  • Explain when to use post-training quantization (PTQ) versus quantization-aware training (QAT)
  • Prepare and quantize a PyTorch model using TorchAO PT2E quantization APIs
  • Export the quantized model to TOSA and generate a model artifact with the ExecuTorch Arm backend
  • Validate the exported graph by visualizing it using Google's Model Explorer

Prerequisites

Before starting, you will need the following:

  • Basic PyTorch model training and evaluation experience
  • A development machine with Python 3.10 or later, with PyTorch and ExecuTorch installed