What you've learned

You should now know how to:

  • Set up an ExecuTorch development environment.
  • Describe how ExecuTorch uses XNNPACK kernels to accelerate inference on Arm-based platforms.
  • Describe how 4-bit groupwise post-training quantization (PTQ) reduces model size without significantly sacrificing model accuracy.
  • Build and run Llama models using ExecuTorch on your development machine.
  • Build and run an Android Chat app with different Llama models using ExecuTorch on an Arm-based smartphone.
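
As a refresher on the quantization point above, here is a minimal sketch of symmetric 4-bit groupwise quantization in plain Python. It is an illustration of the general idea, not ExecuTorch's implementation: the function names, the choice of mapping the largest magnitude in each group to 7, and the 2-byte scale size are all assumptions made for this example.

```python
from typing import List, Tuple


def quantize_groupwise_4bit(
    weights: List[float], group_size: int
) -> Tuple[List[int], List[float]]:
    """Quantize weights to signed 4-bit integers, one scale per group.

    Hypothetical helper for illustration only.
    """
    if group_size <= 0 or len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a positive multiple of group_size")
    q_values: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Assumption: map the largest magnitude in the group to 7.
        # An all-zero group gets a scale of 1.0 to avoid dividing by zero.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            # Clamp to the signed 4-bit range [-8, 7].
            q_values.append(max(-8, min(7, round(w / scale))))
    return q_values, scales


def dequantize_groupwise(
    q_values: List[int], scales: List[float], group_size: int
) -> List[float]:
    """Recover approximate weights using each group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(q_values)]


def packed_size_bytes(num_weights: int, group_size: int, scale_bytes: int = 2) -> int:
    """Estimate storage: two 4-bit values per byte plus one scale per group.

    Assumption: each scale is stored in scale_bytes bytes (2 for a 16-bit float).
    """
    return (num_weights + 1) // 2 + (num_weights // group_size) * scale_bytes


if __name__ == "__main__":
    weights = [3.5, -1.0, 0.5, 0.0, 7.0, -3.0, 1.0, 0.0]
    q, s = quantize_groupwise_4bit(weights, group_size=4)
    print(q, s)
    print(dequantize_groupwise(q, s, group_size=4))
    print(packed_size_bytes(1024, 128), "bytes vs", 1024 * 4, "bytes in float32")
```

Because each group carries its own scale, an outlier weight only coarsens the values in its own group, which is why accuracy tends to hold up better than with a single scale for the whole tensor. Smaller groups reduce the error further at the cost of storing more scales.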

Knowledge Check

What is ExecuTorch?

What is Llama?

Which quantization scheme did you use for the Android app?