What you've learned

You should now know how to:

  • Use Docker to run Raspberry Pi OS on an Arm Linux server.
  • Compile a Large Language Model (LLM) using ExecuTorch.
  • Deploy the Llama 3 model on an edge device.
  • Describe how to run Llama 3 on a Raspberry Pi 5 using ExecuTorch.
  • Describe techniques for running large language models in an embedded environment.

Knowledge Check

What is the benefit of quantization?

What quantization scheme does Llama 3 require to run on an embedded device such as the Raspberry Pi 5?

True or false: dynamic quantization happens at runtime.
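As a refresher for the questions above, the core benefit of quantization can be shown in a minimal sketch. This is plain Python, not the ExecuTorch or PyTorch quantization API: it symmetrically maps float32 weights to int8 with a scale computed at runtime (the idea behind dynamic quantization), cutting storage from 4 bytes to 1 byte per weight.

```python
import array

def quantize_int8(weights):
    """Symmetric int8 quantization: scale is derived from the max magnitude."""
    scale = max(abs(w) for w in weights) / 127.0
    q = array.array('b', (round(w / scale) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)

# int8 uses 1 byte per value vs 4 bytes for float32: a 4x memory reduction,
# which is what makes LLM weights fit on devices like the Raspberry Pi 5.
print(q.itemsize, array.array('f', weights).itemsize)  # 1 4

restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(max_err <= scale / 2 + 1e-9)  # quantization error is bounded by half a step
```

The "dynamic" part is that `scale` is computed from the actual values at runtime rather than being fixed ahead of time during a separate calibration pass.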

