About this Learning Path

Who is this for?

This is an introductory topic for developers interested in running LLMs on Arm-based servers.

What will you learn?

Upon completion of this learning path, you will be able to:

  • Download and build llama.cpp on your Arm server.
  • Download a pre-quantized Llama 3.1 model from Hugging Face.
  • Run the pre-quantized model on your Arm CPU and measure the performance.

Prerequisites

Before starting, you will need the following:

  • An AWS Graviton4 r8g.16xlarge instance to test Arm performance optimizations, or any Arm-based instance from a cloud service provider, or an on-premise Arm server.
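Before installing anything, you can confirm that your instance is actually running an Arm CPU. This is a small sketch using standard Linux tools; on a Graviton4 instance the architecture should report as `aarch64`.

```shell
# Print the machine architecture: expect "aarch64" on a 64-bit Arm server
uname -m

# lscpu additionally shows the vendor and CPU feature flags (for example
# sve, i8mm, and dotprod) that optimized Arm kernels can take advantage of
lscpu | head -n 10
```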