Who is this for?
This Learning Path is for developers and ML engineers who want to deploy Arcee's AFM-4.5B small language model on AWS Graviton4 instances using Llama.cpp.
What will you learn?
Upon completion of this Learning Path, you will be able to:
- Launch an Arm-based EC2 instance on AWS Graviton4
- Build and install Llama.cpp from source
- Download and quantize the AFM-4.5B model from Hugging Face
- Run inference on the quantized model using Llama.cpp
- Evaluate model quality by measuring perplexity
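As a preview of the workflow, the sketch below shows the kind of Llama.cpp commands this Learning Path walks through. The Hugging Face repository name, file names, and the Q4_0 quantization type are illustrative assumptions here; each step is covered in detail in the following sections.

```bash
# Illustrative outline only -- repo name, file names, and quantization type are assumptions
# Build Llama.cpp from source
cmake -B build && cmake --build build --config Release

# Download AFM-4.5B from Hugging Face and convert it to GGUF (repo name assumed)
huggingface-cli download arcee-ai/AFM-4.5B --local-dir afm-4.5b
python convert_hf_to_gguf.py afm-4.5b --outfile afm-4.5b-f16.gguf

# Quantize the model and run a quick inference test
./build/bin/llama-quantize afm-4.5b-f16.gguf afm-4.5b-q4_0.gguf Q4_0
./build/bin/llama-cli -m afm-4.5b-q4_0.gguf -p "Hello" -n 64

# Evaluate model quality by measuring perplexity on a test file
./build/bin/llama-perplexity -m afm-4.5b-q4_0.gguf -f wiki.test.raw
```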
Prerequisites
Before starting, you will need the following:
- An AWS account with permission to launch Graviton4 (`c8g.4xlarge` or larger) instances
- Basic familiarity with Linux and SSH