About this Learning Path

Who is this for?

This Learning Path is for developers and ML engineers who want to deploy Arcee's AFM-4.5B small language model on Google Cloud Axion instances using Llama.cpp.

What will you learn?

Upon completion of this Learning Path, you will be able to:

  • Launch an Arm-based Compute Engine instance powered by Google Axion
  • Build and install Llama.cpp from source
  • Download and quantize the AFM-4.5B model from Hugging Face
  • Run inference on the quantized model using Llama.cpp
  • Evaluate model quality by measuring perplexity

Prerequisites

Before starting, you will need the following:

  • A Google Cloud account with permission to launch Axion-based C4A instances (c4a-standard-16 or larger)
  • Basic familiarity with Linux and SSH