About this Learning Path

Who is this for?

This is an introductory topic for software developers and AI engineers interested in learning how to use the vLLM library on Arm servers.

What will you learn?

Upon completion of this Learning Path, you will be able to:

  • Build vLLM from source on an Arm server.
  • Use a Qwen LLM from Hugging Face.
  • Run local batch inference using vLLM.
  • Create and interact with an OpenAI-compatible server provided by vLLM on your Arm server.

Prerequisites

Before starting, you will need the following:

  • An Arm-based Linux cloud instance or a local Arm Linux computer, in either case running Ubuntu 24.04 with at least 8 CPUs, 16 GB of RAM, and 50 GB of disk storage.
  • A system that includes support for BFloat16.
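
If you want to confirm these prerequisites before you begin, the following is a minimal sketch in Python rather than an official check. It assumes that BFloat16 support shows up as the `bf16` flag on the `Features` line of `/proc/cpuinfo`, as it does on Arm Linux, that sizes are counted in decimal gigabytes, and that the 50 GB applies to free space on the root filesystem. The function names `parse_cpu_features`, `meets_minimums`, and `probe_system` are hypothetical helpers written for this example.

```python
import os
import shutil


def parse_cpu_features(cpuinfo_text):
    """Collect the flags from every 'Features' line of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def meets_minimums(cpus, ram_gb, disk_gb, features):
    """Compare measured values with the minimums listed in the prerequisites."""
    return {
        "cpus (>= 8)": cpus >= 8,
        "ram (>= 16 GB)": ram_gb >= 16,
        "disk (>= 50 GB)": disk_gb >= 50,
        "bf16": "bf16" in features,
    }


def probe_system(cpuinfo_path="/proc/cpuinfo"):
    """Read the current Linux system's values (assumption: Linux with /proc)."""
    cpus = os.cpu_count() or 0
    # Physical RAM in decimal GB; the kernel reserves some memory, so the
    # reported total can be slightly below the nominal size.
    ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    # Assumption: the requirement refers to free space on the root filesystem.
    disk_gb = shutil.disk_usage("/").free / 1e9
    features = set()
    if os.path.exists(cpuinfo_path):
        with open(cpuinfo_path) as f:
            features = parse_cpu_features(f.read())
    return meets_minimums(cpus, ram_gb, disk_gb, features)


if __name__ == "__main__":
    for name, ok in probe_system().items():
        print(f"{name}: {'ok' if ok else 'NOT met'}")
```

Run the script on the target machine. Any line reported as `NOT met` points to a prerequisite to revisit before you continue.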