About this Learning Path

Who is this for?

This is an introductory topic for software developers and AI engineers interested in learning how to use the vLLM library on Arm servers.

What will you learn?

Upon completion of this Learning Path, you will be able to:

  • Build vLLM from source on an Arm server.
  • Download a Qwen LLM from Hugging Face.
  • Run local batch inference using vLLM.
  • Create and interact with an OpenAI-compatible server provided by vLLM on your Arm server.
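As a preview of the last outcome above, here is a minimal sketch of the request payload that vLLM's OpenAI-compatible server accepts on its `/v1/chat/completions` endpoint. The model name and server address are assumptions for illustration; use whichever model you pass when starting the server.

```python
import json

# Request body for the OpenAI-compatible /v1/chat/completions endpoint
# exposed by a vLLM server. The model name below is an assumption;
# substitute the model you actually serve.
payload = {
    "model": "Qwen/Qwen2.5-0.5B-Instruct",
    "messages": [
        {"role": "user", "content": "What is an Arm server?"}
    ],
    "max_tokens": 128,
}

body = json.dumps(payload)

# Send it with any HTTP client, for example (assuming the server
# listens on the default localhost:8000):
#   curl http://localhost:8000/v1/chat/completions \
#        -H "Content-Type: application/json" \
#        -d "$body"
print(body)
```

Because the server speaks the same wire format as the OpenAI API, existing OpenAI client libraries can be pointed at your Arm server without code changes beyond the base URL.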

Prerequisites

Before starting, you will need the following:

  • An Arm-based instance from a cloud service provider, or a local Arm Linux computer with at least 8 CPUs and 16 GB RAM.