About this Learning Path

Who is this for?

This is an introductory topic for software developers and AI engineers interested in learning how to use vLLM, an open-source inference and serving engine for large language models (LLMs), on Arm servers.

What will you learn?

Upon completion of this learning path, you will be able to:

  • Build vLLM from source on an Arm server.
  • Download a Qwen LLM from Hugging Face.
  • Run local batch inference using vLLM.
  • Launch and interact with vLLM's OpenAI-compatible server on your Arm server.

Prerequisites

Before starting, you will need the following:

  • An Arm-based instance from a cloud service provider, or a local Arm Linux computer with at least 8 CPUs and 16 GB RAM.
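You can confirm that your machine meets these requirements before starting. The commands below are a minimal sanity check, assuming a Linux shell:

```shell
# Check the number of CPUs (8 or more recommended)
nproc

# Check total memory in GB (16 GB or more recommended)
free -g

# Confirm the architecture: an Arm-based machine prints aarch64
uname -m
```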