Deploy a Large Language Model (LLM) chatbot with llama.cpp using KleidiAI on Arm servers

Log an issue

Fork and edit

Discuss on Discord

About this Learning Path

Skill level:

Introductory

Reading time:

30 min

Last updated:

02 Jun 2025

Authors:	Pareena Verma, Arm Jason Andrews, Arm Zach Lasiuk
Arm IP:	Neoverse
Tags:	ML Linux LLM GenAI Python Demo Hugging Face

Authors:

Arm IP:

Tags:

Linux

LLM

GenAI

Python

Demo

Hugging Face

This is an introductory topic for developers interested in running LLMs on Arm-based servers.

Upon completion of this learning path, you will be able to:

Before starting, you will need the following:

An AWS Graviton4 r8g.16xlarge instance to test Arm performance optimizations, or any Arm based instance from a cloud service provider or an on-premise Arm server.