Who is this for?
This is an advanced topic for Android developers who want to efficiently run LLMs on-device.
What will you learn?
Upon completion of this learning path, you will be able to:
- Install the prerequisites for cross-compiling LLM inference engines for Android.
- Run LLM inference on an Android device with the Gemma 2B model using Google AI Edge's MediaPipe framework (see the sketch after this list).
- Benchmark LLM inference speed with and without KleidiAI optimizations that use the Arm i8mm (8-bit integer matrix multiplication) architecture feature.
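
For orientation, the snippet below is a minimal Kotlin sketch of what running a prompt through MediaPipe's LLM Inference API looks like. It assumes the `com.google.mediapipe:tasks-genai` dependency and a converted Gemma 2B model already pushed to the device; the model path and filename are illustrative, and `runGemmaPrompt` is a name chosen for this example:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Runs a single prompt through an on-device Gemma 2B model.
// The model path is an example: point it at wherever you pushed
// the converted .bin model on the device.
fun runGemmaPrompt(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-cpu-int4.bin") // example path
        .setMaxTokens(512) // cap on combined prompt + response tokens
        .build()

    // Create the inference engine and generate a response synchronously.
    val llm = LlmInference.createFromOptions(context, options)
    return llm.generateResponse(prompt)
}
```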
Prerequisites
Before starting, you will need the following:
- An x86_64 Linux machine running Ubuntu with approximately 500 MB of free disk space, or a Docker daemon that can build and run the provided x86_64 Dockerfile.
- An Android phone that supports the i8mm architecture feature (this learning path was tested on a Google Pixel 8 Pro); see the device check below.
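
To confirm a phone meets the i8mm prerequisite, you can look for the feature in the CPU's advertised flags. From your development machine, `adb shell cat /proc/cpuinfo` and checking the `Features` lines works; the Kotlin sketch below does the same check on-device (the function name is chosen for this example):

```kotlin
import java.io.File

// Returns true if any CPU core advertises the Arm i8mm feature.
// Splitting on whitespace avoids false positives from related
// feature names such as "svei8mm".
fun supportsI8mm(): Boolean =
    File("/proc/cpuinfo").readLines()
        .filter { it.startsWith("Features") }
        .any { line ->
            line.substringAfter(":").trim().split(Regex("\\s+")).contains("i8mm")
        }
```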