About this Learning Path

Who is this for?

This is an advanced topic for Android developers who want to efficiently run LLMs on-device.

What will you learn?

Upon completion of this learning path, you will be able to:

  • Install the prerequisites for cross-compiling new inference engines for Android.
  • Run LLM inference on an Android device with the Gemma 2B model using Google AI Edge's MediaPipe framework.
  • Benchmark LLM inference speed with and without KleidiAI's use of the Arm i8mm (8-bit integer matrix multiply) processor feature.

Prerequisites

Before starting, you will need the following:

  • An x86_64 Linux machine running Ubuntu with approximately 500 MB of free space, or a Docker daemon that can build and run a provided x86_64 Dockerfile.
  • An Android phone that supports the i8mm feature (tested on a Google Pixel 8 Pro).
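You can confirm that your phone's CPU advertises i8mm before starting. A sketch of one way to do this, assuming adb is installed and USB debugging is enabled on the device: read /proc/cpuinfo over adb and look for i8mm in the Features line. The has_i8mm helper name is illustrative, not part of any tool.

```shell
# has_i8mm: succeeds if the cpuinfo text on stdin lists the i8mm feature
has_i8mm() { grep -qw i8mm; }

# On a connected device (requires adb with USB debugging enabled):
# adb shell cat /proc/cpuinfo | has_i8mm && echo "i8mm supported"
```

If the check fails, the KleidiAI i8mm-optimized kernels cannot run on that device, so the benchmarking comparison later in this Learning Path will not be possible.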