What you've learned

You should now know how to:

  • Install the prerequisites for cross-compiling new inference engines for Android.
  • Run LLM inference on an Android device with the Gemma 2B model using Google AI Edge's MediaPipe framework.
  • Benchmark LLM inference speed with and without KleidiAI's use of the Arm i8mm (8-bit integer matrix multiply) architecture feature.
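
Before comparing benchmarks with and without KleidiAI, it is useful to confirm that the device's CPU actually advertises the i8mm feature. A minimal sketch of that check is below; the `features` string is an illustrative sample of a `/proc/cpuinfo` Features line, which on a real device you would capture with `adb shell cat /proc/cpuinfo`:

```shell
#!/bin/sh
# Sample Features line as it might appear in /proc/cpuinfo on an
# Armv8.6-A device (illustrative only; capture the real line via
#   adb shell cat /proc/cpuinfo
# on a connected device).
features="Features : fp asimd aes sha1 sha2 crc32 atomics asimddp i8mm"

# grep -qw matches i8mm as a whole word and exits 0 if found.
if echo "$features" | grep -qw i8mm; then
  echo "i8mm supported"
else
  echo "i8mm not supported"
fi
```

If the Features line does not list `i8mm`, the KleidiAI micro-kernels that rely on 8-bit integer matrix multiplication cannot be used, and the two benchmark runs will show similar results.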

Knowledge Check

In which type of benchmark are the KleidiAI performance improvements most noticeable?

What is MediaPipe?

Does Android NDK r21 include support for i8mm instructions?
