LiteRT provides a standalone benchmarking utility called benchmark_model for measuring the performance of LiteRT models.
In this section, you will build two versions of the benchmark tool: one with KleidiAI SME2 micro-kernels enabled and one with them disabled. Comparing the two demonstrates the performance gains provided by SME2 acceleration.
First, clone the LiteRT repository:
cd $WORKSPACE
git clone https://github.com/google-ai-edge/LiteRT.git
Because LiteRT integrates KleidiAI through XNNPACK (an open-source library providing highly optimized neural-network operators), you must build LiteRT from source to enable SME2 micro-kernels.
Next, set up your Android build environment using Docker on your Linux development machine. Google provides a Dockerfile that installs the toolchain needed for TensorFlow Lite (TFLite)/LiteRT Android builds.
Download the Dockerfile:
wget https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/tools/tflite-android.Dockerfile
Build the Docker image:
docker build . -t tflite-builder -f tflite-android.Dockerfile
The Docker image includes Bazel, Android Native Development Kit (NDK), CMake, toolchains, and Python required for cross-compiling Android binaries.
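To confirm that the image was created, you can list it by name; this is an optional sanity check:

docker images tflite-builder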
Now, install the Android Software Development Kit (SDK) and NDK components inside the container.
Launch the Docker container:
docker run -it -v $PWD:/host_dir tflite-builder bash
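Inside the container, you can optionally verify that the preinstalled tools described above are on the PATH and that the Android version variables used by the next command are set. These variables are assumed to be defined by the Dockerfile:

bazel --version
cmake --version
echo "build-tools: ${ANDROID_BUILD_TOOLS_VERSION}, API level: ${ANDROID_API_LEVEL}"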
Install Android platform tools:
sdkmanager \
  "build-tools;${ANDROID_BUILD_TOOLS_VERSION}" \
  "platform-tools" \
  "platforms;android-${ANDROID_API_LEVEL}"
Configure LiteRT build options inside your running container:
cd /host_dir/LiteRT
./configure
Use default values for all prompts except when asked:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]
Type y and press Enter.
LiteRT’s configuration script will detect SDK and NDK paths, set toolchain versions, configure the Android Application Binary Interface (ABI) to arm64-v8a, and initialize Bazel workspace rules.
Now, you can build the benchmark tool with KleidiAI and SME2 enabled.
Enable XNNPACK, quantization paths, and SME2 acceleration:
export BENCHMARK_TOOL_PATH="litert/tools:benchmark_model"
export XNNPACK_OPTIONS="--define=tflite_with_xnnpack=true \
  --define=tflite_with_xnnpack_qs8=true \
  --define=tflite_with_xnnpack_qu8=true \
  --define=tflite_with_xnnpack_dynamic_fully_connected=true \
  --define=xnn_enable_arm_sme=true \
  --define=xnn_enable_arm_sme2=true \
  --define=xnn_enable_kleidiai=true"
Build for Android:
bazel build -c opt --config=android_arm64 \
  ${XNNPACK_OPTIONS} "${BENCHMARK_TOOL_PATH}" \
  --repo_env=HERMETIC_PYTHON_VERSION=3.12
This build enables the KleidiAI and SME2 micro-kernels integrated into XNNPACK and produces an Android binary at:
bazel-bin/litert/tools/benchmark_model
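The baseline build in the next step writes to this same output path, so consider keeping a copy of the SME2-enabled binary under a separate name before continuing. The name benchmark_model_sme2 and the use of /host_dir (the mounted workspace directory) are illustrative choices, not part of the build itself:

# Optional: keep the SME2-enabled binary so the next build does not overwrite it
cp bazel-bin/litert/tools/benchmark_model /host_dir/benchmark_model_sme2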
To compare the performance of the KleidiAI SME2 implementation against XNNPACK’s original implementation, build another version of the LiteRT benchmark tool without KleidiAI and SME2 enabled.
Set the build options to disable SME2 and KleidiAI:
export BENCHMARK_TOOL_PATH="litert/tools:benchmark_model"
export XNNPACK_OPTIONS="--define=tflite_with_xnnpack=true \
  --define=tflite_with_xnnpack_qs8=true \
  --define=tflite_with_xnnpack_qu8=true \
  --define=tflite_with_xnnpack_dynamic_fully_connected=true \
  --define=xnn_enable_arm_sme=false \
  --define=xnn_enable_arm_sme2=false \
  --define=xnn_enable_kleidiai=false"
Then rebuild:
bazel build -c opt --config=android_arm64 \
  ${XNNPACK_OPTIONS} "${BENCHMARK_TOOL_PATH}" \
  --repo_env=HERMETIC_PYTHON_VERSION=3.12
This build of benchmark_model disables all SME2 micro-kernels, so XNNPACK falls back to its NEON or SVE2 kernels.
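As before, you may want to keep this baseline binary under its own illustrative name so that both versions remain available for comparison:

# Optional: keep the baseline (SME2-disabled) binary alongside the SME2 build
cp bazel-bin/litert/tools/benchmark_model /host_dir/benchmark_model_nosme2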
You can then use Android Debug Bridge (ADB) to push the benchmark tool to your Android device:
adb push bazel-bin/litert/tools/benchmark_model /data/local/tmp/
adb shell chmod +x /data/local/tmp/benchmark_model
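If you kept copies of the two builds under the illustrative names used earlier, they appear in your workspace directory on the host (it is mounted at /host_dir in the container), and you can push both to the device in the same way:

adb push benchmark_model_sme2 /data/local/tmp/
adb push benchmark_model_nosme2 /data/local/tmp/
adb shell chmod +x /data/local/tmp/benchmark_model_sme2 /data/local/tmp/benchmark_model_nosme2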
You have now built both versions of the LiteRT benchmark tool. You are ready to benchmark and compare SME2-accelerated and baseline performance on your Arm-based Android device.