Contents:

- Introduction
- Run multimodal inference with MNN on Armv9
- Build MNN and prepare an Omni model on Armv9
- Validate text-only inference with an Omni model on Armv9
- Run a vision retail shelf audit with MNN Omni
- Convert spoken restock notes into structured tickets with MNN Omni
- Build a single-shot multimodal restock ticket with MNN Omni
- Next Steps
In this section, you’ll build MNN natively on your Armv9 Linux system and verify that the llm_demo binary can load a prebuilt Omni MNN model package. This sets up everything needed for the text, vision, and audio demos in later sections.
This section uses a native CPU-only MNN build on Armv9 — a deliberate design choice, not a fallback. The goal is to show how a compact, reproducible, deployment-friendly software stack can run directly on an Armv9 CPU without depending on a discrete GPU or separate accelerator.
At the end of this section, you’ll have:

- A compiled llm_demo binary
- A downloaded Omni model package with its config.json

Create a working directory under your home folder:
mkdir -p ~/mnn
cd ~/mnn
Building, running inference, and deploying all happen directly on the Armv9 device. There’s no cross-compilation involved. This keeps the toolchain simple, eliminates environment drift between build and target, and means any library or configuration issue you encounter is the same one you’d hit in production.
Building on the target also makes it straightforward to confirm that the binary, shared libraries, and model assets all resolve correctly in the same environment where you will run the model.
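Before building, you can confirm that you’re on a 64-bit Arm host. On an Armv9 Linux system, `uname -m` reports `aarch64`; the optional `lscpu` filter lists Armv9-relevant CPU features such as SVE2 and I8MM if your kernel exposes them:

```shell
# Print the machine architecture; an Armv9 Linux system reports aarch64
uname -m

# Optionally list Armv9-relevant CPU features (may print nothing on
# older cores or restricted kernels)
lscpu | grep -iE 'sve|i8mm' || true
```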
Clone the MNN repository:
git clone https://github.com/alibaba/MNN.git
cd MNN
Install the required build dependencies:
sudo apt update
sudo apt install -y build-essential gcc g++ cmake
Configure and build MNN with LLM, audio, and Omni support enabled:
mkdir build && cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DMNN_BUILD_SHARED=ON \
-DMNN_BUILD_LLM=ON \
-DMNN_BUILD_AUDIO=ON \
-DMNN_BUILD_LLM_OMNI=ON \
-DMNN_LOW_MEMORY=ON \
-DMNN_KLEIDIAI=ON
make -j$(nproc)
The most important CMake options are:
- MNN_BUILD_LLM=ON to enable LLM support required by llm_demo
- MNN_BUILD_AUDIO=ON to enable the audio components used by the Omni model
- MNN_BUILD_LLM_OMNI=ON to enable multimodal Omni support
- MNN_LOW_MEMORY=ON to prefer lower-memory runtime settings where available
- MNN_KLEIDIAI=ON to enable Arm-specific optimizations through KleidiAI

Among these options, MNN_KLEIDIAI=ON is the most important Arm-specific build flag in this workflow. It enables Arm-focused optimizations through KleidiAI, making it especially relevant when you want to validate efficient local inference on Armv9 CPUs.
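You can optionally confirm that these options were recorded by inspecting the CMake cache in the build directory. CMake stores each `-D` option as a cache entry, so a quick grep shows whether the configure step picked them up:

```shell
# Run from ~/mnn/MNN/build after the cmake configure step completes;
# each matching option should show :BOOL=ON
grep -E 'MNN_BUILD_LLM|MNN_BUILD_AUDIO|MNN_LOW_MEMORY|MNN_KLEIDIAI' CMakeCache.txt
```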
In this Learning Path, a CPU-first build keeps the setup simpler, reduces external dependencies, and makes the resulting workflow easier to reproduce across edge and embedded Arm systems.
Verify that the llm_demo binary was created:
ls -l ~/mnn/MNN/build/llm_demo
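As an extra sanity check, you can confirm that the binary is a native Arm64 executable. This assumes the `file` utility is installed, which is the case on most desktop and server distributions:

```shell
# Inspect the binary format; the output should include
# "ELF 64-bit LSB" and "ARM aarch64" for a native Armv9 build
file ~/mnn/MNN/build/llm_demo
```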
If your system already has another MNN installation, llm_demo can load a different libMNN.so than the one you just built. This is a common source of runtime errors.
From the build directory, inspect the runtime dependencies:
ldd ./llm_demo | grep -E "libMNN|Express|Audio|OpenCV" || true
You should see libMNN.so and libMNN_Express.so resolving from the ~/mnn/MNN/build tree. A correct result looks similar to:
libMNNAudio.so => /home/radxa/mnn/MNN/build/tools/audio/libMNNAudio.so (0x0000ffffb6d00000)
libMNN_Express.so => /home/radxa/mnn/MNN/build/express/libMNN_Express.so (0x0000ffffb6c00000)
libMNN.so => /home/radxa/mnn/MNN/build/libMNN.so (0x0000ffffb6600000)
libMNNOpenCV.so => /home/radxa/mnn/MNN/build/tools/cv/libMNNOpenCV.so (0x0000ffffb61a0000)
An incorrect result looks like this, where libMNN.so is loaded from another location:
libMNNAudio.so => /home/radxa/mnn/MNN/build/tools/audio/libMNNAudio.so (0x0000ffffb6d00000)
libMNN_Express.so => /home/radxa/mnn/MNN/build/express/libMNN_Express.so (0x0000ffffb6c00000)
libMNN.so => /usr/share/cix/lib/libMNN.so (0x0000ffffb6600000)
libMNNOpenCV.so => /home/radxa/mnn/MNN/build/tools/cv/libMNNOpenCV.so (0x0000ffffb61a0000)
If libMNN.so resolves from a different directory, update LD_LIBRARY_PATH to prefer the libraries from your local build:
export LD_LIBRARY_PATH=$HOME/mnn/MNN/build:$HOME/mnn/MNN/build/express:$HOME/mnn/MNN/build/tools/audio:$HOME/mnn/MNN/build/tools/cv:${LD_LIBRARY_PATH:-}
To make this setting persistent across terminal sessions, add it to your shell profile:
echo 'export LD_LIBRARY_PATH=$HOME/mnn/MNN/build:$HOME/mnn/MNN/build/express:$HOME/mnn/MNN/build/tools/audio:$HOME/mnn/MNN/build/tools/cv:${LD_LIBRARY_PATH:-}' >> ~/.bashrc
source ~/.bashrc
Run the check again:
ldd ./llm_demo | grep -E "libMNN|Express|Audio|OpenCV" || true
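If you prefer a scripted check over reading the ldd output by eye, the following sketch extracts the resolved path of libMNN.so and flags a copy loaded from outside your build tree. The `~/mnn/MNN/build` prefix is the path used throughout this section; adjust it if you built elsewhere:

```shell
# Extract where libMNN.so resolves from and verify it is the local build
LIB=$(ldd ./llm_demo | awk '$1 == "libMNN.so" {print $3}')
case "$LIB" in
  "$HOME"/mnn/MNN/build/*) echo "OK: libMNN.so -> $LIB" ;;
  *) echo "WARNING: libMNN.so resolves from $LIB; fix LD_LIBRARY_PATH" ;;
esac
```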
This Learning Path uses a prebuilt Omni model package that is already prepared for MNN deployment. The full package is approximately 15 GB, so ensure you have sufficient disk space and a stable internet connection before cloning.
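A quick way to check free space before cloning is to inspect the filesystem holding your home directory. Allow roughly 20 GB to cover the ~15 GB package plus Git metadata:

```shell
# Show free space on the filesystem that will hold the model;
# the Avail column should show at least ~20G
df -h ~
```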
Clone the model repository into your workspace:
cd ~/mnn
git clone https://www.modelscope.cn/MNN/Qwen2.5-Omni-7B-MNN.git
cd ~/mnn/Qwen2.5-Omni-7B-MNN
After cloning, install Git LFS and pull the large model files:
sudo apt-get install -y git-lfs
git lfs install
cd ~/mnn/Qwen2.5-Omni-7B-MNN
git lfs pull
The full model weights are approximately 15 GB. Downloading can take a while depending on your network connection.
Verify that the main model files are present and several gigabytes in size:
ls -lh ~/mnn/Qwen2.5-Omni-7B-MNN/llm.mnn ~/mnn/Qwen2.5-Omni-7B-MNN/llm.mnn.weight
If either file is only a few hundred bytes, the LFS download did not complete. Run git lfs pull again to resume it.
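The manual size check above can be scripted. This sketch assumes a 1 MB threshold is enough to distinguish the real multi-gigabyte weights from an LFS pointer stub, which is typically only a couple of hundred bytes:

```shell
# Flag model files that are still LFS pointer stubs rather than
# the downloaded weights
MODEL_DIR="$HOME/mnn/Qwen2.5-Omni-7B-MNN"
for f in llm.mnn llm.mnn.weight; do
  SIZE=$(stat -c %s "$MODEL_DIR/$f" 2>/dev/null || echo 0)
  if [ "$SIZE" -lt 1048576 ]; then
    echo "INCOMPLETE: $f is $SIZE bytes; run 'git lfs pull' again"
  else
    echo "OK: $f ($SIZE bytes)"
  fi
done
```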
Run llm_demo with the model configuration to verify the binary loads correctly:
cd ~/mnn/MNN/build
./llm_demo ~/mnn/Qwen2.5-Omni-7B-MNN/config.json
The binary starts an interactive session. Type exit or press Ctrl+C to quit. If the binary loads without undefined symbol errors or missing library messages, your environment is ready for the next section.
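For a repeatable, non-interactive smoke test, recent MNN builds of llm_demo also accept a prompt file as a second argument; run `./llm_demo` without arguments to confirm the usage string if your build differs:

```shell
# Write a one-line prompt and run llm_demo non-interactively;
# the prompt-file argument is assumed to be supported by your build
printf 'Reply with the single word: ready\n' > /tmp/prompt.txt
./llm_demo ~/mnn/Qwen2.5-Omni-7B-MNN/config.json /tmp/prompt.txt
```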
In this section, you:

- Built MNN natively on your Armv9 system with LLM, audio, and Omni support enabled
- Downloaded the prebuilt Qwen2.5-Omni-7B MNN model package
- Verified that llm_demo can load the model configuration

In the next section, you’ll run a text-only baseline to verify that the core inference path works correctly before adding vision and audio inputs.