In this section, you will cross-compile the audio generation application for Android and run it on an Android device with an Arm CPU.
Start a fresh virtual environment to avoid dependency conflicts from earlier steps.
cd $WORKSPACE/ML-examples/kleidiai-examples/audiogen-et/
python3.10 -m venv android-venv
source android-venv/bin/activate
Since this is a fresh environment, install ExecuTorch:
pip install executorch==1.0.0
Set the EXECUTORCH_MODELS_PATH environment variable to the directory containing your exported ExecuTorch models:
export EXECUTORCH_MODELS_PATH=$WORKSPACE/ML-examples/kleidiai-examples/audiogen-et
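Before building, it is worth confirming that the three exported models are actually present. The snippet below sketches that check; it creates a throwaway directory with empty stand-in files so it runs standalone, but in practice you would point EXECUTORCH_MODELS_PATH at your real export directory and skip the setup lines.

```shell
# Stand-in setup so the check runs standalone; skip these lines
# and use your real EXECUTORCH_MODELS_PATH in practice.
EXECUTORCH_MODELS_PATH=$(mktemp -d)
touch "$EXECUTORCH_MODELS_PATH/dit_model.pte" \
      "$EXECUTORCH_MODELS_PATH/autoencoder_model.pte" \
      "$EXECUTORCH_MODELS_PATH/conditioners_model.pte"

# Verify that each expected model file exists.
missing=0
for f in dit_model.pte autoencoder_model.pte conditioners_model.pte; do
  if [ -f "$EXECUTORCH_MODELS_PATH/$f" ]; then
    echo "found: $f"
  else
    echo "missing: $f" >&2
    missing=1
  fi
done
```

If any file is reported missing, revisit the export step before continuing.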
Navigate to the app directory:
cd app
Download Android NDK r27c for your host platform.
On Linux:
wget https://dl.google.com/android/repository/android-ndk-r27c-linux.zip
unzip android-ndk-r27c-linux.zip
On macOS:
curl https://dl.google.com/android/repository/android-ndk-r27c-darwin.zip -o android-ndk-r27c-darwin.zip
unzip android-ndk-r27c-darwin.zip
Set the NDK_PATH environment variable to the extracted NDK directory:
export NDK_PATH=$(pwd)/android-ndk-r27c
If you extracted the NDK to a different directory, update NDK_PATH accordingly.
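A quick way to confirm NDK_PATH is correct is to check for the CMake toolchain file that the next step depends on. In the sketch below, a throwaway directory stands in for the real NDK so the check runs standalone; in practice, point NDK_PATH at your actual extraction and skip the setup lines.

```shell
# Stand-in setup so the check runs standalone; skip these lines
# and use your real NDK_PATH in practice.
NDK_PATH=$(mktemp -d)/android-ndk-r27c
mkdir -p "$NDK_PATH/build/cmake"
touch "$NDK_PATH/build/cmake/android.toolchain.cmake"

# The CMake configure step below requires this toolchain file.
if [ -f "$NDK_PATH/build/cmake/android.toolchain.cmake" ]; then
  echo "NDK toolchain found"
else
  echo "NDK toolchain missing: check NDK_PATH" >&2
fi
```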
Create a build directory and navigate into it:
mkdir android-build && cd android-build
Run CMake with the Android toolchain configuration:
cmake -DCMAKE_TOOLCHAIN_FILE=$NDK_PATH/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a ..
This command configures the build to use the Android NDK toolchain and target the arm64-v8a architecture.
Build the application:
make -j8
The build process creates an audiogen executable for Android in the android-build directory.
Ensure adb is installed and your Android device is connected by running adb devices.
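A connected device appears with the state device in the adb devices output (a state of unauthorized means you still need to accept the USB debugging prompt on the phone). The sketch below shows how that output can be checked in a script; a hard-coded sample string stands in for the real command output so it runs standalone.

```shell
# Sample output standing in for: adb devices
sample="List of devices attached
emulator-5554	device"

# Count entries in the "device" state, skipping the header line.
count=$(printf '%s\n' "$sample" | awk 'NR>1 && $2=="device" {n++} END {print n+0}')
echo "connected devices: $count"
```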
Use adb to transfer the application and model files to your Android device.
Create a directory on the device:
adb shell mkdir -p /data/local/tmp/app
The /data/local/tmp directory is writable without root access and is commonly used for testing native binaries.
Push the application executable:
adb push audiogen /data/local/tmp/app
Push the three model files:
adb push $EXECUTORCH_MODELS_PATH/dit_model.pte /data/local/tmp/app
adb push $EXECUTORCH_MODELS_PATH/autoencoder_model.pte /data/local/tmp/app
adb push $EXECUTORCH_MODELS_PATH/conditioners_model.pte /data/local/tmp/app
Download the SentencePiece tokenizer model.
On Linux:
wget https://huggingface.co/google-t5/t5-base/resolve/main/spiece.model
On macOS:
curl -L https://huggingface.co/google-t5/t5-base/resolve/main/spiece.model -o spiece.model
Push the tokenizer to the device:
adb push spiece.model /data/local/tmp/app
Connect to your Android device using adb:
adb shell
Navigate to the application directory:
cd /data/local/tmp/app
Run the audiogen application with an example prompt:
./audiogen -m . -p "warm arpeggios on house beats 120BPM with drums effect" -t 4
The arguments are:
-m: Directory containing the models and tokenizer (. for the current directory)
-p: Text description of the desired audio
-t: Number of CPU threads to use (adjust based on your device)
The application generates a short audio sample based on your prompt.
Exit the adb shell by typing exit, then pull the generated audio file from the device:
adb pull /data/local/tmp/app/warm_arpeggios_on_house_beats_120bpm_with_drums_effect_99.wav
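The output filename appears to be derived from the prompt: lowercased, with spaces replaced by underscores, and a numeric suffix (here _99) appended by the application. A sketch of that mapping, assuming this convention holds:

```shell
prompt="warm arpeggios on house beats 120BPM with drums effect"
# Lowercase the prompt and replace spaces with underscores; the _99 suffix
# is assumed from the filename observed in the pull command above.
fname="$(printf '%s' "$prompt" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')_99.wav"
echo "$fname"
```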
Play the audio file on your development machine. Experiment with different prompts to generate various audio samples on your Android device. The application uses ExecuTorch with XNNPACK and Arm KleidiAI optimizations to deliver efficient inference on Arm CPUs.
Try different prompts to explore the model's range. The Stable Audio Open Small model works best with clear, descriptive prompts that include musical elements, tempo, and atmosphere.
You’ve successfully deployed the Stable Audio Open Small model on Android using ExecuTorch. Throughout this Learning Path, you converted the three model submodules (Conditioners, DiT, and AutoEncoder) to ExecuTorch format, built an optimized audio generation application, and ran it on an Arm-based Android device. The application leverages ExecuTorch with XNNPACK and Arm KleidiAI to deliver efficient on-device audio generation, enabling real-time text-to-audio synthesis without requiring cloud connectivity. Integrate this capability into mobile applications or continue exploring audio generation with different prompts and configurations.