Before building the voice assistant, create a project workspace and set up an isolated UV environment. This keeps project dependencies separate from your system installation and makes it easier to reproduce the steps in the rest of the Learning Path.
These instructions support Ubuntu, macOS, and Windows, with Python 3.9 or later and a working microphone.
Install the required system tools first: Whisper depends on ffmpeg for audio decoding, and git and cmake are needed later in this section to build llama.cpp.
# Ubuntu
sudo apt update
sudo apt install -y ffmpeg git cmake
# macOS
brew install ffmpeg git cmake
# Windows (WinGet)
winget install -e --id Gyan.FFmpeg
winget install -e --id Git.Git
winget install -e --id Kitware.CMake
Check your Python version before continuing:
# Ubuntu and macOS
python3 --version
# Windows
py -3 --version
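If you prefer a programmatic check, the short Python snippet below (an optional convenience sketch, not a required step) enforces the same 3.9 minimum and fails with a clear message otherwise:

```python
# Exit with an error if the interpreter is older than Python 3.9.
import sys

MIN_VERSION = (3, 9)
if sys.version_info < MIN_VERSION:
    raise SystemExit(f"Python 3.9+ required, found {sys.version.split()[0]}")
print("Python", sys.version.split()[0], "meets the minimum requirement")
```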
Next, install UV so the uv command is available in your terminal. UV is a fast Python package and environment manager that you’ll use throughout this Learning Path to create the project environment and install dependencies.
# Ubuntu and macOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Start a new shell, or source your shell rc file so `uv` is on PATH.
uv --version
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Open a new PowerShell window so `uv` is on PATH.
uv --version
Create and activate the project virtual environment:
# Ubuntu and macOS
mkdir -p ~/voice-sentiment-assistant
cd ~/voice-sentiment-assistant
uv venv .venv
source .venv/bin/activate
# Windows
mkdir $HOME\voice-sentiment-assistant -Force
cd $HOME\voice-sentiment-assistant
uv venv .venv
.\.venv\Scripts\Activate.ps1
Keep this virtual environment activated while you complete the rest of the Learning Path.
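If you're ever unsure whether the environment is still active, you can check from Python itself: inside an active virtual environment, sys.prefix points at the venv while sys.base_prefix points at the base installation. This is a quick diagnostic sketch, not a required step:

```python
# Inside an active virtual environment, sys.prefix differs from
# sys.base_prefix, which still points at the base Python installation.
import sys

venv_active = sys.prefix != sys.base_prefix
print("virtual environment active:", venv_active)
print("interpreter prefix:", sys.prefix)
```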
Create a requirements.txt file for the packages used across the rest of the Learning Path:
gradio
openai-whisper
requests
torch
transformers
pandas
numpy
librosa
scikit-learn
onnx
onnxscript
onnxruntime
Install the dependencies into your active UV virtual environment:
uv pip install -r requirements.txt
This installs the libraries needed for the Gradio interface, Whisper transcription, model training, and ONNX Runtime inference. Some packages in this list are used later in the Learning Path when you optimize and export the sentiment model.
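As an optional sanity check (my own sketch, not part of the Learning Path's required steps), you can confirm that the key packages resolve in the active environment. Note that the openai-whisper package imports as whisper:

```python
# Report which of the key packages import successfully in this environment.
import importlib

def check_imports(names):
    """Map each module name to True if it imports, False otherwise."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status

results = check_imports(["gradio", "whisper", "torch", "transformers", "onnxruntime"])
for name, ok in results.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```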
Next, clone the llama.cpp GitHub repository, build the local inference server, and start it. This server exposes an OpenAI-compatible API that the Python application will call later in the Learning Path.
If you prefer not to build from source, you can use pre-built binaries from the llama.cpp releases page. Download the package for your platform, extract it, and use the llama-server executable from that package in the run commands later in this section.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
When the build completes, the llama-server executable should be available in the build output directory.
ls ./build/bin/llama-server
The file verification commands in this Learning Path use syntax for Ubuntu and macOS. If you’re on Windows, where the Release build places the executable under build\bin\Release\, adjust the commands to use PowerShell equivalents like Test-Path .\build\bin\Release\llama-server.exe or dir .\build\bin\Release\.
This Learning Path uses a quantized Gemma 3 1B instruction-tuned model served locally through llama.cpp.
The first time you run this command, llama.cpp will download the model from Hugging Face. This can take several minutes depending on your network connection.
Run the following command from the llama.cpp directory:
# Ubuntu and macOS
./build/bin/llama-server -hf ggml-org/gemma-3-1b-it-GGUF
# Windows
.\build\bin\Release\llama-server.exe -hf ggml-org/gemma-3-1b-it-GGUF
Leave this terminal running while you test the application in later steps. The server listens on a local OpenAI-compatible endpoint that your app will call to generate responses.
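From a second terminal, you can smoke-test the endpoint before wiring up the app. The sketch below assumes llama-server's default address of http://localhost:8080 (check the server's startup log if yours differs), uses only the Python standard library, and skips the request entirely when nothing is listening:

```python
# Send one chat completion to the local OpenAI-compatible endpoint,
# but only if something is listening on the assumed default port.
import json
import socket
import urllib.request

HOST, PORT = "localhost", 8080  # assumed llama-server default
payload = {
    "messages": [{"role": "user", "content": "Say hello in one short sentence."}]
}

def server_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if server_up(HOST, PORT):
    req = urllib.request.Request(
        f"http://{HOST}:{PORT}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
    except OSError as exc:
        print("Request failed:", exc)
else:
    print(f"No server at {HOST}:{PORT}; start llama-server first.")
```

If the server is running, this prints the model's reply; otherwise it tells you to start llama-server first.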
In this section, you:
- Created a project workspace with an isolated UV virtual environment
- Installed the Python dependencies used across the Learning Path
- Built llama.cpp and started a local llama-server running the Gemma 3 1B model
Your development environment is now ready with all tools needed for voice transcription, model training, and local LLM inference. In the next section, you’ll build the baseline voice-to-LLM pipeline using Gradio, Whisper, and llama.cpp.