Benchmarking

The Voice Assistant application also provides a benchmark mode so you can measure the performance of an LLM with a configurable number of input and output tokens.

Figure: Welcome Screen

Tap Benchmark to navigate to the benchmark screen.

Figure: Benchmark Screen

Benchmark controls

You can use application controls to enable extra functionality or gather performance data.

| Setting | Default | Description |
| --- | --- | --- |
| Input tokens | 128 | Number of prompt (input) tokens fed to the model before generation starts. |
| Output tokens | 128 | Number of new tokens the model should generate after the prompt. |
| Threads | 4 | Number of CPU threads used for inference. |
| Iterations | 5 | Number of measured benchmark runs. Results are averaged across these runs to give stable measurements. |
| Warmup | 1 | Number of warmup iterations run before measurement starts. They are not counted in the results and absorb one-time overheads. |
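To illustrate how the Warmup and Iterations settings interact, here is a minimal Python sketch of a generic benchmark loop. It is not the Voice Assistant's implementation: `BenchmarkConfig`, `run_benchmark`, `summarize`, and the `workload` callable are hypothetical names introduced for this example, and only the default values are taken from the table above.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class BenchmarkConfig:
    # Defaults mirror the table above; the class itself is hypothetical.
    input_tokens: int = 128
    output_tokens: int = 128
    threads: int = 4
    iterations: int = 5
    warmup: int = 1


def run_benchmark(workload: Callable[[BenchmarkConfig], None],
                  config: BenchmarkConfig) -> List[float]:
    """Run `workload` warmup + iterations times; time only the measured runs."""
    # Warmup runs absorb one-time costs (loading, caches) and are discarded.
    for _ in range(config.warmup):
        workload(config)

    timings: List[float] = []
    for _ in range(config.iterations):
        start = time.perf_counter()
        workload(config)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float], config: BenchmarkConfig) -> Dict[str, float]:
    """Average the measured runs and derive an output-token rate."""
    avg = mean(timings)
    rate = config.output_tokens / avg if avg > 0 else float("inf")
    return {"mean_seconds": avg, "output_tokens_per_second": rate}


if __name__ == "__main__":
    # Stand-in workload; a real run would invoke model inference here.
    def fake_inference(cfg: BenchmarkConfig) -> None:
        time.sleep(0.01)

    cfg = BenchmarkConfig()
    print(summarize(run_benchmark(fake_inference, cfg), cfg))
```

With the defaults, the workload runs six times in total: one discarded warmup run followed by five timed runs whose durations are averaged.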

For a more detailed look at performance, you can build the Voice Assistant modules individually and run their benchmarks on your Android device.
