App profilers provide a good overall view of performance, but you might want to look inside the model itself and identify bottlenecks within the network. The network is often where the bulk of the execution time is spent, so it warrants closer analysis.
This is hard to do with general-purpose profilers, because retrieving that information requires annotations inside the ML framework code. Writing profiling annotations throughout a framework is a complex task, so it is easier to use tools from a framework or inference engine that already includes the required instrumentation.
Your choice of tools depends on the model you use. For example, if you are using LiteRT (formerly TensorFlow Lite), Arm provides the Arm NN delegate, which can run your model on Linux or Android, on either the CPU or the GPU.
Arm NN in turn provides a tool called ExecuteNetwork, which can run the model and report per-layer timings.
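As a rough illustration, the Arm NN delegate can also be loaded as an external delegate from the LiteRT Python API. This is a minimal sketch, not a complete setup: the library name `libarmnnDelegate.so`, the delegate options, and the model path are assumptions that depend on how you built Arm NN and which platform you are targeting.

```python
# Minimal sketch: running a LiteRT model through the Arm NN external delegate.
# The delegate library name, its options, and the model path are assumptions
# for illustration and will vary with your build and target platform.
import tflite_runtime.interpreter as tflite

# Load the Arm NN delegate shared library; "backends" tries the GPU first,
# then falls back to the CPU.
armnn_delegate = tflite.load_delegate(
    "libarmnnDelegate.so",
    options={"backends": "GpuAcc,CpuAcc"},
)

interpreter = tflite.Interpreter(
    model_path="model.tflite",
    experimental_delegates=[armnn_delegate],
)
interpreter.allocate_tensors()
interpreter.invoke()
```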
If you are using PyTorch, you will probably use ExecuTorch, PyTorch's on-device inference runtime, on your Android phone. ExecuTorch has a profiler available alongside it.
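With ExecuTorch, profiling data is typically captured on the device during an instrumented run and then examined on the host. The sketch below assumes the devtools Inspector API and uses hypothetical file names; check the ExecuTorch documentation for the exact workflow in your release.

```python
# Hedged sketch: inspecting ExecuTorch profiling output on the host.
# File names are hypothetical; the ETDump is produced by an instrumented
# on-device run, and the ETRecord is generated when exporting the model.
from executorch.devtools import Inspector

inspector = Inspector(
    etdump_path="etdump.etdp",   # profiling events collected on the device
    etrecord="etrecord.bin",     # export-time metadata linking events to the graph
)

# Print per-operator timing data in tabular form.
inspector.print_data_tabular()
```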