Who is this for?
This is an introductory topic for developers who want to learn how to use KleidiAI to accelerate Generative AI workloads on Arm-based hardware.
What will you learn?
Upon completion of this Learning Path, you will be able to:
- Describe how basic math operations power Large Language Models.
- Describe how the KleidiAI micro-kernels improve Generative AI inference performance.
- Run a basic C++ matrix multiplication example to showcase the speedup that KleidiAI micro-kernels can deliver.
Prerequisites
Before starting, you will need the following:
- An Arm-based Linux machine that implements the Int8 Matrix Multiplication (i8mm) architecture feature. The example in this Learning Path runs on an AWS Graviton3 instance. Instructions for setting up an Arm-based server are found here.
- A basic understanding of linear algebra terminology, such as dot product and matrix multiplication.