Build ONNX Runtime Generate() API

Run Phi-3 on Windows on Arm using ONNX Runtime

Build the ONNX Runtime Generate() API for Windows on Arm

The Generate() API in ONNX Runtime is designed for text generation tasks using models like Phi-3. It implements the generative AI loop for ONNX models, including:

Pre- and post-processing.
Inference with ONNX Runtime (including logits processing).
Search and sampling.
KV cache management.
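
To make these stages concrete, here is a minimal toy sketch of a generative loop in plain Python. It does not use ONNX Runtime or the Generate() API: `toy_model`, `VOCAB`, and every other name below are hypothetical stand-ins chosen for illustration, and the fake logits simply favor the next token ID in the vocabulary.

```python
import math
import random

# Hypothetical toy vocabulary; a real model uses a tokenizer (pre- and post-processing).
VOCAB = ["<eos>", "hello", "world", "arm", "onnx"]
EOS_ID = 0


def toy_model(tokens, kv_cache):
    """Stand-in for one inference step (NOT the real ONNX Runtime API).

    Only the newest token is "processed"; earlier steps live in kv_cache.
    """
    kv_cache.append(tokens[-1])  # KV cache management: keep state from past steps
    logits = [0.0] * len(VOCAB)
    logits[(tokens[-1] + 1) % len(VOCAB)] = 5.0  # fake logits favor the next ID
    return logits


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k, rng):
    """Search and sampling: keep the k best logits, then sample among them."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    return rng.choices(top, weights=probs, k=1)[0]


def generate(prompt_ids, max_new_tokens=8, top_k=1, seed=0):
    rng = random.Random(seed)
    kv_cache = list(prompt_ids[:-1])  # prefill the cache with the prompt prefix
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = toy_model(tokens, kv_cache)
        next_id = sample_top_k(logits, top_k, rng)
        tokens.append(next_id)
        if next_id == EOS_ID:  # stop when the end-of-sequence token appears
            break
    return tokens


def decode(token_ids):
    """Post-processing: map token IDs back to text, dropping <eos>."""
    return " ".join(VOCAB[i] for i in token_ids if i != EOS_ID)
```

With `top_k=1` the sampler reduces to greedy search, so `decode(generate([1]))` walks through the toy vocabulary until it reaches `<eos>`. The real library performs the same stages against an ONNX model rather than a Python function.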

Learning Tip

You can learn more about this area by reading the ONNX Runtime Generate() API documentation.

In this section, you’ll build the Generate() API from source.

Clone the onnxruntime-genai repository

From your Windows Developer Command Prompt for Visual Studio, clone the repository and check out the following tested commit:

cd C:\Users\%USERNAME%
cd repos\lp
git clone https://github.com/microsoft/onnxruntime-genai
cd onnxruntime-genai
git checkout b2e8176c99473afb726d364454dc827d2181cbb2

Note

You might be able to use later commits. These steps have been tested with the commit b2e8176c99473afb726d364454dc827d2181cbb2.
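
If you want to confirm which commit is checked out before building, a small Python helper can compare the repository's `HEAD` against the tested commit. This is only a sketch: the function names are hypothetical, and it assumes `git` is on your `PATH`.

```python
import subprocess

# Commit hash that the steps in this section were tested with.
TESTED_COMMIT = "b2e8176c99473afb726d364454dc827d2181cbb2"


def matches_tested_commit(head_sha, tested=TESTED_COMMIT):
    """Return True if head_sha is the tested commit (full or abbreviated hash)."""
    head = head_sha.strip().lower()
    # Require at least 7 characters so an empty or tiny prefix never matches.
    return len(head) >= 7 and tested.startswith(head)


def current_commit(repo_dir="."):
    """Ask git for the commit currently checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout.strip()


# Example usage, run from inside the onnxruntime-genai directory:
#   print(matches_tested_commit(current_commit()))
```

A mismatch is not necessarily a problem, since later commits might also work, but it tells you that you are outside the tested configuration.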

Build for Windows on Arm

The build script takes a --config argument, which supports the following options:

Release builds release binaries.
Debug builds binaries with debug symbols.
RelWithDebInfo builds release binaries with debug info.

To build the Release variant of the ONNX Runtime Generate() API:

pip install requests
python build.py --config Release --skip_tests
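
If you script the build, for example to switch between configurations, you could assemble the same command line in Python. The sketch below uses only the `--config` and `--skip_tests` arguments shown in this section; the helper name `build_command` is hypothetical and not part of the repository.

```python
import sys

# Configurations listed in this section.
VALID_CONFIGS = ("Release", "Debug", "RelWithDebInfo")


def build_command(config="Release", skip_tests=True, python=sys.executable):
    """Return the build.py invocation as an argument list."""
    if config not in VALID_CONFIGS:
        raise ValueError(
            f"Unknown config {config!r}; expected one of {', '.join(VALID_CONFIGS)}"
        )
    cmd = [python, "build.py", "--config", config]
    if skip_tests:
        cmd.append("--skip_tests")
    return cmd


# Example usage, run from the onnxruntime-genai directory:
#   import subprocess
#   subprocess.run(build_command("RelWithDebInfo"), check=True)
```

Validating the configuration name up front catches typos such as `release` before a long build starts.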

Verify the output

When the build is complete, confirm that the ONNX Runtime Generate() API dynamic-link library (DLL) has been created:

dir build\Windows\Release\Release\onnxruntime-genai.dll
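
The same check can be done from Python, which is convenient in an automated setup script. This is a minimal sketch with hypothetical helper names. It assumes that other configurations follow the same `build\Windows\<config>\<config>` layout as the Release build shown above, which this section does not confirm.

```python
from pathlib import Path

DLL_NAME = "onnxruntime-genai.dll"


def genai_dll_path(repo_root, config="Release"):
    """Expected DLL location; the layout for non-Release configs is an assumption."""
    return Path(repo_root) / "build" / "Windows" / config / config / DLL_NAME


def check_dll(repo_root, config="Release"):
    """Return the DLL size in bytes, or raise FileNotFoundError if it is missing."""
    dll = genai_dll_path(repo_root, config)
    if not dll.is_file():
        raise FileNotFoundError(f"Build output not found: {dll}")
    return dll.stat().st_size


# Example usage, run from the onnxruntime-genai directory:
#   print(check_dll("."))
```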
