There are two ways to run SVE instructions if you don’t have SVE-capable hardware: QEMU and the Arm Instruction Emulator (ArmIE). Each of these is covered below.
The steps shown are for an Armv8-A system with Ubuntu 22.04 and no SVE support.
The example code adds two arrays of 127 double-precision elements.
Use a text editor to copy the code below and save it in a file named sve_add.c:
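As a point of reference, here is a minimal sketch of what the core of such a program might look like. It is not the tutorial's listing: the function name add_arrays and the use of ACLE intrinsics are assumptions for illustration, and the tutorial's own code may instead rely on compiler auto-vectorization of a plain loop. The SVE path is only compiled when the compiler defines __ARM_FEATURE_SVE (for example with -march=armv8-a+sve); otherwise a scalar loop is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __ARM_FEATURE_SVE
#include <arm_sve.h>   /* ACLE SVE intrinsics */
#endif

#define N 127          /* array length used in this tutorial */

/* Illustrative helper (name is an assumption): c[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
#ifdef __ARM_FEATURE_SVE
    /* svcntd() is the number of doubles per vector; it is only known at run time. */
    for (uint64_t i = 0; i < n; i += svcntd()) {
        /* Predicate is true for lanes whose index is still below n. */
        svbool_t pg = svwhilelt_b64_u64(i, (uint64_t)n);
        svfloat64_t va = svld1_f64(pg, &a[i]);
        svfloat64_t vb = svld1_f64(pg, &b[i]);
        svst1_f64(pg, &c[i], svadd_f64_x(pg, va, vb));
    }
#else
    /* Scalar fallback so the sketch builds on any compiler. */
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
#endif
}
```

A main function would fill two arrays of N doubles, call add_arrays(a, b, c, N), and print a message once the result is checked.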
Compile the applications using the commands shown:
Run the application on the Arm Linux host:
An illegal instruction message confirms the host does not support SVE.
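Rather than waiting for an illegal instruction, a program can ask the Linux kernel whether SVE is available. The sketch below is an illustration under assumptions: the function names host_has_sve and sve_advice are invented here, and the check uses getauxval(AT_HWCAP) with the HWCAP_SVE bit, which only applies to AArch64 Linux. On any other platform it simply reports no SVE.

```c
#include <assert.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>            /* getauxval, AT_HWCAP */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)    /* bit value from the arm64 kernel's asm/hwcap.h */
#endif
#endif

/* Returns 1 if the kernel reports SVE support, 0 otherwise. */
int host_has_sve(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return 0;                    /* not AArch64 Linux: treat as no SVE */
#endif
}

/* Hypothetical helper: text a launcher could print before running the binary. */
const char *sve_advice(int has_sve)
{
    assert(has_sve == 0 || has_sve == 1);
    return has_sve ? "SVE available: run natively"
                   : "no SVE: run under an emulator";
}
```

Calling sve_advice(host_has_sve()) on the host described above would be expected to select the emulator message.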
You can run applications containing SVE instructions without SVE-capable hardware using QEMU, a generic, open source machine emulator and virtualizer.
Install qemu-user to run the example on processors that do not support SVE:
Run the example application with a vector length of 256 bits. Note that the vector length is specified in bytes rather than bits:
The application now runs and prints the expected message.
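Because the vector length is given in bytes but usually discussed in bits, it helps to make the conversion explicit and to confirm the length a process actually sees. The sketch below is an assumption-laden illustration: vl_bits_to_bytes and current_sve_vl_bits are names invented here, and the query uses the Linux prctl(PR_SVE_GET_VL) call, which should report the configured length when the program runs under an emulator that implements it and fails on systems without SVE.

```c
#include <assert.h>

#if defined(__linux__)
#include <sys/prctl.h>   /* prctl, PR_SVE_GET_VL (when the headers provide it) */
#endif

/* Convert an SVE vector length from bits to bytes (256 bits -> 32 bytes). */
int vl_bits_to_bytes(int bits)
{
    /* Architectural SVE lengths are multiples of 128 bits, up to 2048. */
    assert(bits >= 128 && bits <= 2048 && bits % 128 == 0);
    return bits / 8;
}

/* Current SVE vector length in bits, or -1 if it cannot be queried. */
int current_sve_vl_bits(void)
{
#if defined(__linux__) && defined(PR_SVE_GET_VL) && defined(PR_SVE_VL_LEN_MASK)
    int ret = prctl(PR_SVE_GET_VL);
    if (ret < 0) {
        return -1;                           /* no SVE, or call not supported */
    }
    return (ret & PR_SVE_VL_LEN_MASK) * 8;   /* low bits hold the length in bytes */
#else
    return -1;
#endif
}
```

For the run above, vl_bits_to_bytes(256) gives the value 32 to pass to the emulator.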
You can also run the application containing SVE instructions using the Arm Instruction Emulator.
Download and install the Arm Instruction Emulator (see installation instructions) on any Armv8-A system. The Arm Instruction Emulator intercepts and emulates unsupported SVE instructions. It also supports plugins for application analysis.
The Arm Instruction Emulator has been deprecated. It is still available for download, but there is no active development.
Now run the application with ArmIE as shown:
ArmIE requires the -msve-vector-bits parameter to specify the SVE vector length.
ArmIE has plugins you can use to analyze your application.
The libinscount_emulated.so plugin reports the number of executed instructions. Run the command below and check the output:
Increasing the vector width from 256 to 512 bits halves the number of emulated SVE instructions, as shown:
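The halving follows from simple arithmetic on the loop trip count. The sketch below is illustrative only (lanes_f64 and vector_iterations are names introduced here): it counts how many vector-wide loop iterations are needed to cover the 127 doubles. It does not model the instructions outside the loop that the plugin also counts.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 64-bit (double) lanes in one SVE vector of vl_bits. */
size_t lanes_f64(unsigned vl_bits)
{
    assert(vl_bits >= 128 && vl_bits % 128 == 0);
    return vl_bits / 64;
}

/* Loop iterations needed to cover n doubles, rounding up for the tail. */
size_t vector_iterations(size_t n, unsigned vl_bits)
{
    size_t lanes = lanes_f64(vl_bits);
    return (n + lanes - 1) / lanes;
}
```

With n = 127, a 256-bit vector holds 4 doubles and needs 32 iterations, while a 512-bit vector holds 8 doubles and needs 16, so each SVE instruction in the loop body executes half as often.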
To get more information on which instructions are executed, use the libopcodes_emulated.so plugin as shown:
Undecoded instructions are stored in a file with the format undecoded.APP.PID.log. To decode them, use the script enc2instr.py, provided with ArmIE.
This script requires llvm-mc and Python 2.7. Install them using the command shown:
This example command processes the results:
Which gives the following output:
This list contains the SVE instructions identified in the previous tutorial, Compile for SVE. In the main loop, each is executed 16 times to add the 127 array elements (16 batches of 512-bit SVE instructions).
A region of interest (RoI) lets you limit the amount of data generated by tracing. Add the following macros as shown in the code snippet below:
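To illustrate only where the markers go, here is a placement sketch. The macro names ROI_START and ROI_STOP and the function add_arrays_roi are hypothetical placeholders, not ArmIE's own macros: here they expand to nothing, and their bodies would need to be replaced with the marker definitions given in the ArmIE documentation.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL placeholders: these do nothing. Substitute the real RoI
 * start/stop marker definitions from the ArmIE documentation. */
#define ROI_START() do { } while (0)
#define ROI_STOP()  do { } while (0)

/* Only the loop between the two markers would fall inside the traced region. */
void add_arrays_roi(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    ROI_START();                    /* tracing would begin here */
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
    ROI_STOP();                     /* tracing would end here */
}
```

Keeping initialization and printing outside the markers is what keeps the trace files small.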
Rebuild the application and add the options -a -roi to ArmIE to filter data for the RoI:
Using libmemtrace_sve_512.so and libinstrace_emulated.so generates two data files, instrace.APP.PID.log and sve-memtrace.APP.PID.log. instrace.APP.PID.log traces all instructions executed, while sve-memtrace.APP.PID.log captures only information about SVE memory accesses.
To filter data of interest, run the following commands:
The output will look like this:
You can identify 16 batches of 512-bit SVE loads and stores. All of them are unpredicated and handle 64 bytes, except the last iteration, which handles 56 bytes to compute element indexes 120 to 126.
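The 64-byte and 56-byte figures can be reproduced with a short calculation. The sketch below is illustrative (batch_bytes is a name introduced here, and it assumes 8-byte doubles): it computes how many bytes each batch touches for a given array length and vector length.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes accessed per array by one batch (loop iteration) of a vector add
 * over n doubles with vl_bits-wide vectors. Returns 0 past the last batch. */
size_t batch_bytes(size_t batch, size_t n, unsigned vl_bits)
{
    assert(vl_bits >= 128 && vl_bits % 128 == 0);
    size_t lanes = vl_bits / 64;            /* doubles per vector */
    size_t first = batch * lanes;           /* index of the batch's first element */
    if (first >= n) {
        return 0;
    }
    size_t remaining = n - first;
    size_t active = remaining < lanes ? remaining : lanes;
    return active * 8;                      /* 8 bytes per double (assumed) */
}
```

For n = 127 and 512-bit vectors, batches 0 to 14 each cover 8 doubles (64 bytes), and batch 15 starts at index 120 with 7 doubles left, which is 56 bytes.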