Below are example commands to compile an application with support for SVE instructions using the GNU Toolchain:
For C, use the following command:
gcc -march=armv8-a+sve myapp.c -o myapp_c.out
For Fortran, use the following command:
gfortran -march=armv8-a+sve myapp.f90 -o myapp_f90.out
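To confirm that a build really had SVE enabled, you can test the ACLE feature macro `__ARM_FEATURE_SVE`, which compilers define when SVE code generation is active. The following is a minimal sketch; the function names `built_with_sve` and `report_sve_build` are hypothetical and chosen for illustration only.

```c
#include <stdio.h>

/* Returns 1 when this translation unit was compiled with SVE enabled
 * (for example with -march=armv8-a+sve), and 0 otherwise.
 * __ARM_FEATURE_SVE is the ACLE feature macro; the function name is
 * hypothetical. */
int built_with_sve(void)
{
#if defined(__ARM_FEATURE_SVE)
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of how the file was built. */
void report_sve_build(void)
{
    printf("SVE code generation: %s\n",
           built_with_sve() ? "enabled" : "disabled");
}
```

Calling `report_sve_build()` from `main` in `myapp.c` gives a quick check that the `-march` option took effect. It reports how the code was compiled, not whether the CPU it runs on supports SVE.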
With GCC, autovectorization is enabled with the -O3 option. To disable autovectorization, use the -fno-tree-vectorize compiler option.
Compare the disassembly of a simple program shown below with and without the use of autovectorization:
Note the use of the double-word registers d0, d1 instead of the SVE registers z0.d and z1 when you disable vectorization.
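A loop of the following kind is a typical candidate for this comparison. This is a minimal sketch, assuming a simple element-wise addition of double-precision arrays; the function name `add_arrays` is hypothetical, and this is not necessarily the program used to produce the disassembly discussed above.

```c
#include <stddef.h>

/* Element-wise addition: c[i] = a[i] + b[i].
 * Each iteration is independent of the others, so the compiler is free
 * to process several elements per instruction when vectorization is on. */
void add_arrays(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

To inspect the generated code, compile the file once with `-O3 -march=armv8-a+sve -S` and once with `-fno-tree-vectorize` added, then compare the two assembly files.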
With GCC, the -fopt-info-vec compiler option reports which loops were vectorized. To report which loops failed to vectorize, use the -fopt-info-vec-missed compiler option.
In this example, the compiler reports the vectorization of the loop at line 3.
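To see both kinds of report, it helps to compile a file that contains one loop with independent iterations and one loop with a loop-carried dependency. The sketch below assumes two hypothetical functions, `scale` and `prefix_sum`; whether a particular compiler version vectorizes either loop is not guaranteed, so treat the comments as expectations rather than results.

```c
#include <stddef.h>

/* Independent iterations: each element is updated on its own,
 * so this loop is a good candidate for vectorization. */
void scale(double *x, double factor, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        x[i] *= factor;
    }
}

/* Loop-carried dependency: iteration i reads the value that
 * iteration i - 1 has just written, which usually prevents a
 * straightforward vectorization of the loop. */
void prefix_sum(double *x, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        x[i] += x[i - 1];
    }
}
```

Compiling this file with `-O3 -march=armv8-a+sve -fopt-info-vec -fopt-info-vec-missed` makes GCC print remarks for both the vectorized and the missed loops.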
The Arm Performance Libraries include generic and target-specific SVE optimizations of common math operations used in HPC. To compile and link your application against these libraries with GCC, use the predefined environment variables ARMPL_INCLUDES and ARMPL_LIBRARIES. These environment variables are set by the Arm Performance Libraries module files.
Refer to the Arm Performance Libraries install guide for more information.
gcc -O3 -march=armv8-a+sve -I $ARMPL_INCLUDES dgemm.c -o dgemm.out -L $ARMPL_LIBRARIES -larmpl
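A file such as dgemm.c might look like the following. This is a minimal sketch, not the source used for the command above: it assumes the CBLAS interface declared in `armpl.h`, and it uses a hypothetical macro `USE_ARMPL` (passed as `-DUSE_ARMPL`) to choose between the library call and a plain reference loop, so that the file also builds on systems without the library. The function name `square_dgemm` is likewise hypothetical.

```c
#include <stddef.h>

#ifdef USE_ARMPL          /* hypothetical macro, set with -DUSE_ARMPL */
#include <armpl.h>        /* assumed to declare the CBLAS interface */
#endif

/* Computes C = A * B for row-major n x n matrices of doubles. */
void square_dgemm(int n, const double *a, const double *b, double *c)
{
#ifdef USE_ARMPL
    /* Standard CBLAS call: C = 1.0 * A * B + 0.0 * C */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, a, n, b, n, 0.0, c, n);
#else
    /* Reference implementation used when the library is unavailable. */
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int k = 0; k < n; ++k) {
                sum += a[(size_t)i * n + k] * b[(size_t)k * n + j];
            }
            c[(size_t)i * n + j] = sum;
        }
    }
#endif
}
```

Keeping the reference loop next to the library call also gives a simple way to cross-check results on small matrices.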
Shown below are example commands to compile an application with support for SVE instructions using Arm Compiler for Linux:
armclang -march=armv8-a+sve myapp.c -o myapp_c.out
armflang -march=armv8-a+sve myapp.f90 -o myapp_f90.out
If you are compiling on an SVE-capable target, you can use the -march=native compiler option. For specific CPUs with SVE support, use the -mcpu option:
CPU | Flag |
---|---|
Neoverse-N2 | -mcpu=neoverse-n2 |
Neoverse-V1 | -mcpu=neoverse-v1 |
With Arm Compiler for Linux, autovectorization is enabled at the -O2 optimization level and above. To disable autovectorization, use -fno-vectorize.
With Arm Compiler for Linux, the options -Rpass=vector and -Rpass=sve-loop-vectorize report which loops were vectorized. To report the loops that failed to vectorize, use -Rpass-missed=vector.
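When a loop over two pointer arguments shows up as missed, one common cause is that the compiler cannot prove the arrays do not overlap. The sketch below assumes a hypothetical `daxpy` function and uses the C99 `restrict` qualifier to state that the two arrays are distinct. Without it, the compiler may need a runtime overlap check or may decline to vectorize the loop.

```c
#include <stddef.h>

/* y[i] = y[i] + alpha * x[i]
 * The restrict qualifiers promise the compiler that x and y never
 * overlap, which removes one obstacle to vectorizing the loop.
 * The caller is responsible for keeping that promise. */
void daxpy(size_t n, double alpha,
           const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
```

Comparing the remarks printed for this function with and without the `restrict` qualifiers shows whether aliasing was the limiting factor.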
To use Arm Performance Libraries with Arm Compiler for Linux, use the -armpl=sve option. This ensures that the SVE version of the library is used. For example:
armclang -O3 -march=armv8-a+sve -armpl=sve dgemm.c -o dgemm.out