Types of Library

C++ libraries generally fall into two major categories, each serving different needs. This section walks you through both the standard library and performance libraries, and outlines the purpose and characteristics of each.

Standard C++ Library

The C++ Standard Library provides a collection of classes, functions, and templates that are defined by the C++ standard and are essential for everyday programming, such as:

  • Data structures.
  • Algorithms.
  • Input/output operations.
  • Utility functions.

Trade-offs between versatility and performance

The C++ Standard Library is designed to be versatile and easy to use, ensuring compatibility and portability across different platforms. This portability comes at a cost, however, and standard libraries have some limitations. Designers of performance-sensitive applications might wish to take advantage of the hardware’s full capabilities, and where they might be unable to do so through standard libraries, they can instead implement performance libraries that can bring these performance optimizations into effect.

Benefits of Performance libraries

Performance libraries are specialized for high-performance computing tasks and are often tailored to the microarchitecture of a specific processor. These libraries are optimized for speed and efficiency, often leveraging hardware-specific features such as vector units to achieve maximum performance.

Crafted through extensive benchmarking and optimization, performance libraries can be domain-specific - such as genomics libraries - or for general-purpose computing. For example, OpenRNG focuses on generating random numbers quickly and efficiently, which is crucial for simulations and scientific computations, whereas the C++ Standard Library offers a more general-purpose approach with functions such as std::mt19937 for random number generation.

Performance libraries for Arm CPUs - such as the Arm Performance Libraries (APL) - provide highly optimized mathematical functions for scientific computing. An analogous library for accelerating routines on a GPU is cuBLAS, which is available for NVIDIA GPUs.

These libraries can be linked dynamically at runtime or statically during compilation, offering flexibility in deployment. They are designed to support multiple versions of the Arm architecture, including those with NEON and SVE. Generally, only minimal source code changes are required to use these libraries, making them ideal for porting and optimizing applications.

How do I choose the right version of a performance library?

Performance libraries are often distributed in multiple formats to support various use cases:

  • ILP64 uses 64 bits for representing integers, which are often used for indexing large arrays in scientific computing. In C++ source code, one uses the long long type to specify 64-bit integers.

  • LP64 uses 32 bits to represent integers which are more common in general-purpose applications.

  • Open Multi-Processing (OpenMP) is a cross-platform programming interface for parallelizing workloads across many CPU cores, such as x86 and AArch64. Programmers interact primarily through compiler directives, such as #pragma omp parallel indicating which section of source code can be run in parallel and which sections require synchronization.

Arm Performance Libraries, in common with their x86 equivalent, Open Math Kernel Library (MKL), provide optimized functions for both ILP64 and LP64, as well as for OpenMP or single-threaded implementations.

Additionally, interface libraries are available as shared libraries for dynamic linking, such as those with a .so file extension, or as static linking, such as those with a .a file extension.

Which performance library should I choose?

A natural source of confusion stems from the plethora of similar performance libraries. For example, OpenBLAS and NVIDIA Performance Libraries (NVPL) each offer their own implementation of basic linear algebra subprograms (BLAS). This raises the question: which one should a developer choose?

Multiple performance libraries exist to meet the diverse needs of different hardware architectures and applications. For instance, Arm performance libraries are optimized for Arm CPUs, leveraging unique instruction sets and power efficiency. Meanwhile, NVIDIA performance libraries for Grace CPUs are tailored to maximize the performance of NVIDIA’s hardware.

Here are some of the different types of performance libraries available:

  • Hardware-specialized - some libraries are designed to be cross-platform, supporting multiple hardware architectures to provide flexibility and broader usability. For example, the OpenBLAS library supports both Arm and x86 architectures, allowing developers to use the same library across different systems.

  • Domain-specific - libraries are often created to handle specific domains or types of computations more efficiently. For instance, libraries like cuDNN are optimized for deep learning tasks, providing specialized functions that significantly speed up neural network training and inference.

  • Commercial - some highly-performant libraries require a license to use. This is more common in domain-specific libraries such as computational chemistry or fluid dynamics.

These factors contribute to the existence of multiple performance libraries, each tailored to meet the specific demands of various hardware and applications.

Invariably, there will be performance differences between each library and the best way to observe them is to use the library within your own application.

For more information on performance benchmarking, see Arm Performance Libraries 24.10 .

What performance libraries are available on Arm?

For a directory of community-produced libraries, see the Software Ecosystem Dashboard for Arm .

Each library might not be available as a binary and you might need to compile it from source. The table below gives examples of libraries that are available on Arm.

Package / LibraryDomain
Minimap2Long-read sequence alignment in genomics
HMMERBioinformatics library for homologous sequences
FFTWOpen-source Fast Fourier Transform Library

See the Software Ecosystem Dashboard for Arm for the most comprehensive and up-to-date list.

Back
Next