A Sobel filter implementation is used as the example application as it is an applicable embedded computer vision workload. The Sobel SIMD OpenCV repo is implemented in three different ways which makes it a great example candidate to show different aspects of porting.
It is implemented in the following ways:
x86_64
intrinsicsThe application builds and runs on an x86_64
machine. The application runs on CPU only (no hardware acceleration).
You will follow the porting methodology and gather information about the application.
version | ||
---|---|---|
Programming language | C++ | |
OS | Ubuntu | 22.04 LTS |
Compiler | GCC | 11.3.0 |
Build tools | CMake | 3.22.1 |
External libraries | OpenCV | 4.5.4 |
This table is the starting point for the porting analysis.
Using the original software and tool versions when porting an application isn’t a requirement, however it is recommended as it will make the porting smoother. By looking at the Sobel filter code and with the questions from the previous section in mind, you can start the porting analysis.
# Enable SIMD instructions for Intel Intrinsics
# https://software.intel.com/sites/landingpage/IntrinsicsGuide/
if(NOT WIN32)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx")
endif()
The flag -mavx
used with GCC is architecture-specific. It is only available on
x86_64
and will prevent the application from compiling. Even though the application won’t compile, no changes to the source code for the non-SIMD and OpenCV implementations are necessary.
In src/main.cpp#L26 :
/* GCC-compatible compiler, targeting x86/x86-64 */
#include <x86intrin.h>
The header file x86intrin.h
isn’t supported on aarch64
.
In src/main.cpp#L253 :
p1 = _mm_loadu_si128((__m128i *)(inputPointer + i * width + j));
p2 = _mm_loadu_si128((__m128i *)(inputPointer + i * width + j + 1));
p3 = _mm_loadu_si128((__m128i *)(inputPointer + i * width + j + 2));
The lines of code above is just a snippet from the function SobelSimd
which has intrinsics prefixed with _mm_
. These aren’t supported on aarch64
and will need to be ported for the application to compile on aarch64
.
The table below summarizes the migration analysis.
version | available on Arm | Comment | |
---|---|---|---|
Ubuntu | 22.04 LTS | Yes | Ubuntu for Arm |
GCC | 11.3.0 | Yes | |
CMake | 3.22.1 | Yes | |
OpenCV | 4.5.4 | Yes | |
Compiler option -mavx | N/A | No | x86-specific |
AVX intrinsics | N/A | No | x86-specific |
You can draw the following conclusions: