You can use a variety of open source tools and Linaro Forge to learn how to debug and optimize parallel applications.

Before you begin

You will need an Arm-based instance from a cloud service provider, or any Arm server or laptop running Linux.

These instructions were tested on Ubuntu 20.04. Other Linux distributions should work with some modifications, such as a different package manager and different package names.

Install the required software

  1. Install Linaro Forge

Follow the instructions in the Linaro Forge install guide.

You can confirm Forge is installed by running:


            ddt --version
  1. Install the required Linux software

Use the Linux package manager to install the required tools:


sudo apt update
sudo apt install -y make mpich \
  python-is-python3 python3-dev python3-numpy python3-scipy python3-mpi4py \
  lsb-release \
  bc \
  build-essential \
  gfortran \
  git \
  openmpi-bin \
  linux-tools-common linux-tools-generic "linux-tools-$(uname -r)" \
  dos2unix
sudo ln -s /usr/bin/f2py3 /usr/bin/f2py
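Before moving on, it can help to confirm that the tools you just installed are on your `PATH`. The sketch below is one possible check, not part of the original instructions: `check_tools` is a hypothetical helper name, and the command names in the usage example (`mpirun`, `perf`, `f2py`, and so on) are assumptions about which executables the packages above provide.

```shell
# check_tools: hypothetical helper that reports missing commands.
# Prints each command that is not found on PATH, one per line,
# and returns non-zero if at least one command is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_tools make gfortran git mpirun perf f2py dos2unix bc` should print nothing if every command was installed. Any name it prints points to a package that may need to be reinstalled.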
  1. Enable profiling

By default, your Linux distribution may restrict which performance events an unprivileged user can collect.

Allow profiling with perf by running:


            sudo sysctl -w kernel.perf_event_paranoid=1
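To confirm that the setting took effect, you can compare the live value with the level used in this guide. The following is a minimal sketch: `paranoid_ok` is a hypothetical helper name, and the file name in the commented persistence example is an assumption. A value set with `sysctl -w` is lost on reboot, which is why the comments show one way to make it permanent.

```shell
# paranoid_ok: hypothetical helper.
# Succeeds when the current level ($1) is at or below the wanted maximum ($2).
paranoid_ok() {
  [ "$1" -le "$2" ]
}

# Example check against the live kernel value (Linux only):
#   paranoid_ok "$(cat /proc/sys/kernel/perf_event_paranoid)" 1 \
#     && echo "perf profiling is allowed for regular users"
#
# To keep the setting across reboots, one option is a sysctl drop-in file.
# The file name 99-perf.conf is an assumption; any name in /etc/sysctl.d works:
#   echo 'kernel.perf_event_paranoid=1' | sudo tee /etc/sysctl.d/99-perf.conf
```

Lower values are less restrictive, so a current value of 1 or below satisfies the check.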
  1. Download the example code

The example code is contained in a package you can download.

Use wget to download the example code, then extract the archive and change into its directory:


tar xf arm_hpc_tools_trial.tar.gz
cd arm_hpc_tools_trial
  1. Change the format of the example files

The example code files use DOS (CRLF) line endings and need to be converted to Unix (LF) line endings.

Use a text editor to create a new script file and copy the contents below into it:



dos2unix ./src/C/mmult.c
dos2unix ./src/F90/mmult.F90
dos2unix ./src/make.def
dos2unix ./src/Py/F90/mmult.F90
dos2unix ./src/Py/C/mmult.c
  1. Run the script to convert the files to Unix format:

            bash ./
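If you want to confirm that the conversion worked, you can search the files for leftover carriage returns. This is a sketch under assumptions rather than part of the original instructions: `has_crlf` is a hypothetical helper name that lists any file still containing a carriage-return character.

```shell
# has_crlf: hypothetical helper.
# Prints the name of each given file that contains at least one
# carriage return (the extra character in DOS line endings).
has_crlf() {
  cr=$(printf '\r')
  for f in "$@"; do
    if grep -q "$cr" "$f"; then
      echo "$f"
    fi
  done
}
```

Running `has_crlf ./src/C/mmult.c ./src/F90/mmult.F90 ./src/make.def ./src/Py/F90/mmult.F90 ./src/Py/C/mmult.c` from the example directory should print nothing once the files contain only Unix line endings.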

You are now ready to start learning about parallel application development.