Launch an Arm-based instance running Ubuntu Linux 20.04.
This Learning Path has been tested on AWS and Oracle platforms.
Install build-essential and the Python 3 package dependencies:
sudo apt-get update
sudo apt-get install -y build-essential
sudo apt-get install -y python3-pip
sudo apt-get install -y git
sudo pip install opencv-python-headless
sudo pip install Cython
sudo pip install pycocotools
sudo pip install pybind11
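Before moving on, it can help to confirm that the Python packages installed above are importable. The following is a minimal sketch rather than part of the original steps. `check_module` is a hypothetical helper, and the import names are the ones these packages normally expose (`cv2` for opencv-python-headless, and `pycocotools` assumed for the COCO tools package).

```shell
# check_module is a hypothetical helper: it succeeds only if python3
# can import the named module.
check_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

# Import names for the packages installed with pip above.
for module in cv2 Cython pycocotools pybind11; do
    if check_module "$module"; then
        echo "ok: $module"
    else
        echo "missing: $module"
    fi
done
```

Any module reported as missing can be reinstalled with pip before you continue.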
You will use the MLPerf Inference benchmark suite from MLCommons to benchmark models for widely used ML use cases such as image classification and object detection. Start by cloning the repository below.
git clone --recurse-submodules https://github.com/mlcommons/inference.git mlperf_inference
Next, build and install the MLPerf Inference benchmark for the image classification and object detection use cases using the steps below.
CFLAGS="-std=c++14" sudo python3 setup.py develop --user
sudo python3 setup.py develop
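The two build commands above are run from different directories inside the cloned repository. The sketch below shows one way to sequence them. It assumes the load generator lives in `loadgen` and the vision benchmark in `vision/classification_and_detection` under the `mlperf_inference` checkout; neither path is stated above, so verify them in your clone. `build_mlperf` is a hypothetical helper name.

```shell
# build_mlperf is a hypothetical helper. The two subdirectory names are
# assumptions about the repository layout; check them in your checkout.
build_mlperf() {
    root="$1"
    loadgen_dir="$root/loadgen"
    vision_dir="$root/vision/classification_and_detection"

    # Stop early if a directory is missing rather than building in the wrong place.
    for dir in "$loadgen_dir" "$vision_dir"; do
        if [ ! -d "$dir" ]; then
            echo "directory not found: $dir" >&2
            return 1
        fi
    done

    # Build and install the load generator, then the benchmark package,
    # using the same commands as the steps above.
    (cd "$loadgen_dir" && CFLAGS="-std=c++14" sudo python3 setup.py develop --user) || return 1
    (cd "$vision_dir" && sudo python3 setup.py develop) || return 1
}
```

To use it, run `build_mlperf mlperf_inference` from the directory where you cloned the repository. The subshells leave your working directory unchanged, so change into the benchmark directory yourself before running the later steps.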
The MLPerf Inference benchmark suite can use different backends such as ONNX Runtime or TensorFlow. You will use TensorFlow as the backend. Install TensorFlow using the commands below.
pip install tensorflow
pip install tensorflow-io
Set two environment variables for the TensorFlow backend.
AWS Graviton3 instances are the first AWS Graviton instances with BF16 support.
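One plausible pair of variables for this step, given the BF16 remark, is the TensorFlow switch that enables oneDNN optimizations and the oneDNN setting that permits BF16 fast math. Both are real settings in those libraries, but treating them as the pair intended here is an assumption.

```shell
# Assumption: these are the two variables this step refers to.

# Enable oneDNN optimizations in TensorFlow.
export TF_ENABLE_ONEDNN_OPTS=1

# Allow oneDNN to use BF16 arithmetic in place of FP32 where the
# hardware supports it.
export ONEDNN_DEFAULT_FPMATH_MODE=BF16

echo "TF_ENABLE_ONEDNN_OPTS=$TF_ENABLE_ONEDNN_OPTS"
echo "ONEDNN_DEFAULT_FPMATH_MODE=$ONEDNN_DEFAULT_FPMATH_MODE"
```

The variables only apply to processes started from the same shell, so set them in the terminal session you use to run the benchmark.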
Next, download the ML model you want to run the benchmark with. In this example, download the ResNet-50 model:
wget -q https://zenodo.org/record/2535873/files/resnet50_v1.pb
You will also need to download a dataset for the ML model you want to benchmark. The imagenet2012 validation dataset is best used with this ML model. You can download the dataset after you register.
For this example, you will generate a fake image dataset using the tooling included in the repo. Use the command below:
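One way to run the generator is sketched below. It assumes the tooling is the `tools/make_fake_imagenet.sh` script inside the benchmark directory and that you run it from that directory; the path is an assumption to verify against your checkout. `make_fake_dataset` is a hypothetical wrapper.

```shell
# make_fake_dataset is a hypothetical wrapper. The script path is an
# assumption about the repository layout.
make_fake_dataset() {
    script="tools/make_fake_imagenet.sh"
    if [ ! -f "$script" ]; then
        echo "not found: $script (run this from the benchmark directory)" >&2
        return 1
    fi
    bash "$script"
}
```

Call `make_fake_dataset` once; if the script is not where this sketch expects it, the function reports the problem and returns a non-zero status instead of continuing.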
Finally, before you run the benchmark, you will need to set up the environment variables below to point to the location of the ML model and dataset.
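As a sketch of this step, assume that the benchmark scripts read the two locations from `MODEL_DIR` and `DATA_DIR`, that the model file was downloaded into the current directory, and that the fake dataset was written to a `fake_imagenet` subfolder. All three are assumptions, so adjust the values to your layout.

```shell
# Assumptions: the variable names MODEL_DIR and DATA_DIR, the model file
# in the current directory, and the dataset in ./fake_imagenet.
export MODEL_DIR="$(pwd)"
export DATA_DIR="$(pwd)/fake_imagenet"

echo "MODEL_DIR=$MODEL_DIR"
echo "DATA_DIR=$DATA_DIR"
```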
You can now launch the benchmark on your Arm machine, using the command below.
./run_local.sh tf resnet50 cpu
This command runs the benchmark with the “tf” TensorFlow backend on the “resnet50” ML model with the device set to “cpu”.
The minimal arguments that you need to pass to the benchmark are shown below:
./run_local.sh backend model device
backend is one of [tf|onnxruntime|pytorch|tflite]
model is one of [resnet50|mobilenet|ssd-mobilenet|ssd-resnet34]
device is one of [cpu|gpu]
For all other options, run help as shown below:
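The three argument lists above can be checked before a long benchmark run starts. The sketch below uses two hypothetical helpers: `validate_args` mirrors the lists, and `show_options` assumes the script accepts a `--help` flag for the remaining options. The check covers the names only and does not confirm that a given backend supports a given model.

```shell
# validate_args is a hypothetical helper that mirrors the three lists above.
validate_args() {
    case "$1" in
        tf|onnxruntime|pytorch|tflite) ;;
        *) echo "unknown backend: $1" >&2; return 1 ;;
    esac
    case "$2" in
        resnet50|mobilenet|ssd-mobilenet|ssd-resnet34) ;;
        *) echo "unknown model: $2" >&2; return 1 ;;
    esac
    case "$3" in
        cpu|gpu) ;;
        *) echo "unknown device: $3" >&2; return 1 ;;
    esac
}

# Launch only when the arguments are valid and the script is present.
run_benchmark() {
    validate_args "$1" "$2" "$3" || return 1
    if [ ! -f ./run_local.sh ]; then
        echo "run_local.sh not found in $(pwd)" >&2
        return 1
    fi
    ./run_local.sh "$1" "$2" "$3"
}

# Assumption: the script accepts --help to list the remaining options.
show_options() {
    [ -f ./run_local.sh ] && ./run_local.sh --help
}
```

For example, `run_benchmark tf resnet50 cpu` is equivalent to the launch command above, while `run_benchmark tf resnet50 tpu` stops with an error before anything runs.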
At the end of the benchmark run, the aggregated ML performance results are printed on the console. For example, using the command above, the output will be similar to:
TestScenario.SingleStream qps=13.88, mean=0.0719, time=600.153, queries=8333, tiles=50.0:0.0718,80.0:0.0731,90.0:0.0738,95.0:0.0743,99.0:0.0755,99.9:0.0771
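The summary line packs several values into one string, so a small parser makes it easier to compare runs. The sketch below extracts fields from the sample line above; `get_field` and `get_tile` are hypothetical helpers. The latency values appear to be in seconds, since a mean of 0.0719 corresponds to roughly 13.9 queries per second, which is close to the reported qps.

```shell
line='TestScenario.SingleStream qps=13.88, mean=0.0719, time=600.153, queries=8333, tiles=50.0:0.0718,80.0:0.0731,90.0:0.0738,95.0:0.0743,99.0:0.0755,99.9:0.0771'

# get_field prints the value of a key=value field such as qps or mean.
get_field() {
    printf '%s\n' "$1" | sed -n "s/.*[ ,]$2=\([0-9.]*\).*/\1/p"
}

# get_tile prints the latency recorded for a percentile such as 99.0.
get_tile() {
    printf '%s\n' "$1" | sed -n "s/.*[=,]$2:\([0-9.]*\).*/\1/p"
}

qps=$(get_field "$line" qps)
queries=$(get_field "$line" queries)
seconds=$(get_field "$line" time)
p99=$(get_tile "$line" 99.0)

echo "qps=$qps p99=$p99"

# Cross-check: queries divided by run time should be close to the reported qps.
awk -v q="$queries" -v t="$seconds" 'BEGIN { printf "queries/time=%.2f\n", q / t }'
```

For the sample line, 8333 queries over 600.153 seconds works out to about 13.88 queries per second, which agrees with the reported figure.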
Detailed results with breakdowns are available in the output/tf-cpu/resnet50 folder. The folder name depends on the arguments passed to the benchmark script.
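As an illustration of how the folder name follows from the arguments, the sketch below builds the path from the backend, device, and model. The `backend-device/model` pattern is inferred from the single example above and may differ between versions of the suite; `output_dir` is a hypothetical helper.

```shell
# output_dir is a hypothetical helper. The naming pattern is inferred
# from the example output/tf-cpu/resnet50 and is an assumption.
output_dir() {
    printf 'output/%s-%s/%s\n' "$1" "$2" "$3"
}

dir=$(output_dir tf cpu resnet50)
echo "$dir"

# List the result files if the benchmark has already been run here.
if [ -d "$dir" ]; then
    ls "$dir"
fi
```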