The cheat sheet for the wperf command line tool focuses on counting and sampling commands. It covers wperf stat for counting occurrences of specific PMU events, and wperf sample and wperf record for sampling PMU events. Each command is explained with a practical example.
Count events inst_spec, vfp_spec, ase_spec and ld_spec on core #0 for 3 seconds:
wperf stat -e inst_spec,vfp_spec,ase_spec,ld_spec -c 0 --timeout 3
Count metric imix (metric events will be grouped) and additional event l1i_cache on core #7 for 10.5 seconds:
wperf stat -m imix -e l1i_cache -c 7 --timeout 10.5
Count metric imix in timeline mode 3 times on core #1 with 2-second intervals (delays between counts). Each count will last 5 seconds:
wperf stat -m imix -c 1 -t -i 2 -n 3 --timeout 5
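The three counting commands above share one shape: an optional metric, an optional event list, a core, optional timeline settings, and a timeout. As a minimal sketch, the Python helper below assembles that shape into an argument list. The function name wperf_stat_command and its parameters are hypothetical and not part of wperf; the helper only uses the flags that appear in this cheat sheet and does not run anything.

```python
def wperf_stat_command(core, timeout, events=(), metric=None,
                       timeline=False, interval=None, count=None):
    """Build a `wperf stat` argument list (hypothetical helper, not part of wperf).

    Only flags shown in this cheat sheet are used: -m, -e, -c, -t, -i, -n, --timeout.
    """
    cmd = ["wperf", "stat"]
    if metric is not None:
        cmd += ["-m", metric]               # metric name, e.g. imix
    if events:
        cmd += ["-e", ",".join(events)]     # comma-separated event list
    cmd += ["-c", str(core)]                # core to count on
    if timeline:
        cmd.append("-t")                    # timeline mode
        if interval is not None:
            cmd += ["-i", str(interval)]    # delay between counts, in seconds
        if count is not None:
            cmd += ["-n", str(count)]       # number of counts
    cmd += ["--timeout", str(timeout)]      # duration of each count, in seconds
    return cmd


basic = wperf_stat_command(core=0, timeout=3,
                           events=["inst_spec", "vfp_spec", "ase_spec", "ld_spec"])
mixed = wperf_stat_command(core=7, timeout=10.5, events=["l1i_cache"], metric="imix")
timeline = wperf_stat_command(core=1, timeout=5, metric="imix",
                              timeline=True, interval=2, count=3)

for cmd in (basic, mixed, timeline):
    print(" ".join(cmd))
```

On a machine where wperf is installed, a list built this way could be passed to subprocess.run; that step is left out here so the sketch runs anywhere.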
Launch python_d.exe -c 10**10**100 pinned to core no. 1 and sample the given image name:
start /affinity 2 python_d.exe -c 10**10**100
wperf sample -e ld_spec:100000 -c 1 --pe_file python_d.exe --image_name python_d.exe
The same workflow can be wrapped with the wperf record command, as in the example below:
Launch the python_d.exe -c 10**10**100 process and start sampling event ld_spec with frequency 100000 on core no. 1 for 30 seconds:
wperf record -e ld_spec:100000 -c 1 --timeout 30 -- python_d.exe -c 10**10**100
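In the wperf record form, everything before the double dash configures wperf and everything after it is the workload command line. The sketch below makes that split explicit. The helper name wperf_record_command and its parameters are hypothetical, chosen for illustration only; the flags are the ones shown in the command above.

```python
def wperf_record_command(event, frequency, core, workload, timeout=None):
    """Build a `wperf record` argument list (hypothetical helper, not part of wperf)."""
    # wperf options: event:frequency pair and the core to sample on
    cmd = ["wperf", "record", "-e", f"{event}:{frequency}", "-c", str(core)]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    # "--" separates wperf options from the process to launch
    return cmd + ["--"] + list(workload)


record = wperf_record_command("ld_spec", 100000, 1,
                              ["python_d.exe", "-c", "10**10**100"],
                              timeout=30)
print(" ".join(record))
```

Keeping the workload as its own list avoids quoting problems when its arguments contain characters such as `*`.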
Add --annotate or --disassemble to the wperf record command line parameters to increase sampling “resolution”.
Use the optional Arm SPE extension to sample process python_d.exe on core no. 1. The SPE filter load_filter / ld enables collection of sampled load operations, including atomic operations that return a value to a register.
Note: The double-dash operator -- can be used with SPE as well to launch the process.
wperf record -e arm_spe_0/ld=1/ -c 1 -- python_d.exe -c 10**10**100
The above command can be replaced by the two commands below:
start /affinity 2 python_d.exe -c 10**10**100
wperf sample -e arm_spe_0/ld=1/ -c 1 --pe_file python_d.exe --image_name python_d.exe
Add --annotate or --disassemble to the wperf record command line parameters to increase sampling “resolution”.
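The start /affinity 2 commands above pin the workload to core no. 1 because the Windows start command takes a hexadecimal bit mask in which bit N selects core N. As a small sketch of that arithmetic, the helpers below (affinity_mask and start_pinned are hypothetical names, not part of wperf or Windows) compute the mask and print the matching start command.

```python
def affinity_mask(cores):
    """Return the hexadecimal affinity mask string for a collection of core indices."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core          # bit N of the mask selects core N
    return format(mask, "X")       # start /affinity expects a hexadecimal number


def start_pinned(core, workload):
    """Build the cmd.exe `start /affinity` line that pins a workload to one core."""
    return ["start", "/affinity", affinity_mask([core])] + list(workload)


print(affinity_mask([1]))          # core no. 1 sets bit 1
print(" ".join(start_pinned(1, ["python_d.exe", "-c", "10**10**100"])))
```

Because start is a cmd.exe built-in, the generated line is meant to be pasted into a command prompt rather than executed directly from Python.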