Before you begin

Mali Offline Compiler is a command-line tool that you can use to compile shaders and kernels from OpenGL ES and Vulkan. The tool generates a performance report for the GPU of interest.

To test that Mali Offline Compiler is installed correctly, run:

    

        
        
malioc --help

    

The --help option returns usage instructions and the full list of available options for the malioc command.
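If you want a scripted check, for example in a build or CI step, the following is a minimal sketch. The function name `check_malioc` is hypothetical and not part of Mali Offline Compiler; it only tests whether a `malioc` executable is reachable on your PATH and does not validate the installation any further.

```shell
# Hypothetical helper (not part of Mali Offline Compiler):
# report whether a malioc executable is reachable on PATH.
check_malioc() {
    if command -v malioc >/dev/null 2>&1; then
        echo "malioc found: $(command -v malioc)"
    else
        echo "malioc not found on PATH"
        return 1
    fi
}
```

Call `check_malioc` before running any compile step. A non-zero return status means the tool could not be found.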

Note

On macOS, Mali Offline Compiler might not be recognized as an application from an identified developer. To enable Mali Offline Compiler, open System Preferences > Security & Privacy, and select Allow Anyway for the malioc item.

To see the full list of supported GPUs, use:

    

        
        
malioc --list

    

For more information about API support for a given GPU, use:

    

        
        
malioc --info --core <GPU_name>

    

Compile your shader

You can compile OpenGL ES (--opengles) and Vulkan (--vulkan) shader programs. On Linux hosts, you can also compile OpenCL (--opencl) C kernels.

Mali Offline Compiler generates a performance report.

If your frame analysis points to shader cost, compile one of your shaders. You can also use this sample shader to learn how to read the report.

The following example (OpenGL ES) shader is provided in Compile your shader in Arm documentation:

    

        
        
#version 310 es
#define WINDOW_SIZE 5

precision highp float;
precision highp sampler2D;

uniform bool toneMap;
uniform sampler2D texUnit;
uniform mat4 colorModulation;
uniform float gaussOffsets[WINDOW_SIZE];
uniform float gaussWeights[WINDOW_SIZE];

in vec2 texCoord;
out vec4 fragColor;

void main() {
	fragColor = vec4(0.0);
	for (int i = 0; i < WINDOW_SIZE; i++) {
		vec2 offsetTexCoord = texCoord + vec2(gaussOffsets[i], 0.0);
		vec4 data = texture(texUnit, offsetTexCoord);
		if (toneMap) data *= colorModulation;
		fragColor += data * gaussWeights[i];
	}
}

    

Compile the shader for Mali-G76 with:

    

        
        
malioc --core Mali-G76 shader.frag

    
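To compare the same shader across several GPUs, you could wrap the command in a small loop. This is a sketch under assumptions: `compile_for_cores` is a hypothetical helper name, and Mali-G77 in the usage line is only an example second target. Check the names that your installation accepts with `malioc --list`.

```shell
# Hypothetical helper: compile one shader for several GPUs in turn.
# Usage: compile_for_cores <shader_file> <GPU_name>...
compile_for_cores() {
    shader=$1
    shift
    for core in "$@"; do
        echo "== $core =="
        # Same invocation as the single-GPU command, with the core name substituted.
        malioc --core "$core" "$shader" || echo "malioc failed for $core" >&2
    done
}
```

For example, `compile_for_cores shader.frag Mali-G76 Mali-G77` prints one report per GPU, each preceded by a header line naming the core.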

To view the full list of available options, run:

    

        
        
malioc --help

    

For more information, see Compiling OpenGL ES shaders and Compiling Vulkan shaders in the Mali Offline Compiler User Guide.

Interpret the report

The report provides an approximate cycle cost breakdown for the major functional units in the design. Use this information to optimize your shader.

For example, compiling the unoptimized implementation for Mali-G76 reports the following cycle information:

    

        
                                A      LS       V       T    Bound
Total instruction cycles:    4.53    0.00    0.25    2.50        A
Shortest path cycles:        1.00    0.00    0.25    2.50        T
Longest path cycles:         4.53    0.00    0.25    2.50        A
A = Arithmetic, LS = Load/Store, V = Varying, T = Texture

        
    
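The last column of each row names the functional unit that bounds that path. As a sketch of how you might pull that value out of a saved report, the hypothetical `bound_unit` helper below uses awk to find a row by its label and print its final field. It assumes the plain-text row layout shown in this excerpt. The full report contains more sections, and its layout might differ between tool versions.

```shell
# Hypothetical helper: print the "Bound" column for the row whose text
# starts with the given label (leading whitespace is ignored).
# Usage: bound_unit "<row label>" < report.txt
bound_unit() {
    awk -v label="$1" '{
        line = $0
        sub(/^[ \t]+/, "", line)              # ignore any indentation
        if (index(line, label) == 1) print $NF # last field is the bound unit
    }'
}

# Cycle rows copied from the unoptimized Mali-G76 report in this section.
report='Total instruction cycles:    4.53    0.00    0.25    2.50        A
Shortest path cycles:        1.00    0.00    0.25    2.50        T
Longest path cycles:         4.53    0.00    0.25    2.50        A'

printf '%s\n' "$report" | bound_unit "Longest path cycles:"   # prints "A"
```

Here the longest path is bound by the Arithmetic unit, so arithmetic work is the first place to look for savings.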

An example optimization is described in Optimize your shader in Arm documentation:

    

        
        
#version 310 es
#define WINDOW_SIZE 5

// Lower precision to fp16
precision mediump float;
precision mediump sampler2D;

uniform bool toneMap;
uniform sampler2D texUnit;
uniform mat4 colorModulation;
uniform float gaussOffsets[WINDOW_SIZE];
uniform float gaussWeights[WINDOW_SIZE];

in vec2 texCoord;
out vec4 fragColor;

void main() {
	fragColor = vec4(0.0);
	for (int i = 0; i < WINDOW_SIZE; i++) {
		vec2 offsetTexCoord = texCoord + vec2(gaussOffsets[i], 0.0);
		vec4 data = texture(texUnit, offsetTexCoord);
		fragColor += data * gaussWeights[i];
	}
	// Tone map final color
	if (toneMap) fragColor *= colorModulation;
}

    

Compiling the optimized implementation reports:

    

        
                                A      LS       V       T    Bound
Total instruction cycles:    0.96    0.00    0.25    2.50        T
Shortest path cycles:        0.54    0.00    0.25    2.50        T
Longest path cycles:         0.96    0.00    0.25    2.50        T
A = Arithmetic, LS = Load/Store, V = Varying, T = Texture

        
    

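Observe that the total Arithmetic cycle count drops from 4.53 to 0.96. The bound unit also changes from Arithmetic (A) to Texture (T), so the optimized shader is now limited by texture cycles (2.50) rather than by arithmetic.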
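To put a number on the improvement, you can compute the relative reduction from the two Arithmetic totals. The sketch below uses a hypothetical helper named `reduction_pct`; it only does the arithmetic and does not read malioc output.

```shell
# Hypothetical helper: percentage reduction from a "before" to an "after" cycle count.
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Total Arithmetic cycles: unoptimized (4.53) versus optimized (0.96).
reduction_pct 4.53 0.96   # prints 78.8
```

That is roughly a 79% reduction in Arithmetic cycles. Because both versions report 2.50 Texture cycles, the overall gain is capped by the texture unit.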

To learn more about interpreting Mali Offline Compiler reports, see the Arm GPU Training - Episode 3.5: Mali Offline Compiler video tutorial.

What you’ve accomplished

You’ve used Mali Offline Compiler to analyze shader performance on a Mali-based GPU of interest.

You can use the components and workflows described in this Learning Path to profile your applications and analyze performance using Arm Performance Studio.

You can also explore the following supporting tools:
