Mali Offline Compiler is a command-line tool that you can use to compile all shaders and kernels from OpenGL ES and Vulkan, and generate a performance report for the GPU of interest.

Supported GPUs

To see the full list of supported GPUs use:

    

        
        
            malioc --list
        
    

To get information on API support for a given GPU, use:

    

        
        
            malioc --info --core <GPU_name>
        
    

Compile your shader

You can compile OpenGL ES (--opengles) and Vulkan (--vulkan) shader programs, as well as Open GL (--opengl <version>) C kernels (Linux host only).

A performance report will be generated.

An example (OpenGL ES) shader is provided in the documentation :

    

        
        
            #version 310 es
#define WINDOW_SIZE 5

precision highp float;
precision highp sampler2D;

uniform bool toneMap;
uniform sampler2D texUnit;
uniform mat4 colorModulation;
uniform float gaussOffsets[WINDOW_SIZE];
uniform float gaussWeights[WINDOW_SIZE];

in vec2 texCoord;
out vec4 fragColor;

void main() {
	fragColor = vec4(0.0);
	for (int i = 0; i < WINDOW_SIZE; i++) {
		vec2 offsetTexCoord = texCoord + vec2(gaussOffsets[i], 0.0);
		vec4 data = texture(texUnit, offsetTexCoord);
		if (toneMap) data *= colorModulation;
		fragColor += data * gaussWeights[i];
    }
}
        
    

Compile the shader for Mali-G76 with:

    

        
        
             malioc --core Mali-G76 shader.frag
        
    

The full list of available options can be seen with:

    

        
        
            malioc --help
        
    

For more information, refer to Compiling OpenGL ES shaders and Compiling Vulkan shaders in the Mali Offline Compiler User Guide.

Analyze the report

The report will provide an approximate cycle cost breakdown for the major functional units in the design. Use this information to optimize your shader.

For example, compiling the unoptimized implementation for Mali-G76 reports the following cycle information:

    

        
                                        A      LS       V       T    Bound
Total instruction cycles:    4.53    0.00    0.25    2.50        A
Shortest path cycles:        1.00    0.00    0.25    2.50        T
Longest path cycles:         4.53    0.00    0.25    2.50        A
A = Arithmetic, LS = Load/Store, V = Varying, T = Texture

        
    

An example optimization is explained in the documentation .

    

        
        
            #version 310 es
#define WINDOW_SIZE 5

// Lower precision to fp16
precision mediump float;
precision mediump sampler2D;

uniform bool toneMap;
uniform sampler2D texUnit;
uniform mat4 colorModulation;
uniform float gaussOffsets[WINDOW_SIZE];
uniform float gaussWeights[WINDOW_SIZE];

in vec2 texCoord;
out vec4 fragColor;

void main() {
	fragColor = vec4(0.0);
	for (int i = 0; i < WINDOW_SIZE; i++) {
		vec2 offsetTexCoord = texCoord + vec2(gaussOffsets[i], 0.0);
		vec4 data = texture(texUnit, offsetTexCoord);
		fragColor += data * gaussWeights[i];
    }
    // Tone map final color
	if (toneMap) fragColor *= colorModulation;
}
        
    

Compiling the optimized implementation reports:

    

        
                                        A      LS       V       T    Bound
Total instruction cycles:    0.96    0.00    0.25    2.50        T
Shortest path cycles:        0.54    0.00    0.25    2.50        T
Longest path cycles:         0.96    0.00    0.25    2.50        T
A = Arithmetic, LS = Load/Store, V = Varying, T = Texture

        
    

Observe that the number of Arithmetic cycles has been significantly reduced.

Understanding the output of the report is key to the usefulness of the Mali Offline Compiler. This brief video tutorial is an excellent starter.

Back
Next