Compiler Optimisations

Page Contents

References

  1. Software optimization resources, Agner Fog

Optimisations

Function Inlining

Constant Folding And Propogation

Pointer elimination

Common subexpression elimination

Register variables

Live range analysis

Join identical branches

Eliminate jumps

Loop unrolling

Loop invariant code motion

Induction variables

## See https://godbolt.org/z/0Ys5iX
#include <stddef.h>
void f(void)
{
    extern void dosomething(int *v);
    extern int a[300];

    for (size_t i = 0; i < 100; i++)
    {
        dosomething(&a[i*3]);
    }
}
# Optimisation level -O3
f:
        stp     x29, x30, [sp, -32]!
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        adrp    x19, a
        add     x19, x19, :lo12:a
        add     x20, x19, 1200
.L2:
        mov     x0, x19
        add     x19, x19, 12
        bl      dosomething
        cmp     x20, x19
        bne     .L2
        ldp     x19, x20, [sp, 16]
        ldp     x29, x30, [sp], 32
        ret
# No optimisation
f:
        stp     x29, x30, [sp, -32]!
        mov     x29, sp
        str     xzr, [sp, 24]
        b       .L2
.L3:
        ldr     x0, [sp, 24]
        lsl     x0, x0, 2
        lsl     x1, x0, 2
        adrp    x0, a
        add     x0, x0, :lo12:a
        add     x0, x1, x0
        bl      dosomething
        ldr     x0, [sp, 24]
        add     x0, x0, 1
        str     x0, [sp, 24]
.L2:
        ldr     x0, [sp, 24]
        cmp     x0, 99
        bls     .L3
        nop
        ldp     x29, x30, [sp], 32
        ret

How cool - the compiler has re-written our code! It seems it was pretty easy for the compiler to do this as it does this even with optimisation level 1. With no optimisation we can still see some compiler made modifications to our code:

Scheduling

Algebraic reductions

Optimisation Killers

Cannot optimize across modules

Pointer aliasing

Dynamic memory allocation

Pure functions

Virtual functions and function pointers

Algebraic reduction

Floating point induction variables

Inlined functions have a non-inlined copy

Dependency Chains and Out Of Order Execution