Can the compiler hoist constant structure out of the loop?

WalterMadelim · January 25, 2026, 2:00pm

In practice I may want to write a function like this (only r.x matters here; the other parts are less relevant):

function f!(r, I)
    for i=I
        r.x[i] = rand(Int)
    end
end

Assume that r.x is never reassigned while the loop runs. A foolproof way to write this would then be:

function g!(r, I)
    x = r.x # introduce an additional local name (I write `x` here)
    for i=I
        x[i] = rand(Int)
    end
end

But g! is longer. I wonder whether Julia's compiler can perform the transformation to g! under the hood, so that I can keep writing f! at the front end. Can I?

Test

Here is a simple test. In practice, though, r.x may also be a field access on a NamedTuple r.

function test(N)
    r = Ref(rand(Int, N))
    I = Base.OneTo(N)
    @time g!(r, I)
    @time f!(r, I)
    @time g!(r, I)
    @time f!(r, I)
end
test(9999999)

Mason · January 25, 2026, 2:26pm

The pedantic answer is “It depends on what r and I are.” Julia functions have unconstrained polymorphism, so without knowing what r and I are, this function could do literally anything.

That said, a reasonably safe assumption here would be that r is some form of possibly mutable struct with a field x holding an array, that it uses the generic methods for getproperty and setindex!, and that I is some well-behaved iterable of integer indices. In this case the answer is "maybe": this is the sort of thing Julia's compiler is often good at hoisting out of loops, but there are often some catches.

However, you may run into trouble when r is a mutable type, because then it’s really up to the compiler to decide if it is legal to hoist the pointer loading out of the loop. Your benchmark would appear to suggest that no, in this case the compiler decides not to perform the hoisting. You can check the code_llvm output of f! and g! to see for yourself what exactly it decides to do.
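A minimal sketch of that check, assuming the argument types from the test above (a `Ref` wrapping a `Vector{Int}` and a `Base.OneTo{Int}` range). The variable names `argtypes`, `ir_f` and `ir_g` are introduced here for illustration only, and what the IR shows may differ between Julia versions:

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

function f!(r, I)
    for i in I
        r.x[i] = rand(Int)   # field access written inside the loop
    end
end

function g!(r, I)
    x = r.x                  # field access hoisted by hand
    for i in I
        x[i] = rand(Int)
    end
end

# Assumed argument types: Ref(rand(Int, N)) is a Base.RefValue{Vector{Int}}.
argtypes = Tuple{Base.RefValue{Vector{Int}}, Base.OneTo{Int}}

# Capture the optimized LLVM IR as strings so the two can be compared.
ir_f = sprint(io -> code_llvm(io, f!, argtypes; debuginfo=:none))
ir_g = sprint(io -> code_llvm(io, g!, argtypes; debuginfo=:none))

println(ir_f)
println(ir_g)
```

The thing to look for is whether the load of the field x sits inside the loop body of f! or before it, as it does in g!. In the REPL, `@code_llvm debuginfo=:none f!(r, I)` prints the same IR for concrete arguments.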

Benny · January 25, 2026, 8:37pm

Can we, or rather can the compiler? What’s stopping another thread with a reference to the same object assigned to r from reassigning the x property?

WalterMadelim · January 26, 2026, 12:55am

Yes, if the field x is mutable, then the compiler cannot hoist it.

But I found that I still appear to get a performance gain from hoisting manually even when the field x is immutable, e.g. r = (x = [1,2],). So I think that in practice I do have to bother with manual hoisting.
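A rough way to repeat that comparison for the NamedTuple case, as a sketch only: the size `N` and the timing variables `t_f` and `t_g` are assumptions made for illustration, and a single `@elapsed` reading is noisy, so any difference should be confirmed over several runs.

```julia
function f!(r, I)
    for i in I
        r.x[i] = rand(Int)
    end
end

function g!(r, I)
    x = r.x
    for i in I
        x[i] = rand(Int)
    end
end

N = 1_000_000
r = (x = zeros(Int, N),)   # immutable NamedTuple holding a mutable Vector
I = Base.OneTo(N)

# Warm up both functions so that compilation is not included in the timings.
f!(r, I)
g!(r, I)

t_f = @elapsed f!(r, I)
t_g = @elapsed g!(r, I)
println("f!: ", t_f, " s   g!: ", t_g, " s")
```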

WalterMadelim · January 26, 2026, 1:41am

I think a more appropriate API design is

_f!(r, I) = for i=I
    r[i] = rand(Int)
end;
f!(r, I) = _f!(r.x, I)

which should dispel my concern.

foobar_lv2 · January 26, 2026, 9:41am

Nobody is stopping another thread from reassigning the x property. However, the compiler does not need to care, and this does not stop it from hoisting the load of x:

If there is no atomic / acquire fence in the loop, and some other thread happens to reassign the x property… then this is a bad undef/poison race condition.

The compiler is at liberty to instead spawn “nasal bats”, i.e. do almost anything at all. (The write goes to the updated x: correct program execution. The write goes to the old x due to hoisting: also correct program execution. The write goes into some internal data structure, ransomware is downloaded and encrypts your hard drive: also correct program execution.)

The compiler can hoist if it can prove that nothing inside the loop either updates x or is an atomic acquire.

This of course only happens if the loop body is pretty small – the compiler is not that smart. Luckily, this hoisting only matters if the loop body is pretty small: If the load of x can amortize over a long and expensive loop body, then hoisting of the load is less than a rounding error in terms of performance.

The real thing to look out for is loop bodies that are typically small and fast, but that contain very rarely executed edge-case code that does complicated stuff. For example, a 1 in a billion chance of having to write a debug log message (which interacts with io → needs a lock → is an acquire → no hoisting).

Theoretically LLVM has passes for that kind of thing (moving the reload of x into the rare branch), but I have found them not very reliable.
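For the rare-branch case described above, one option is to hoist by hand so that the result does not depend on what the optimizer manages to prove. This is only a sketch: `fill_logged!` and the `typemin(Int)` check are hypothetical, chosen to stand in for a fast loop with a very rarely taken logging branch, and it assumes r.x is not reassigned while the loop runs.

```julia
# Hypothetical example: the function name and the edge-case condition are illustrative.
function fill_logged!(r, I)
    x = r.x                      # load the field once, before the loop
    for i in I
        v = rand(Int)
        if v == typemin(Int)     # extremely rare edge case
            # Logging does I/O and takes a lock, which is the kind of call
            # that can prevent the compiler from hoisting a field load itself.
            @warn "drew typemin(Int)" i
        end
        x[i] = v
    end
    return r
end

r = Ref(zeros(Int, 1000))
fill_logged!(r, Base.OneTo(1000))
```

With the local x, every write goes to the array that was loaded before the loop, whatever happens in the rare branch.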
