Why does using += instead of = double the number of allocations?

vedantroy · December 26, 2021, 12:31am

I have the following code that I’m playing around with (taken from here):

using BenchmarkTools  # provides @btime

function vectorized()
    a = [1.0, 1.0]
    b = [2.0, 2.0]
    x = [0, 0]

    for i in 1:1000
        x = a + b  # allocates a new array on every iteration
    end

    return x
end

@btime vectorized()

As expected, this does 1003 allocations.
However, if I change x = a + b to x += a + b, the number of allocations doubles to 2003.

Does anyone know why?

As a bit of a side note: from my understanding, Julia uses LLVM to compile functions at run time. Given that, I’m surprised LLVM does not recognize that a + b is always the same and allocate it only once. More generally, I’m surprised LLVM does not optimize the above code into the devectorized version.

Side note 2: Is there a way I can debug stuff like this on my own? I.e., is there something that will spit out the underlying IR that is being generated?

Elrod · December 26, 2021, 12:38am

x += a + b is like x = x + (a + b).

a + b allocates a new array per iteration.
x + (a + b) also allocates a new array per iteration.
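A small sketch of what that rebinding means in practice. The name x_before is only for illustration, and x is a Float64 vector here for simplicity:

```julia
a = [1.0, 1.0]
b = [2.0, 2.0]
x = [0.0, 0.0]
x_before = x          # second reference to the original array

x += a + b            # behaves like x = x + (a + b)

@assert x == [3.0, 3.0]
@assert x !== x_before           # x now names a newly allocated array
@assert x_before == [0.0, 0.0]   # the original array was never mutated
```

So each iteration creates one temporary for a + b and a second array for the sum that x is rebound to.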

If you’d like to avoid the allocations, @. x = a + b or @. x += a + b are simple approaches that work.
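Here is a minimal sketch of the in-place version. The function name fused! is made up for this example, and x is created with zeros(2) so it holds Float64 values. Writing in place into the original x = [0, 0] would convert each result to Int, which only works while the results happen to be whole numbers.

```julia
function fused!(x, a, b)
    for i in 1:1000
        @. x = a + b   # equivalent to x .= a .+ b; overwrites x in place
    end
    return x
end

a = [1.0, 1.0]
b = [2.0, 2.0]
x = zeros(2)           # Float64, so results are stored without conversion

fused!(x, a, b)                        # first call includes compilation
bytes = @allocated fused!(x, a, b)     # bytes allocated by a second call
println("bytes allocated: ", bytes)    # expected to be 0 or very close to it
```

The measurement is taken on the second call so that compilation does not show up in the number.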

Much of Julia’s memory allocation is a black box to LLVM, so it cannot optimize these allocations away.

To see IR, I recommend Cthulhu. @descend lets you quickly switch between representations, such as typed Julia IR, LLVM IR, or asm. It also lets you descend into functions called from there to explore them as well.

Without a dependency, look at @code_warntype, @code_typed, @code_llvm, and @code_native. These don’t let you descend, so they’re overall less convenient.
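For example, at the REPL (add_vectors is a hypothetical helper, not something from the original code):

```julia
using InteractiveUtils  # loaded automatically in the REPL; needed in scripts

add_vectors(a, b) = a + b   # hypothetical helper to inspect

a = [1.0, 1.0]
b = [2.0, 2.0]

@code_warntype add_vectors(a, b)  # typed IR, non-concrete types highlighted
@code_typed add_vectors(a, b)     # type-inferred, optimized Julia IR
@code_llvm add_vectors(a, b)      # LLVM IR
@code_native add_vectors(a, b)    # native assembly
```

Each macro takes a call expression and shows the code for the method that call would dispatch to. The output is long and varies by Julia version, so it is not reproduced here.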

StaticArrays is a popular library providing compile-time sized arrays. The SArray and MArray types are both much less opaque to the compiler, letting them be optimized much more aggressively.
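A rough sketch of the += loop using SVector, assuming the StaticArrays package is installed (static_sum is a made-up name):

```julia
using StaticArrays  # third-party package

function static_sum()
    a = SVector(1.0, 1.0)
    b = SVector(2.0, 2.0)
    x = SVector(0.0, 0.0)
    for i in 1:1000
        x += a + b   # immutable fixed-size values, so no heap array is required
    end
    return x
end

result = static_sum()
@assert result == SVector(3000.0, 3000.0)
```

Since an SVector is immutable, x += a + b is still a rebinding, but the compiler is free to keep the values in registers or on the stack.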

vedantroy · December 26, 2021, 12:47am

@Elrod Thanks for the quick response, and I appreciate the link to Cthulhu.

One area where I sense a gap in my understanding is the purpose of LLVM, then. Presumably the point of LLVM is to be able to do optimizations like constant folding in order to speed up generated code?

Elrod · December 26, 2021, 12:50am

The Julia side of the compiler also performs some optimizations like constant folding.

LLVM can and does perform many more, though. It also performs constant propagation, instruction selection, vectorization, various peephole optimizations, the JIT/creating the machine code actually being run…
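One way to watch constant folding happen is to inspect a function whose operands are all literals. This is just a sketch; seconds_per_day is a hypothetical function, and the exact printed IR depends on the Julia version and platform:

```julia
using InteractiveUtils  # needed outside the REPL

seconds_per_day() = 24 * 60 * 60   # only literal operands

@code_typed seconds_per_day()   # body should reduce to returning 86400
@code_llvm seconds_per_day()    # should likewise return the constant directly
```

In contrast, doing the same with a function that adds two Vectors should show a call that allocates the result array, which is the part LLVM cannot see through.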
