Common allocation mistakes


Question: for functions that create arrays inside of them, is it better to pass those arrays as arguments instead? For example:

function f1(x)
    sto = similar(x)
    sto = x .+ 1
    return sto
end

function f2(sto, x)
    sto = x .+ 1
    return sto
end

So that in a loop I can reuse sto without Julia having to allocate memory for every iteration:

function g1(x)
    for i in 1:100
        sto = f1(x)
    end
end

function g2(x)
    sto = similar(x)
    for i in 1:100
        f2(sto, x)
    end
    return sto
end

Is that the point?


Yes. Usually what I do is leave this as an option for the user, like this:

function f1(x; aux=similar(x))
    aux .= x .+ 1 # note the broadcasting here!
    return aux
end

Then the function can be called with only x, in which case it allocates aux, or you can pass aux as a keyword argument, in which case it reuses what is given.
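For example, the two call styles would look something like this (buf is just an illustrative name):

x = rand(10)
f1(x)              # allocates aux internally via similar(x)
buf = similar(x)
f1(x; aux = buf)   # reuses buf, so nothing is allocated inside f1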

@amrods note that you need .= to broadcast in place; otherwise the preallocated vector is not used for anything (this is incorrect in the code you posted there).
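For reference, a corrected version of the earlier f2/g2 pattern would look something like this (the ! in the name is just the usual convention for mutating functions):

function f2!(sto, x)
    sto .= x .+ 1   # .= writes into the existing array instead of rebinding the name
    return sto
end

function g2(x)
    sto = similar(x)
    for i in 1:100
        f2!(sto, x)   # sto is reused every iteration; no new array is allocated
    end
    return sto
end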


Oh nice, I didn't know you could reference other arguments in a function definition.

There is also Random.rand!, which will fill the array with random values and also make sure that they have the correct eltype. For example:

using Random

A = rand(Int8, 3, 4);
rand!(A)

rand! will generate random Int8 values, etc.
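A sketch of how this can be used to reuse a buffer inside a loop (buf, the size, and the loop body are just placeholders):

using Random
buf = Vector{Float64}(undef, 1000)
for i in 1:100
    rand!(buf)   # refills buf in place with the correct eltype; no new array per iteration
    # ... use buf ...
end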


These are great tips - thank you for writing this.

One thing I recently learned related to views (and if I get any of this wrong, please someone correct me) is that a view still allocates. For example, if you have a hot inner loop where you use small blocks of a larger vector, you might think (or at least I naively did) that you can avoid heap allocations by using something like the following. I thought I could use the view to do allocation-free access and operations on elements of A.

A = rand(100)
B = rand(3)
function view_test(A, B)
    tmp = @view A[2:4]
    tmp .= tmp .* B
end

This is a silly example, but in my own hot inner loop I was reusing that view in many places and re-creating it each iteration of the loop. I ended up with tons of heap allocations everywhere that view was used (sorry if this is obvious to others!). I eventually found some posts indicating that if you have something like this you should just loop over the small block, which is what I ended up doing. In other places with larger arrays I got around the problem by using copyto! or just wrote loops there as well.

That specifically should not allocate. But you can have allocations with views if the slice is of a non-contiguous memory block (not sure if this is true; what certainly happens is that views of non-contiguous slices are not very performant).
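To check the non-contiguous case, one could compare something like this (no claims about the exact timings):

using BenchmarkTools
A = rand(100)
contig  = @view A[1:3]     # contiguous slice
strided = @view A[1:2:5]   # strided (non-contiguous) slice of the same length
@btime sum($contig)
@btime sum($strided)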

I get this when using @btime: 20.160 ns (1 allocation: 48 bytes)

What I obtain:

julia> @btime view_test($A, $B);
  169.941 ns (0 allocations: 0 bytes)

Perhaps you didn't interpolate the variables?

julia> @btime view_test(A, B);
  195.690 ns (1 allocation: 48 bytes)

I'm amazed at how fast your system is, btw

2 Likes

Which version of Julia are you using?

In older versions (1.0 for sure, then I don't remember), structs with mutable members were heap-allocated. In the current stable and LTS versions (1.7 and 1.6, respectively), views should be stack-allocated, but of course the stack-allocated data includes a reference to a heap-allocated backing array (except, of course, the cases when the backing array itself is of isbits type, e.g. a UnitRange or a StaticArray).
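If I got this right, the backing-array part can be seen directly; the second example is my assumption about the isbits case:

A = rand(100)
v = @view A[2:4]
parent(v) === A   # the view only stores a reference to the heap-allocated parent

r = 1:100
w = @view r[2:4]
isbits(w)         # should be true: the UnitRange parent is itself isbits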

I'm using 1.6.2

Even with variable interpolation?

1.6.3:

julia> @btime view_test($A,$B)
  80.945 ns (0 allocations: 0 bytes)

Yes, with interpolation the allocation is zero.

What does that mean, then? Where is the allocation coming from when you don't interpolate?

It is as if the input variables were non-constant globals. See: julia - When to interpolate in benchmarking expressions - Stack Overflow

(I would also welcome a more in-depth explanation of this interpolation in BenchmarkTools; as a rule of thumb, I understand that it should always be used.)
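A minimal sketch of that rule of thumb (no specific timings implied):

using BenchmarkTools
x = rand(100)     # non-constant global
@btime sum(x)     # x is treated as a global inside the benchmark
@btime sum($x)    # $x interpolates the value, as if it were a local argument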


Ah okay - so the point then is that the view did not create an allocation, it was just a mistake in how I was using @btime.


Just one more follow-up that I just remembered. I think allocations may still have been happening in my particular case. I say this because I would use julia --track-allocation=user and Profile (making sure to clear the profiler before running the function) and I would see memory allocations next to the lines with views. Perhaps it was because they were views of non-contiguous sections of memory, as you said in an earlier post.

I tried to get allocations in those cases here, without success. Thus, I'm not sure if that still applies (or has at any time after 1.5). But it may.

At the same time, I have always found the output of --track-allocation=user confusing, so I usually check the code manually after profiling with it to be sure of the source of the allocations.
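For completeness, the usual workflow with it is roughly the following (my_function and args are placeholders):

# start Julia as: julia --track-allocation=user
using Profile
my_function(args)             # warm-up run, so compilation is not counted
Profile.clear_malloc_data()   # reset the allocation counters
my_function(args)             # only this run's allocations are recorded
# exit Julia and inspect the *.mem files written next to the source files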

One more approach for case 2, to avoid allocations when broadcasting: the package LazyArrays allows you to do this and keep the notation simple. It only requires adding @~ to the broadcast, and you don't need to pre-allocate the output.

Example:

using BenchmarkTools, LazyArrays
x,y = rand(100), rand(100)
tmp = similar(x)

# pre-allocating tmp
function f(x, y, tmp)
    z = 0.0
    for i in 1:10
        @. tmp = x * y
        z += sum(tmp)
    end
    z
end
@btime f($x,$y,$tmp) # 144.617 ns (0 allocations: 0 bytes)

#using LazyArrays
function g(x,y)
  z = 0.
  for i in 1:10
    tmp = @~ x .* y
    z += sum(tmp) 
  end
  z
end

f(x,y,tmp) == g(x,y) # true
@btime g($x,$y)      # 107.120 ns (0 allocations: 0 bytes)

Why does using mapreduce for case 2 (i.e. z += mapreduce(*, norm, x, y)) still allocate? The docs explicitly say that it skips the intermediate array:

    mapreduce(f, op, A::AbstractArray...; dims=:, [init])

Evaluates to the same as `reduce(op, map(f, A...); dims=dims, init=init)`, but is generally faster because the intermediate array is avoided.

I think you have a tricky case there. You are first multiplying the elements of x and y, which would in principle create an intermediate array z .= x .* y and then computing the norm of z.

Computing the norm of an array can be done in-place, but I doubt that the compiler can handle that automatically for an array that is constructed lazily.

But, to be sincere, I don't completely understand what mapreduce is doing there (I think the docs are not clear on what it does when more than one collection is provided). For instance, if that call to mapreduce made sense, I would think that these two would give the same result:

julia> using LinearAlgebra

julia> x = rand(10); y = rand(10);

julia> mapreduce(*, norm, x, y)
0.015890825295371242

julia> norm(xi*yi for (xi,yi) in zip(x,y))
1.0128277332826312

Or, if (that is what I understand from the docs) the function f should be applied to each element of the collections, then I think that should error, as *(x) is not a defined operation. But it works even with a single collection, so I'm lost here:

julia> mapreduce(*, norm, [1,2])
1.0
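For what it's worth, my reading of the docs is that with several collections f is applied elementwise across them and op then reduces those results, so with + as the reducer you get a dot-product-like value:

julia> mapreduce(*, +, [1,2,3], [4,5,6])   # reduce(+, map(*, [1,2,3], [4,5,6]))
32

With norm as op, it is used as a binary reducing operator on the elementwise products, which would explain why the result differs from norm(x .* y).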