Getting Allocations with Static Arrays


I am trying to optimize some code and have some questions about behavior I am observing with static arrays. I have read that if you have lots of small arrays (fewer than 100 elements), static arrays are one option for improving performance. Since they are stack allocated, they are faster to access (I think).

I have some code where I need to create a 3-element vector in each iteration of a hot loop. Right now I have a function that looks like this (it has zero allocations):

function compute_coefficients!(cache_array, MyListofArgs...)
    # Compute individual elements
    a1 = ...
    a2 = ...
    a3 = ...
    # Put into cache array
    cache_array[1] = a1 + a2
    cache_array[2] = a2 - a3
    cache_array[3] = a3*a2 - a1
end

I thought this would be a good use case for a static array since it is a small array (maybe it is not a good use case; I am hoping someone can correct me if so). My initial impression was that, since static arrays are stack allocated, returning one should be like returning an Int or a Float64. So I tried to implement this instead:

function compute_coefficients(MyListofArgs...)
    # Compute individual elements
    a1 = ...
    a2 = ...
    a3 = ...
    # Put into SVector
    tmp = SVector{3,Float64}(a1 + a2, a2 - a3, a3*a2 - a1)
    return tmp
end

Now I have checked using @allocated and the line starting with tmp = ... inside compute_coefficients does not allocate (which is consistent with my expectations). But when I call this function from one level up (from inside my loop) I do get an allocation.

function outer_func()
    # Lots of code in here
    # Inside a hot inner loop I am calling
    tmp = compute_coefficients(MyListofArgs...) # <-- @allocated says this line allocates
end

What I don’t understand is why I am getting an allocation in the outer function. I saw from this post that static arrays do not guarantee a stack allocation. But I don’t understand why there is no heap allocation inside compute_coefficients, yet there is one when I call it from the outer function.

I have also tried adding type annotations to both the function output (i.e., compute_coefficients(MyListofArgs...)::SVector{3,Float64}) and the tmp in the outer function (i.e., tmp::SVector{3,Float64}) but it did not make a difference.

I think you have excluded too much of your code for us to be able to tell what is going on. For example, I can’t tell where MyListofArgs comes from. Is it a global variable? That would explain it.
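To illustrate why a non-const global would explain it, here is a minimal sketch. It assumes the StaticArrays package is installed, and the names scale, SCALE, coeffs, loop_global and loop_const are hypothetical stand-ins, not taken from the original code. The version that reads a non-const global typically reports allocations, because the compiler cannot know the global's type; the const version usually does not.

```julia
using StaticArrays

scale = 2.0         # non-const global (hypothetical): its type could change at any time
const SCALE = 2.0   # const global (hypothetical): its type is fixed

# Hypothetical stand-in for the coefficient function in the question.
coeffs(s) = SVector{3,Float64}(s, 2s, 3s)

function loop_global(n)
    acc = zero(SVector{3,Float64})
    for _ in 1:n
        # `scale` has no fixed type here, so the compiler cannot specialize the call
        acc += coeffs(scale)
    end
    return acc
end

function loop_const(n)
    acc = zero(SVector{3,Float64})
    for _ in 1:n
        acc += coeffs(SCALE)   # type is known, so the call can be fully specialized
    end
    return acc
end

# Warm up first so that compilation is not included in the measurement.
loop_global(100); loop_const(100)
println("non-const global: ", @allocated(loop_global(100)), " bytes")
println("const global:     ", @allocated(loop_const(100)), " bytes")
```

Both functions return the same value; only the allocation behavior should differ. Passing the value in as a function argument instead of reading a global has the same effect as making it const.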

You have made your example minimal, but not working. It should be a Minimal Working Example, so people can copy and run your code and analyze it. Creating an MWE can be a bit of work, but you often end up understanding your own problem better, and perhaps solve it yourself in the process.
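For this thread, an MWE might look roughly like the sketch below. It assumes the StaticArrays package is installed. The arguments x and y and the formulas for a1, a2 and a3 are placeholders I made up, since the real computations were elided in the question; only the SVector construction is copied from the post.

```julia
using StaticArrays

# Placeholder computations; the real a1, a2, a3 were not shown in the question.
function compute_coefficients(x::Float64, y::Float64)
    a1 = x + y
    a2 = x * y
    a3 = x - y
    return SVector{3,Float64}(a1 + a2, a2 - a3, a3*a2 - a1)
end

# Hypothetical outer function with a hot loop that consumes the result.
function outer_func(n::Int)
    acc = zero(SVector{3,Float64})
    for i in 1:n
        acc += compute_coefficients(Float64(i), 2.0)
    end
    return acc
end

outer_func(10)                       # run once so compilation is not measured
println(@allocated outer_func(10))   # bytes allocated by the second call
```

If this stripped-down version does not allocate but the real code does, the difference between the two (argument types, globals, untyped containers, and so on) is the place to look.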

To answer your question - there are no global arguments.

I will try to work up an MWE later today. It may be difficult, as even outer_func() is nested within other function calls.

Does tmp = @SVector([a1+a2, a2 - a3, a3*a2-a1]) change anything?
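For reference, a small sketch comparing the two ways of building the vector, assuming the StaticArrays package is installed. The values of a1, a2 and a3 are made up for illustration. With Float64 inputs, both forms should produce the same SVector{3,Float64}, so the macro is mostly a different spelling rather than a different mechanism.

```julia
using StaticArrays

a1, a2, a3 = 1.0, 2.0, 3.0   # hypothetical values, not from the question

# Explicit constructor, as in the original post
v_ctor = SVector{3,Float64}(a1 + a2, a2 - a3, a3*a2 - a1)

# Macro form, as suggested in the reply
v_macro = @SVector [a1 + a2, a2 - a3, a3*a2 - a1]

println(v_ctor == v_macro)   # compares the elements
println(typeof(v_macro))     # shows the concrete static vector type
```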

Hm - what do you mean?
I thought it was creating a new static vector and assigning it to a variable named tmp.

Maybe I misread the original post. As @DNF said, your best bet here is to work out an MWE reproducing the allocations.