Where does this allocation come from and how to avoid it? StaticArrays and structures

Hello!

I’m trying to optimize my code and I’m stuck on this. Here is a minimal working example of my problem.

using StaticArrays, BenchmarkTools

f(x :: T, y :: T, z :: T) where {T} = SA[x, y, z]

struct Foo{T <: Real}
    x :: T; y :: T; z :: T
end

f(foo :: Foo) = SA[foo.x, foo.y, foo.z]

The problem is the difference in the number of allocations when these two methods are executed:

julia> @btime f(1.0,2.0,3.0)
  0.010 ns (0 allocations: 0 bytes)
3-element SVector{3, Float64} with indices SOneTo(3):
 1.0
 2.0
 3.0

julia> foo = Foo(1.0, 2.0, 3.0)
Foo{Float64}(1.0, 2.0, 3.0)

julia> @btime f(foo)
  8.464 ns (1 allocation: 32 bytes)
3-element SVector{3, Float64} with indices SOneTo(3):
 1.0
 2.0
 3.0

Where does this allocation come from? Can someone explain it or point me in the right direction to research?

Thanks in advance!

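You’re just seeing benchmarking artifacts. Here foo is a non-constant global variable, so unless you interpolate it with $, the benchmark also measures the cost of accessing an untyped global, which is most likely where the reported allocation comes from. Interpolation is discussed in more detail in the BenchmarkTools.jl manual.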

julia> using StaticArrays, BenchmarkTools

julia> struct Foo{T <: Real}
           x :: T; y :: T; z :: T
       end

julia> f(foo :: Foo) = SA[foo.x, foo.y, foo.z]
f (generic function with 1 method)

julia> foo = Foo(1., 2., 3.)
Foo{Float64}(1.0, 2.0, 3.0)

julia> @btime f($foo)
  1.449 ns (0 allocations: 0 bytes)

Furthermore, when you see a benchmark returning 0.010 ns (or really anything less than 1 ns), that just means the compiler was smart enough to optimize your entire code away. See the BenchmarkTools.jl manual for more.
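As a rough sketch of how to keep the compiler from folding the three-argument call away, the BenchmarkTools.jl manual suggests wrapping each input in a Ref and dereferencing it inside the benchmarked expression. The method definition is the one from the question with an explicit type parameter added; the timing you get will depend on your machine, so none is shown here.

```julia
using StaticArrays, BenchmarkTools

# Same method as in the question, with the type parameter declared explicitly
f(x::T, y::T, z::T) where {T} = SA[x, y, z]

# Each Ref is interpolated with $ and dereferenced with [] inside the
# benchmarked expression, so the compiler cannot treat 1.0, 2.0 and 3.0
# as compile-time constants and fold the whole call away.
@btime f($(Ref(1.0))[], $(Ref(2.0))[], $(Ref(3.0))[])
```

With this form the reported time should no longer be a suspicious fraction of a nanosecond.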


Thanks!

So is there any actual performance difference between the two functions?

Nope, I’d expect both to be about equally fast.
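If you want to check this yourself, one possible way is to benchmark both methods side by side with every input interpolated. This is only a sketch using the definitions from the question (with `where {T}` added to the three-argument method); the Ref wrapping is there to guard against constant folding, and the actual timings will vary from machine to machine.

```julia
using StaticArrays, BenchmarkTools

struct Foo{T <: Real}
    x :: T; y :: T; z :: T
end

f(x::T, y::T, z::T) where {T} = SA[x, y, z]
f(foo::Foo) = SA[foo.x, foo.y, foo.z]

foo = Foo(1.0, 2.0, 3.0)

# Sanity check: both methods build the same SVector
@assert f(foo) == f(foo.x, foo.y, foo.z)

# Interpolate every input so that global-variable access is not measured,
# and go through Ref so the values are not visible as constants.
@btime f($(Ref(foo))[])
@btime f($(Ref(foo.x))[], $(Ref(foo.y))[], $(Ref(foo.z))[])
```

Both lines should report zero allocations if the explanation above holds.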