Unexpected allocations when accessing a field of a struct

Hello,

I don’t understand why any of the function calls below allocate memory on the heap.
My reasoning is that these are immutable structs, so when the functions are called, the type of field `a` is known to be a Float64, and floats don’t need to be allocated on the heap.
Where am I wrong in my reasoning? And why does the only function call where `@code_warntype` wouldn’t know the type allocate no memory?

using BenchmarkTools

struct untyped
    a
end

struct typed{T}
    a::T
end

function f(x)
    x.a
end

function f2(x)
    x.a::Float64
end

unt = untyped(5.0)
t = typed(5.0)

@btime f(unt)   # 18.837 ns (0 allocations: 0 bytes)
@btime f(t)     # 21.342 ns (1 allocation: 16 bytes)
@btime f2(unt)  # 22.067 ns (1 allocation: 16 bytes)
@btime f2(t)    # 21.406 ns (1 allocation: 16 bytes)

This isn’t quite how it works. When you call f(unt), the compiler specializes f on the type of unt. That type (untyped) tells the compiler nothing about the type of the field a, which can differ from one untyped instance to the next. On the other hand, when you call f(t), the compiler specializes f on the type typed{Float64}, which provides all the information needed to determine the type of a. That’s why the untyped version is not recommended for performance-sensitive code; see Performance Tips · The Julia Language.

Sure, but if the compiler doesn’t /know/ that a value will be a float, then it has to create a generic box to hold that value, and that might require heap allocation.
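To make the difference concrete, here is a small sketch that asks Julia what it knows about the field from the argument type alone. The names Untyped, Typed and getter are hypothetical stand-ins for the definitions in the question, not code from the original post:

```julia
# Hypothetical stand-ins for the structs in the question.
struct Untyped
    a              # no annotation: the field is declared as Any
end

struct Typed{T}
    a::T           # the field type is part of the struct's type
end

getter(x) = x.a

# What is known about field `a` from the type of the argument alone
fieldtype(Untyped, :a)           # Any
fieldtype(Typed{Float64}, :a)    # Float64

# Inferred return type of each specialization (returned as a one-element vector)
Base.return_types(getter, (Untyped,))          # contains Any
Base.return_types(getter, (Typed{Float64},))   # contains Float64

# Only the typed version can be stored inline as plain bits
isbitstype(Untyped)          # false
isbitstype(Typed{Float64})   # true
```

An Untyped instance has to hold a reference to a boxed value, because its field could be anything; a Typed{Float64} can store the float directly.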

You’re actually seeing an artifact of the way BenchmarkTools works. You need to interpolate variables such as unt and t into the benchmark expression using $; otherwise you are actually benchmarking the time required to look up unt and t at global scope. Here are the results after fixing this:
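The cost of reading a non-constant global can be seen without BenchmarkTools. The sketch below uses hypothetical names (x_global, x_const, read_global, read_const) and only Base functions; it is meant as an illustration of the underlying issue, not as a description of how @btime is implemented:

```julia
x_global = 5.0        # non-constant global: could be rebound to any value later
const x_const = 5.0   # constant global: the binding cannot change type

read_global() = x_global + 1
read_const()  = x_const + 1

# Inferred return types (each call returns a one-element vector)
Base.return_types(read_global, Tuple{})   # contains Any
Base.return_types(read_const, Tuple{})    # contains Float64
```

Because the compiler cannot assume anything about x_global, code that touches it falls back to run-time dispatch, which is roughly what the un-interpolated benchmarks were measuring.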

julia> @btime f($unt)
  0.016 ns (0 allocations: 0 bytes)
5.0

julia> @btime f($t)
  0.016 ns (0 allocations: 0 bytes)
5.0

julia> @btime f2($unt)
  0.016 ns (0 allocations: 0 bytes)
5.0

julia> @btime f2($t)
  0.016 ns (0 allocations: 0 bytes)
5.0

Don’t believe those sub-nanosecond timings. What this actually tells you is that the compiler has completely optimized the code away to nothing. This is always a challenge in benchmarking: if a function is sufficiently simple, the compiler can figure out how to skip it entirely. That’s a perfectly valid optimization, but it’s usually not what you want to measure. You can fix this by “hiding” the value behind a Ref:

julia> @btime f(x[]) setup=(x = Ref($unt))
  1.448 ns (0 allocations: 0 bytes)
5.0

julia> @btime f(x[]) setup=(x = Ref($t))
  1.238 ns (0 allocations: 0 bytes)
5.0

Unfortunately, this also isn’t very informative: the compiler has still managed to skip the allocation in the untyped case and made it just as fast as the typed case. Once again, the compiler has outsmarted our benchmark and made it too efficient.

Making the function slightly less trivial results in typed finally actually being faster than untyped, as expected:

julia> f3(x) = x.a + 1
f3 (generic function with 1 method)

julia> @btime f3(x[]) setup=(x = Ref($unt))
  15.102 ns (1 allocation: 16 bytes)
6.0

julia> @btime f3(x[]) setup=(x = Ref($t))
  1.446 ns (0 allocations: 0 bytes)
6.0
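The allocation in the untyped case most likely comes from the result of + having an unknown type, so it has to be boxed. As a rough sketch of why a type assertion like the one in f2 helps here, the following compares the inferred return types with and without it. Untyped, add_one and add_one_asserted are hypothetical names used only for this illustration:

```julia
# Hypothetical stand-in for the untyped struct in the question.
struct Untyped
    a
end

add_one(x) = x.a + 1                       # x.a is Any, so `+` is resolved at run time
add_one_asserted(x) = (x.a::Float64) + 1   # the assertion tells the compiler the field's type

# Inferred return types (each call returns a one-element vector)
Base.return_types(add_one, (Untyped,))            # contains Any
Base.return_types(add_one_asserted, (Untyped,))   # contains Float64
```

The assertion throws a TypeError if the field ever holds something other than a Float64, so parameterizing the struct as in typed{T} is usually the cleaner fix.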

Thank you very much for the clear and fast response, @rdeits!