SortedDict with mutable values - why does this allocate?

I’m surprised that one of these allocates and the other doesn’t. I’m pretty sure that this isn’t specific to the package and that instead my surprise comes form me not understand Julia well enough. I wonder if someone here could illuminate.

using DataStructures
using BenchmarkTools

struct MyType1
    x::Float64
end
d = SortedDict{Float64,MyType1}()
@btime setindex!(d, MyType1(1.0), 1)  # 23.774 ns (0 allocations: 0 bytes)

mutable struct MyType2
    x::Float64
end
d = SortedDict{Float64,MyType2}()
@btime setindex!(d, MyType2(1.0), 1)  # 34.555 ns (2 allocations: 48 bytes)

Obviously the difference here is the mutability of the value struct. But why?

If I’m not mistaken, your timing includes the creation of the MyType2 object.

1 Like

The two cases are exactly the same in this respect. The only difference is that MyType1 is immutable and MyType2 is mutable.

Mutable structs are allocated, whereas immutable structs can be kept on the stack or in registers.

What do you mean, kept on the stack? In the example, the SortedDict isn’t on the stack, and the setindex! isn’t inside a function.

The dicts are typically not on the stack, but MyType1 is a “bits type”, whereas MyType2 is not. Try isbitstype(MyType1(1.0))and isbitstype(MyType2(1.0)). This has consequences for how they are allocated and treated. So we have e.g.

julia> MyType1(1.0) === MyType1(1.0)
true
julia> MyType2(1.0) === MyType2(1.0)
false

because two “different” MyType1 with the same Float64 inside are semantically indistinguishable, but two different MyType2 can be equal at one time, and different when one of them is changed. We also have:

julia> reinterpret(UInt8, [MyType1(1.0)])
8-element reinterpret(UInt8, ::Vector{MyType1}):
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0xf0
 0x3f
julia> reinterpret(UInt8, [MyType2(1.0)])
ERROR: ArgumentError: cannot reinterpret `MyType2` as `UInt8`, type `MyType2` is not a bits type

The first is the byte pattern of 1.0, the second would have been a byte pattern of a pointer, though you would need some unsafe trickery to see that.

In terms of memory storage and such things, MyType1 is treated just like a Float64, whereas MyType2 is treated more like a pointer to a Float64. Even though the syntax is similar. This means that an object of type MyType2 typically(always?) will be allocated on the heap, whereas an object of type MyType1 is typically not.

You can also try this:

@btime [MyType1(i) for i in 1:1000];
@btime [MyType2(i) for i in 1:1000];

to see the difference. The first one, a Vector{MyType1} has exactly the same memory layout as if you did [Float64(i) for i in 1:1000]. You can see it by doing a

reinterpret(Float64, [MyType1(i) for i in 1:1000])

That’s not quite true - @btime (and @benchmark) wrap the expression you give it in a function, to ensure the expression is compiled. As a result, MyType1(1.0) ends up allocated on the stack. Since MyType2 is mutable and the dictionary you’re writing to is a global object, it can’t be allocated on the stack and must survive while preserving object identity. As a result, it’s forced to be allocated on the heap.

Ok, good point that setindex! is indeex inside a function as it’s called by @btime.

However, it’s still the case that the SortedDict lives in the global space. Yes, I get that MyType1 is immutable and can go on the stack, but I’m still adding it to an object that lives in the global space. When @btime exits, where is the MyType1 object which was on the stack? Must be somewhere, as it’s accessible from the global SortedDict.

Thanks for that. I think you’re on to something.

I’d like to put to you the same question I just asked Sukera - where is the MyType1 when @btime returns? It’s not on the stack because the SortedDict is a global object.

The MyType1 object in the Dict is technically a copy, but a copy of an immutable is indistinguishable from the original. I.e. the MyType1 has been “copied” into some Vector inside the dict.

2 Likes

It’s stored inline in the array backing the SortedDict. Immutable isbits objects are compared by their bitwise identity, so the raw bits can simply be copied into the memory occupied by the array and stored inline there.

Note that this could still end up allocating if the SortedDict has to grow, which would require resizing (and thus reallocating) the backing array. This isn’t the case in your benchmark though.

2 Likes

Ok guys, this solves it. Thanks for the clarification.