Question about internal representation of Union{Missing, Float64}

I am trying to understand the internals of the datatype Union{Missing, Float64}. From what I understand, an array with this element type tracks which elements are present by means of a hidden Array{UInt8} that flags each element as present or missing.

I tried to determine if this was true using sizeof, but I found an unexpected result:

julia> x = [3.0, 4.0, 5.0]
3-element Array{Float64,1}:
 3.0
 4.0
 5.0

julia> y = [3.0, missing, 5.0]
3-element Array{Union{Missing, Float64},1}:
 3.0     
  missing
 5.0     

julia> sizeof(x), sizeof(y)
(24, 24)

I don’t understand why sizeof reports the same size for x and y. I would have expected 24 bytes for x and 27 (24 + 3) for y.

Could somebody explain what’s going on? Perhaps sizeof is not the right way to measure the amount of allocated memory for union types?

cf

Thanks a lot! From that thread I learned that the test should use Base.summarysize instead of sizeof:

julia> Base.summarysize([3.0, missing, 5.0])
80

julia> Base.summarysize([3.0, 4.0, 5.0])
64
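
For anyone who finds this later, here is a small sketch that puts the two measurements side by side. As far as I can tell, sizeof on an array only counts length times the element size (8 bytes per slot here), so the per-element type tag bytes never show up in it, while Base.summarysize walks the whole object. The variable names are just mine, and the exact summarysize numbers seem to depend on the Julia version, so I print them rather than hard-code them:

```julia
# Compare the payload size reported by sizeof with the total reported by
# Base.summarysize for a plain Float64 vector and a Union{Missing, Float64} vector.
x = [3.0, 4.0, 5.0]
y = [3.0, missing, 5.0]

# sizeof counts only length * element size, so both vectors report the same value
payload_x = sizeof(x)
payload_y = sizeof(y)

# summarysize accounts for the whole object, not just the element payload
total_x = Base.summarysize(x)
total_y = Base.summarysize(y)

println("sizeof:      ", (payload_x, payload_y))
println("summarysize: ", (total_x, total_y))
println("extra bytes reported for the union array: ", total_y - total_x)
```

On the version used above the difference is 16 bytes rather than the 3 I expected, so I would not read summarysize as an exact count of the flag bytes either.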