I’m trying to figure out how much memory a mutable struct actually takes up. For example, in Java or C#, there’s a vptr and a syncroot/hash id, so the overhead is 8 or 16 bytes depending on whether you’re running 32- or 64-bit. Julia’s sizeof()
seems to return only the size of the user-defined fields and does not include the object header. I’d also be interested to know what is stored in the header, just out of curiosity. Thanks!
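To make this concrete, here is a minimal example of what I mean (using a throwaway Foo type, on a 64-bit build):

julia> mutable struct Foo
           x::Int64
       end

julia> sizeof(Foo)   # just the 8-byte Int64 field, no header
8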
Perhaps Base.summarysize is what you’re looking for. (EDIT: I guess it’s not.)
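For example, on a 64-bit build I’d expect it to report only the field data, which is presumably why it doesn’t answer the question:

julia> Base.summarysize(Ref(0))   # a RefValue{Int64}: one 8-byte field, header not counted
8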
I think the following should include all overheads:
julia> function bar(arr::Vector{Base.RefValue{T}}) where {T}
           for i = 1:length(arr)
               arr[i] = Ref{T}()
           end
           nothing
       end

julia> for N = 1:32
           T = NTuple{N, UInt32}
           arr = [Ref{T}() for i = 1:1024]
           bar(arr)
           bar(arr)
           alloc = (@allocated bar(arr)) / 1024
           println("Object with 4 byte align, size $(N*4) takes $alloc bytes")
       end
Object with 4 byte align, size 4 takes 16.0 bytes
Object with 4 byte align, size 8 takes 16.0 bytes
Object with 4 byte align, size 12 takes 32.0 bytes
Object with 4 byte align, size 16 takes 32.0 bytes
Object with 4 byte align, size 20 takes 32.0 bytes
Object with 4 byte align, size 24 takes 32.0 bytes
Object with 4 byte align, size 28 takes 48.0 bytes
Object with 4 byte align, size 32 takes 48.0 bytes
Object with 4 byte align, size 36 takes 48.0 bytes
Object with 4 byte align, size 40 takes 48.0 bytes
Object with 4 byte align, size 44 takes 64.0 bytes
Object with 4 byte align, size 48 takes 64.0 bytes
Object with 4 byte align, size 52 takes 64.0 bytes
Object with 4 byte align, size 56 takes 64.0 bytes
Object with 4 byte align, size 60 takes 80.0 bytes
Object with 4 byte align, size 64 takes 80.0 bytes
Object with 4 byte align, size 68 takes 80.0 bytes
Object with 4 byte align, size 72 takes 80.0 bytes
Object with 4 byte align, size 76 takes 96.0 bytes
Object with 4 byte align, size 80 takes 96.0 bytes
Object with 4 byte align, size 84 takes 96.0 bytes
Object with 4 byte align, size 88 takes 96.0 bytes
Object with 4 byte align, size 92 takes 112.0 bytes
Object with 4 byte align, size 96 takes 112.0 bytes
Object with 4 byte align, size 100 takes 112.0 bytes
Object with 4 byte align, size 104 takes 112.0 bytes
Object with 4 byte align, size 108 takes 128.0 bytes
Object with 4 byte align, size 112 takes 128.0 bytes
Object with 4 byte align, size 116 takes 128.0 bytes
Object with 4 byte align, size 120 takes 128.0 bytes
Object with 4 byte align, size 124 takes 144.0 bytes
Object with 4 byte align, size 128 takes 144.0 bytes
According to https://pkg.julialang.org/docs/julia/THl1k/1.1.1/devdocs/object.html, tuples have a unique representation, so the example above (which uses NTuple fields) may not apply universally. From the documentation it appears that a Julia struct itself does not contain any extra fields, but that on the heap it is wrapped in a C struct that carries the GC and type information.
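If you want to poke at that header yourself, here is an unsafe, implementation-dependent sketch (not a public API; it assumes the current layout, where a single pointer-sized tag word holding the type pointer plus GC bits sits immediately before the object’s fields on a 64-bit build):

julia> mutable struct Foo
           x::Int64
       end

julia> f = Foo(1);

julia> p = Ptr{UInt}(pointer_from_objref(f));

julia> tag = unsafe_load(p - sizeof(UInt));   # the word stored just before the fields

julia> unsafe_pointer_to_objref(Ptr{Cvoid}(tag & ~UInt(0xf)))   # mask off the GC bits to recover the type
Foo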
Thanks for the input, everyone; it seems like we still don’t have a really concrete answer yet. I did a very unscientific test where I created 100M structs and put them in an array:
mutable struct MemTester
    i::Int64
end

testers = Array{MemTester}(undef, 100000000)

function memoryTest()
    for i = 1:100000000
        testers[i] = MemTester(i)
    end
end
I ran this and looked at the total process memory usage in a 64-bit REPL, which was roughly 3.2 GB. That’s 32 bytes per item. We are keeping them in an array to avoid the GC collecting them, so figure 8 bytes per item for the array’s pointer. Then there’s the 8-byte Int64 user field per object, which leaves 16 bytes for the header. In general, I don’t trust OS heap-allocation measurements to be particularly accurate, but that’s at least a fairly decent guess. I think.
EDIT: This is wrong; I’m using a non-const global array, which causes extra allocations. See below.
Your memoryTest() needs access to the non-const global object Main.testers. This causes additional allocations. Use @timev after a warm-up run to see both the number of allocations and the number of allocated bytes, and divide to obtain the mean size of the allocated objects. With your code, I get 16 bytes/object and 1.5 allocations per iteration. After inspecting @code_native, this is surprisingly not due to loop unrolling; instead there must be some optimization for jl_box_int64 going on.
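To illustrate the measurement (a sketch; memoryTest is the function from the post above):

julia> memoryTest();      # warm-up, so compilation is not counted

julia> @timev memoryTest()

Then divide the reported bytes allocated by the number of allocations to estimate the mean size per allocated object.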
Hmm… I figured it was preallocated, so it wouldn’t cause additional allocations. Is that not how arrays work? Why would array accesses cause allocations?
The issue is the call setindex!(testers, MemTester(i), i), i.e. testers[i] = MemTester(i). The first argument is of unknown type (because it is a non-const global variable), so Julia needs to call into the runtime: it calls a C function that walks the method table and figures out which method to call. I think the issue is that this C function expects arguments that are pointers to valid heap-allocated Julia objects and then extracts the argument types from their headers. We already have a pointer to MemTester(i), but we also need a pointer to an object that contains the integer i along with an object header telling the runtime that it is an integer. We obtain this object via jl_box_int64, which appears to allocate sometimes.
TLDR: Don’t use non-const globals. Really, don’t.
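For example, one way around it (a sketch, with a hypothetical memoryTest! name) is to pass the array as an argument so its type is known to the compiler; declaring the global as const testers = ... would work as well:

function memoryTest!(arr::Vector{MemTester})
    for i in eachindex(arr)
        arr[i] = MemTester(i)   # no runtime dispatch: the element type of arr is known here
    end
end

testers = Vector{MemTester}(undef, 100_000_000)
memoryTest!(testers)        # warm-up
@timev memoryTest!(testers)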
Ohhh I had no idea about the global const stuff. Thanks for that.
So, yes, you’re right: after running the experiment again, it’s 16 bytes per object, so only an 8-byte header. Nice!