True memory footprint of objects?

Consider:

julia> a = Vector{Int}([1])
1-element Array{Int64,1}:
 1

julia> whos()
                          Base  37457 KB     Module
                        Compat   4875 KB     Module
                          Core  12378 KB     Module
                          Main  44219 KB     Module
            TerminalExtensions   4823 KB     Module
                             a      8 bytes  1-element Array{Int64,1}
                           ans      8 bytes  1-element Array{Int64,1}

Now, a presumably has some overhead associated with it. (That is, vectors have O(1) length(), so perhaps there’s some sort of counter allocated somewhere? Are there other allocations?) However, whos() doesn’t seem to show the true cost of vector allocation.

Is there a way to find out exactly how much memory an object is taking? This may not seem generally important, but in LightGraphs we allocate empty vectors for each node, and these take up significant amounts of space even though they hold zero elements. It would be nice to be able to provide sizing guidelines to our users (“Each node takes X bytes; each edge adds Y”).

This isn’t an answer, but perhaps something useful. Since Vector{Int} is a heap-allocated (boxed) type, the elements of a Vector{Vector{Int}} can be left in an #undef state to save memory.

For instance

const adjlists = Vector{Vector{Int}}(100)   # 100 #undef slots: null pointers, no inner vectors yet
function getadjlist(n)
    if isassigned(adjlists, n)
        adjlists[n]    # already allocated, return it
    else
        Int[]          # not allocated yet, hand back a fresh empty vector
    end
end

This stores null pointers instead of vectors and only allocates the empty vector when required.
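
A minimal REPL sketch of the idea (the Vector{Vector{Int}}(3) constructor matches the Julia version used in this thread; newer releases spell it Vector{Vector{Int}}(undef, 3)):

julia> lists = Vector{Vector{Int}}(3);    # three #undef slots: null pointers, no inner vectors yet

julia> isassigned(lists, 1)
false

julia> lists[1] = Int[];                  # the empty vector is only allocated on first use

julia> isassigned(lists, 1)
true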

That’s actually a really good idea. Let me think about how we could implement that.

It would still be useful to know exactly how big the structure is.

It would be useful to be able to find out how much total memory is reachable from a set of objects. At this point I’m not entirely sure what whos() is even reporting.

The true total memory cost is too fuzzy to define. It’s nonsensical to try to talk about the exact amount. Most of the time, you can get away with assuming a factor of 2-3x for estimating the ratio of amount of memory reserved to the amount of memory used.

But for Arrays, it’s just a known bug that the summarysize function forgets to add in the cost of the array header when computing the memory usage that whos reports there. It would be great to have that better accounted for, if you would like to make a PR.
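
For reference, whos gets these numbers from Base.summarysize, so you can query a single object directly and see the same undercount (using the a from the top of the thread):

julia> Base.summarysize(a)    # data bytes only; the array header is not included
8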

It’s more than just not including the total amount allocated for the header. (I calculated 40 bytes for the actual C header structure on a 64-bit platform, but I don’t know how much is actually allocated: maybe 48 bytes to keep 16-byte alignment, or 64 bytes, the next power of 2, which is also useful if allocations are cache-line aligned to avoid false sharing.)
It also needs to reflect the total amount allocated for the data field. I don’t think that’s fuzzy at all; you can definitely get the exact cost of an Array.

As a side question: what is stored in the array header?

Among other things, it presumably stores the shape information (the size along each dimension), a pointer to the array type, and a pointer to the array data.

Since we’re in the dev channel — you can see the definition in julia.h. In short, it’s just a pointer to the data, a few flags about how the array was allocated, and some size information. When allocating a new small array, the header and data segment are allocated all at once in one large chunk, with the data pointer just pointing a few words ahead.
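
A quick way to poke at that layout from Julia itself (illustrative only; for a small array the result is roughly the header size plus alignment padding, and the exact value depends on the Julia version and platform):

a = [1, 2, 3]
# distance in bytes from the start of the array object to its inline data
UInt(pointer(a)) - UInt(pointer_from_objref(a))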

definitely get the exact cost of an Array

so…what is it? Do you include the expected value of the fill factor of each page (50% average), the fill factor of growable vectors (75% average), the metadata for page allocation (expected value is typically around 1%), the metadata for malloc (not used by Julia for the array itself due to its inherent slowness and overhead, but used for some metadata structures and adds a negligible %)? How about the system page tables and kernel data structures to manage those? Do you count the cost of allocating all of the code that is running to allocate that array (we’re talking about real exact numbers, not some theoretical computing machine with infinite resources, right?). Even without any kernel, would you include the ECC overhead? Since the array will be copied into L1, L2, and L3, does that mean its size should be multiplied by 3 if it has been recently accessed?

Accounting for the correct cost of the header (40 bytes sounds about right) is sane. Referencing O() as if it had an exact value is not even done in theory courses.

I don’t think it’s really that complicated.
You have the memory taken by the array header itself, plus any padding needed to keep it aligned to whatever alignment the Julia allocator uses for those headers (as I asked before, is that 40, 48, or 64 bytes?).
Then you have the memory pointed to by the data field: how much is allocated and no longer available for other allocations, i.e. including any header or alignment overhead. If the array is currently using 67 bytes but 256 bytes are allocated (for alignment or future growth, including any header), then 256 is what should be added on top of the header size.

When I’ve seen good reports of memory usage, that’s the sort of calculation I’ve seen.

Talking about cache lines, ECC overhead, page tables, size of code etc. doesn’t seem very useful.
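
To make that concrete, here’s a back-of-envelope sketch of the accounting described above; the 64-byte padded header and the 16-byte data rounding are assumptions for illustration, not measured values, and it only covers plain isbits element types:

# Hypothetical estimate: padded header plus data rounded up to a 16-byte multiple.
function approx_footprint(v::Vector)
    header    = 64                               # assumed header-plus-padding size
    used      = length(v) * sizeof(eltype(v))    # bytes the elements themselves occupy
    allocated = 16 * cld(used, 16)               # assumed rounding of the data segment
    header + allocated
end

approx_footprint(Int[])    # empty vector: just the header cost, under these assumptions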

As the originator of this topic, I can tell you what I’d like to be able to do: provide guidance to users on how much memory they can expect their graph structure to take given an expected number of nodes and edges. Ideally, it would be something like:

Undirected Int64 graphs will allocate 1 byte per node and 8 bytes per edge; directed graphs will allocate 2 bytes per node and 8 bytes per edge

or whatever’s correct. Right now we simply have no way of measuring what space an empty million-node graph occupies.

The most pragmatic way to get the “true memory footprint” is to measure it as the object gets created, using the @allocated macro. This asks the GC directly how many bytes it allocated while evaluating a given expression:

julia> @allocated Vector{Int}(1)
96

This includes the headers and alignment. The actual alignment amount varies — there are separate demands for the type tag, the header, and the data segment. It depends upon the element size and the number of elements. Small arrays are stored inline right next to the header and that inline data is 16-byte aligned. Large arrays are stored in a separate blob that is 64-byte aligned. That separate blob has its own GC header.
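
Applied to the graph question above, the same approach gives a rough per-node figure (an illustrative sketch; the exact numbers depend on the Julia version and platform):

# Measure the bytes the GC reports for a million empty adjacency lists,
# then divide by the node count for an approximate per-node cost.
nodes = 10^6
bytes = @allocated [Int[] for _ in 1:nodes]
bytes / nodes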

Doing this retrospectively is much harder — did the vector once live inline alongside the header before getting resized? Then the header’s memory segment is bigger than usual, but that’s not really feasible to figure out.

I imagine the no-longer-needed space after the header would either be returned or still be accounted for in the header; otherwise that memory would end up being leaked. So I think it has to be possible to figure out whether any extra memory is still allocated in that case.

It cannot be independently freed because it was allocated all at once. While the array header has no knowledge of the extra space, it’s not a leak since the GC is still tracking it. I imagine the GC might track how large each memory region is, but doing so wouldn’t be strictly necessary. This was just one example of how the array itself doesn’t know how much memory it takes up.

That’s interesting. I hadn’t delved deep enough into Julia’s allocation scheme to see that limitation (that memory cannot be independently freed). The allocator I had for small allocations (in Caché) used a buddy-block scheme, and did free memory when possible if an array or string was made smaller. For example, if a string was 20 characters long, it would fit into a 32-byte block, but if it was then set to, say, 12 characters, which fits in a 16-byte block, the second half was added to a list of free 16-byte blocks.

That’s something to keep in mind, since it will eat up more memory than you might think if you allocate a vector that is a decent size but still inlined, and then grow it past that threshold. You’d do better to use sizehint! if you expect the size to end up over the threshold (whatever that is).
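
For example, a sketch of that suggestion (10_000 Int64s is comfortably over the 16 KB threshold discussed just below):

v = Int[]
sizehint!(v, 10_000)    # reserve the final capacity once, before the data grows
for i in 1:10_000
    push!(v, i)         # fills the reserved space; no repeated reallocation, no abandoned inline buffer
end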

I just tracked down that threshold, ARRAY_INLINE_NBYTES, in src/options.h, which is set to 2048*sizeof(void*), so 16 KB on 64-bit platforms, and from the comment:

// how much space we’re willing to waste if an array outgrows its
// original object

it seems that memory will not be reclaimed (until the array itself is).

Good information to have. Thanks for pointing that out!