Understanding AbstractFloat Arrays

Luis_Alberto_R · February 27, 2025, 8:08am

I’m having a hard time understanding the size (in bytes) of Arrays with AbstractFloat types because of the following:

julia> ar1 = Array{AbstractFloat,3}(rand(50,50,50));

julia> ar2 = Array{Float64,3}(rand(50,50,50));

julia> Base.summarysize(ar1)
2000056

julia> Base.summarysize(ar2)
1000056

Probably this is where I’m wrong, but I had the impression that the supertype AbstractFloat is sort of like Union{BigFloat,Float64,Float32,Float16}. So if you create an array with either of the two, you should be able to fill that array with any type of float. Since BigFloat is 40 bytes, the size of the array should then be, at most, 40*50*50*50 (+ some overhead). In my example, since rand() generates Float64’s, both arrays should be 8*50*50*50 = 1 000 000 (+ overhead). My first guess was that the data was being duplicated to have every Float precision, but that doesn’t make sense numerically (or at all, really), so if anyone could explain this I would really appreciate it.

jakobnissen · February 27, 2025, 8:14am

For the Float64 array ar2, the size is straightforward: 50 * 50 * 50 elements, and a 64-bit float takes 8 bytes, so that’s 1 MB. Then 56 bytes of overhead for the array itself.

For the AbstractFloat array ar1, it’s not possible to determine the size of every element in the array. The user can always define a new type MyType <: AbstractFloat with whatever size the user wants. Therefore, Julia needs to store every float individually on the heap, and then fill the array itself with pointers to those floats on the heap.
So, we have 50 * 50 * 50 = 125,000 floats. Each of these takes up 8 bytes on the heap (plus some overhead which is somehow not counted) for 1 MB total, and then the array itself holds 50 * 50 * 50 8-byte pointers for another 1 MB total. Then the 56-byte overhead for the array.
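The inline-versus-pointer distinction described above can be checked with a minimal sketch. `MyFloat` is a hypothetical user-defined type invented for illustration (it is not from the thread), and the pointer size depends on the platform, so the comparison uses `sizeof(Ptr{Cvoid})` rather than a hard-coded 8.

```julia
# Hypothetical user-defined float type: two Float64 fields, so 16 bytes per value.
struct MyFloat <: AbstractFloat
    hi::Float64
    lo::Float64
end

ar1 = Array{AbstractFloat,3}(rand(50, 50, 50))
ar2 = Array{Float64,3}(rand(50, 50, 50))
n = length(ar2)  # 50 * 50 * 50 elements

# Float64 is a concrete bits type, so ar2 stores its values inline.
@show isbitstype(Float64)
@show sizeof(ar2) == n * sizeof(Float64)

# AbstractFloat is abstract: elements have no fixed size,
# so the array buffer of ar1 holds one pointer per element.
@show isconcretetype(AbstractFloat)
@show sizeof(ar1) == n * sizeof(Ptr{Cvoid})

# Any subtype can be stored in ar1, including a 16-byte one.
ar1[1] = MyFloat(1.0, 0.0)
@show sizeof(MyFloat), typeof(ar1[1])
```

Note that `sizeof(ar1)` reports only the array's own buffer; the heap-allocated elements are only included by `Base.summarysize`.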

Luis_Alberto_R · February 27, 2025, 8:34am

Oh, I see. So there are two arrays, one with the data and one with the pointers to each data element. That’s why the size doubled. If ar1 were filled with, let’s say, Float32 values, then Base.summarysize(ar1) would show 1,500,000 plus the overhead. Thank you for answering.
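The Float32 estimate can be tested with a short sketch. The variable names `ar64` and `ar32` are made up for this example, and the exact totals printed by `Base.summarysize` may vary between Julia versions and platforms, so no specific output is claimed here.

```julia
n = 50^3
ar64 = Array{AbstractFloat,3}(rand(Float64, 50, 50, 50))
ar32 = Array{AbstractFloat,3}(rand(Float32, 50, 50, 50))

# Both arrays hold n pointers; only the sizes of the values they point to differ.
@show sizeof(ar64) == sizeof(ar32)

@show Base.summarysize(ar64)
@show Base.summarysize(ar32)

# Compare the measured difference with the arithmetic: n * (8 - 4) bytes.
@show Base.summarysize(ar64) - Base.summarysize(ar32)
@show n * (sizeof(Float64) - sizeof(Float32))
```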

Benny · February 27, 2025, 8:49am

Is there an old issue somewhere about summarysize not counting the boxing overhead of abstractly typed fields or elements, at minimum specifying the runtime concrete type?

stevengj · February 27, 2025, 12:49pm

No, there is only one array of pointers to 50^3 individually allocated Float64 objects on the heap.

(It is necessary to allocate them individually because they could all be different sizes, and this could change at any time. For example, imagine what would happen if you randomly chose 50% of the elements and replaced them with Float32 values.)
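A small sketch of that scenario, using a short vector instead of the 50×50×50 array (the names `v` and `w` are only for illustration): an AbstractFloat container keeps whatever concrete type is assigned, while a Float64 container converts on assignment.

```julia
v = AbstractFloat[1.0, 2.0, 3.0, 4.0]
v[2] = 1.0f0         # stays a Float32
v[3] = big"2.5"      # stays a BigFloat
v[4] = Float16(0.5)  # stays a Float16
@show typeof.(v)

w = Float64[1.0, 2.0]
w[2] = 1.0f0         # converted to Float64 on assignment
@show typeof(w[2])
```

Because each slot of `v` can change to a value of a different size at any time, the slots themselves have to be fixed-size pointers.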

Luis_Alberto_R · February 27, 2025, 6:49pm

Then that’s why there’s no overhead of two arrays in Base.summarysize(ar1). summarysize is counting the bytes correctly.

stevengj · February 27, 2025, 9:48pm

Arguably it should count the bytes for the type tags (every AbstractFloat allocated on the heap also has a pointer to a type), but this doesn’t seem to be counted.
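As a rough sketch of what that would change: `Base.summarysize` on a single Float64 reports only the 8-byte payload. The estimate below assumes one pointer-sized type tag per heap-allocated element and ignores any allocator rounding or alignment, so it is a back-of-the-envelope figure rather than a measurement.

```julia
n = 50^3
word = sizeof(Ptr{Cvoid})  # pointer size on this platform

# summarysize of one Float64 counts the payload only, not a type tag.
@show Base.summarysize(1.0)

payload  = n * sizeof(Float64)  # the boxed values themselves
pointers = n * word             # the array of pointers
tags     = n * word             # assumption: one word-sized type tag per box

@show payload + pointers         # what summarysize accounts for, minus array overhead
@show payload + pointers + tags  # rough estimate if type tags were counted
```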
