Bug in `sizeof` for nested arrays

Just checking before I raise an issue in the Julia repo: this is a bug, right?

sizeof([[rand(Float64) for j in 1:10] for i in 1:10])
# 80

sizeof([[rand(Float32) for j in 1:10] for i in 1:10])
# 80

sizeof(rand(Float64, (10, 10)))
# 800

sizeof(rand(Float32, (10, 10)))
# 400
$ julia --version
julia version 1.4.1

No, it contains 10 pointers (to the inner vectors) of 8 bytes each = 80 bytes.
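
A quick way to check this (assuming a 64-bit machine, where a pointer is 8 bytes):

```julia
# sizeof on a Vector{Vector{T}} only counts the outer vector's pointer
# slots, so the inner element type never changes the result.
v64 = [[rand(Float64) for j in 1:10] for i in 1:10]
v32 = [[rand(Float32) for j in 1:10] for i in 1:10]

@assert sizeof(v64) == 10 * sizeof(Ptr{Cvoid})  # 10 pointers * 8 bytes = 80
@assert sizeof(v32) == sizeof(v64)              # element type is irrelevant
```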


That’s very interesting and certainly unexpected, but when I think about it, it makes sense. Thank you.

What you might be looking for is Base.summarysize.

julia> sizeof([[rand(Float32) for j in 1:10] for i in 1:10])
80

julia> Base.summarysize([[rand(Float32) for j in 1:10] for i in 1:10])
920

Now I’m confused.

help?> Base.summarysize
  Base.summarysize(obj; exclude=Union{...}, chargeall=Union{...}) -> Int

  Compute the amount of memory, in bytes, used by all unique objects reachable
  from the argument.

Not sure what is stored internally. I think it might be calculating the size of the data (400 bytes) + the 80 bytes of pointers to the inner vectors + maybe some other housekeeping item? I don’t know.

According to Base.summarysize, using nested arrays more than doubles the size of the object, and even the flat 10 × 10 Float32 array carries 40 bytes of “gubbins”?

sizeof(rand(Float32, (10, 10)))
# 400

Base.summarysize(rand(Float32, (10, 10)))
# 440

Base.summarysize([[rand(Float32) for j in 1:10] for i in 1:10])
# 920

Is that right? If so it’s quite serious.

Why? It’s just 5 Ints, for example a pointer to the data, the number of dimensions, the length of each dimension…
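
If that per-array header really is 40 bytes (an observed implementation detail of the Julia 1.4 build shown here, not a documented guarantee), both summarysize results above can be reproduced by hand:

```julia
# Back-of-envelope accounting, assuming a 40-byte header per Array
# on 64-bit Julia 1.4. Other versions may use a different header size.
header = 40
flat   = header + 100 * sizeof(Float32)    # 40 + 400 = 440
inner  = header + 10 * sizeof(Float32)     # 40 +  40 =  80 per inner vector
outer  = header + 10 * sizeof(Ptr{Cvoid})  # 40 +  80 = 120 for the outer vector

@assert flat == 440                # matches summarysize(rand(Float32, (10, 10)))
@assert 10 * inner + outer == 920  # matches summarysize of the nested array
```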


True, and as the array size grows, the 40 bytes become insignificant. I’ll just be aware that creating nested arrays is much less memory efficient; it’s not something I previously gave much thought to.

The bigger problem with nested arrays when it comes to performance is how the data is laid out in memory. With a Matrix all the data is stored contiguously, while for nested arrays you have pointers to columns (rows) that can be at completely different places in memory, and accessing these can be quite penalising (see https://biojulia.net/post/hardware/#cachemisses).
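
One way to see the layout difference directly (the pointer arithmetic assumes a 64-bit machine; the inner-vector addresses are whatever the allocator happened to return):

```julia
# A Matrix is one contiguous block: column j starts exactly one column's
# worth of bytes after column j-1. Each inner vector of a nested array
# is a separate allocation with no guaranteed address relationship.
m = rand(Float32, 10, 10)
n = [rand(Float32, 10) for _ in 1:10]

# Element 11 (linear index) is the start of column 2, 40 bytes in.
@assert pointer(m, 11) - pointer(m, 1) == 10 * sizeof(Float32)

# The inner vectors' buffers can be anywhere on the heap.
println(pointer(n[1]), " vs ", pointer(n[2]))
```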


I’m not sure if this will help or hurt…

import Base.summarysize

mutable struct Bar
    a::Int64
    b::Int64
    c::Int64
    Bar() = new(0, 0, 0)
end
bar = Bar()
print("sizeof(bar)       = $(sizeof(bar))\n")
print("summarysize(bar)  = $(summarysize(bar))\n")

struct Foo
    a::Int64
    b::Union{Bar, Nothing}
    Foo() = new(0, nothing)
    Foo(a, b) = new(a, b)
end

foo = Foo()
print("sizeof(foo)       = $(sizeof(foo))\n")
print("summarysize(foo)  = $(summarysize(foo))\n")

foo2 = Foo(0, Bar())
print("sizeof(foo2)      = $(sizeof(foo2))\n")
print("summarysize(foo2) = $(summarysize(foo2))\n")

Produces the following output:

sizeof(bar)       = 24
summarysize(bar)  = 24
sizeof(foo)       = 16
summarysize(foo)  = 16
sizeof(foo2)      = 16
summarysize(foo2) = 40

Bar is three 8-byte values, 24 bytes in total; both sizeof and summarysize report the same amount.

Foo can have a child Bar or it can hold nothing; foo is one without a Bar, and foo2 is one with a Bar. You will notice that sizeof(foo) == sizeof(foo2): the structure needs 16 bytes of allocated space whether the Bar is there or not. When Foo HAS a Bar, summarysize reports 40 bytes, which is 16 bytes for Foo plus 24 bytes for Bar. summarysize is recursive, looking at any child objects and adding their bytes to the total.

Does that make a little more sense? The sizeof of an Array of Arrays is the amount of memory needed to reference all the child arrays; summarysize also includes the sizes of the referenced arrays themselves.


I’m hoping that using nested arrays will be better for parallelization. Aren’t nested arrays better for access by multiple processes? Can multiple threads simultaneously read data from a contiguous array?

Sure.

If the threads write back to the array it is good to keep https://en.wikipedia.org/wiki/False_sharing in mind though.

All I can see is that false sharing means attempts to write to memory that is being read will cause problems, but what does it mean for concurrent reads? As far as I know, when data is read from an array, the pointer has to be moved to that point; if multiple threads are attempting to read from the same chunk of memory, the pointer won’t be shared… doesn’t each thread have to wait until the one holding the pointer has finished? I’m only trying to read data, not write to the array. That’s why I’m interested in nested arrays.

The array is a blob of memory and that memory is shared between the threads (i.e. all the threads are able to access the same memory). I’m not sure what you mean by “the pointer”. Each thread can be running code that reads and/or writes whatever memory it has access to. It’s up to you to make sure that they’re not reading and writing the exact same memory addresses concurrently, but if the idea is to do some processing in parallel then generally you’ll have each thread working with a different part of the array (a different range of addresses).

False sharing is when multiple threads write to memory locations that are near each other, and due to how CPU caches work the cores need to “refresh” their view of that data, which takes some time. But if you are not writing, there is no risk of false sharing.
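
A hypothetical sketch of the write pattern this warns about (both functions and the `pad` constant are made up for illustration; indexing the accumulator by threadid assumes the static iteration schedule that @threads uses here, and the 64-byte cache-line size is a common but not universal assumption):

```julia
using Base.Threads: @threads, nthreads, threadid

# Reads of `data` are harmless, but each thread writing to its own slot
# of a small shared vector puts the slots on the same cache line, so the
# cores invalidate each other's caches on every write: false sharing.
function sum_per_thread(data)
    acc = zeros(nthreads())            # adjacent 8-byte slots
    @threads for i in eachindex(data)
        acc[threadid()] += data[i]
    end
    return sum(acc)
end

# Padding each slot out to a full 64-byte cache line removes the contention.
function sum_per_thread_padded(data)
    pad = 8                            # 8 Float64s = 64 bytes
    acc = zeros(pad * nthreads())
    @threads for i in eachindex(data)
        acc[pad * (threadid() - 1) + 1] += data[i]
    end
    return sum(acc)
end
```

Both return the same result; only the cache behaviour under many threads differs.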


I’ve read that arrays are a pointer + a memory block that the pointer can traverse and read/write at each point in that block (I’ve also written some code in languages like D and C++ that confirms this, though I’m no expert at pointer programming). I’ve always assumed that when you access an array you are moving that pointer to a specific point in the array and reading or writing. Are you saying that if I had a 12-core CPU (for the sake of argument), all the cores could simultaneously read from the same memory block in the array at the same time?

I guess I’m also asking whether multiple reads slow down the process or not? It’s clear that there’s nothing “bad” about it, but does performance suffer from it?

I guess read-only access doesn’t incur any cost on the same array; in fact it looks as if there is a performance penalty for nested arrays, as stated before:

using BenchmarkTools: @btime
function fun(x, y)
  T = typeof(x)
  if (x == y) && (x == 0)
    return zero(T)  # return a T, not an Int, to keep the function type stable
  end
  return T(-20*exp(-0.2*sqrt(0.5*(x^2 + y^2))) - exp(0.5*(cos(pi*2*x))) + exp(1) + 20)
end


import Base.Threads: @threads
function pFun1(z)
  n, p = size(z)
  @threads for j in 1:p
    for i in 1:(n - 1)
      fun(z[i, j], z[i + 1, j])
    end
  end
end

function pFun2(z)
  p = length(z); n = length(z[1])
  @threads for i in 1:p
    for j in 1:(n - 1)
      fun(z[i][j], z[i][j + 1])
    end
  end
end

n = 1_000_000; p = 48
mData = rand(Float32, (n, p));
pData = [rand(Float32, n) for i in 1:p];

# Warmup
pFun1(mData);
pFun2(pData);

mData = rand(Float32, (n, p));
pData = [rand(Float32, n) for i in 1:p];

@btime pFun1(mData)
@btime pFun2(pData)
#julia> @btime pFun1(mData)
#  232.162 ms (64 allocations: 8.63 KiB)
#julia> @btime pFun2(pData)
#  251.224 ms (64 allocations: 8.63 KiB)