Size of Multi-dimensional SharedArray

bug

#1

Is this a bug in Base.summarysize, or is there something very inefficient about SharedArray? A 1D SharedArray has only a small overhead, but a 2D SharedArray appears to use twice the memory:

julia> Base.summarysize(rand(1000000))
8000000

julia> Base.summarysize(SharedArray{Float64}(rand(1000000)))
8000213

julia> Base.summarysize(rand(1000000,2))
16000000

julia> Base.summarysize(SharedArray{Float64}(rand(1000000,2)))
32000221


#2

I’d take anything that summarysize says with a large grain of salt.
It doesn’t count the header (40 bytes on a 64-bit platform), nor the amount actually allocated; it reports just the sizeof of the vector (what is used).
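The gap between sizeof (element bytes only) and what summarysize reports can be checked directly; a small sketch, keeping in mind that header accounting varies by Julia version and platform:

```julia
v = zeros(Float64, 1000)

sizeof(v)             # 8000: element storage only, no header or bookkeeping
Base.summarysize(v)   # at least the element bytes; whether headers are
                      # included depends on the Julia version
```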


#3

Thanks for the insight. Do you know of a good way to accurately measure how much memory it is using? I have a use case where I could load up to 100 GiB into shared memory, and it would be a bummer if it failed miserably for my lack of understanding :slight_smile:


#4

I am getting the following on v0.6.2:

julia> Base.summarysize(rand(1000000))
8000000

julia> Base.summarysize(SharedArray{Float64}(rand(1000000)))
8000341

julia> Base.summarysize(rand(1000000,2))
16000000

julia> Base.summarysize(SharedArray{Float64}(rand(1000000,2)))
16000349

There is also Base.shmem_rand btw.


#5

Interesting, what OS/platform are you using? I am also on v0.6.2:

$ julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.2 (2017-12-13 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin14.5.0

julia> Base.summarysize(SharedArray{Float64}(rand(1000000,2)))
32000221

#6

Good old Windows.

julia> versioninfo()
Julia Version 0.6.2
Commit d386e40c17* (2017-12-13 18:08 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

#7

On Linux with v0.6.1 I am getting the same result as you, but on nightly I am getting the right result.

On Windows with latest nightly (downloaded today), I am getting:

julia> using SharedArrays

julia> Base.summarysize(SharedArray{Float64}(rand(1000_000,2)))
32000205

This really seems like a bug.


#8

I think it’s just double counting, because of how summarysize behaves: it recursively counts the memory used by all reachable objects. A SharedArray has a field loc_subarr_1d and another field s. The latter holds the whole array and the former holds a 1D view of it. Mutating one mutates the other, so the same buffer is counted twice. It’s probably doing something like:

mysizeof(f) = sum((sizeof(f), (mysizeof(getfield(f,i)) for i in 1:nfields(f))...))

But it’s weird that it doesn’t double count in the 1D case :confused:. Anyway, I think this deserves some attention from the devs.
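The aliasing described above is easy to see without SharedArray at all; a sketch with an ordinary view, showing why a naive recursive field walk would count the same buffer twice:

```julia
a = zeros(Float64, 5)
v = view(a, 1:5)   # a SubArray that shares memory with its parent `a`

v[1] = 99.0        # write through the view...
a[1] == 99.0       # ...is visible through the parent: same underlying buffer
```

A size routine that adds up the parent array and then separately adds up the view's reachable memory would report the buffer's bytes twice, which matches the roughly doubled summarysize seen for the 2D SharedArray.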

Maybe adding a bug tag or something will get attention more quickly. In the meantime, the equivalent of Task Manager on Windows or System Monitor on Linux will give you a rough idea of the memory used, for large enough data.
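For a process-level check from inside Julia, there is also Sys.maxrss(), which returns the peak resident set size in bytes. A rough sketch, assuming a recent Julia where SharedArrays is a stdlib; note that shared, mmap-backed memory only shows up in RSS once the pages are touched, and GC activity makes the numbers approximate:

```julia
using SharedArrays  # stdlib on Julia 1.x

before = Sys.maxrss()                 # peak RSS so far, in bytes
S = SharedArray{Float64}(10_000_000)
fill!(S, 1.0)                         # touch the pages so they are actually mapped
after = Sys.maxrss()

# `after - before` gives a rough upper bound on the allocation's footprint
```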


#9

If you haven’t already, you should consider opening an issue so that it does not get lost.


#10

Done https://github.com/JuliaLang/julia/issues/25367.