Very sparse matrix but still out-of-memory?


#1

On Julia 0.6.3, I multiply a very sparse 1000×100000 matrix with only 10 stored entries by a column vector, broadcasting across the columns:

sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)

This throws:

OutOfMemoryError()
Stacktrace:
[1] _allocres at .\sparse\higherorderfns.jl:168 [inlined]
[2] _diffshape_broadcast(::Base.#*, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:132
[3] broadcast(::Base.#*, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:121

I am surprised by this, since the resulting sparse matrix will also have at most 10 non-zeros. Am I missing something?


#2

Did you check with a smaller example that the result is indeed a SparseMatrixCSC and not a Matrix?


#3

Yes:

@time sparse(rand(1:100, 10), rand(1:10000, 10), rand(10), 100, 10000) .* rand(100, 1)

0.004960 seconds (117 allocations: 15.500 MiB, 61.64% gc time)

100×10000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
[29 , 870] = 0.0375947
[98 , 962] = 0.0684546
[21 , 2043] = 0.00327209
[57 , 2677] = 0.280373
[39 , 5550] = 0.12388
[22 , 5820] = 0.132531
[6 , 6948] = 0.0959543
[94 , 7515] = 0.486234
[23 , 7848] = 0.0197313
[64 , 8926] = 0.148429


#4

I can’t reproduce the error:

julia> z = sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)
1000×100000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
  [981   ,  19202]  =  0.0309323
  [188   ,  23036]  =  0.619155
  [13    ,  49876]  =  0.0645187
  [675   ,  53900]  =  0.127399
  [672   ,  58168]  =  0.760207
  [865   ,  59250]  =  0.313583
  [437   ,  60015]  =  0.0336611
  [454   ,  62628]  =  0.259509
  [650   ,  63567]  =  0.118465
  [91    ,  75675]  =  0.0690428

julia> whos()
                          Base               Module
                          Core               Module
                          Main               Module
                           ans    781 KB     1000×100000 SparseMatrixCSC{Float…
                             z    781 KB     1000×100000 SparseMatrixCSC{Float…

julia> versioninfo()
Julia Version 0.6.3
Commit d55cadc350 (2018-05-28 20:20 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin17.5.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

Looks like you’re using Windows, though, so perhaps that’s the difference here.


#5

This works fine for me on Windows with Julia 0.6.3.


#6

On 0.6.3, I don’t run out of memory, but memory consumption definitely seems way too high:

julia> using BenchmarkTools

julia> @btime sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1);
  272.095 ms (32 allocations: 1.49 GiB)

The allocations happen during the broadcast call. I’m getting the same behavior on the latest Julia nightly.

Edit: it appears that maxnnzC is calculated to be 100000000 in _diffshape_broadcast. This seems like a bit of an overestimate. You should probably open an issue.
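
A rough check suggests this estimate is consistent with the 1.49 GiB reported by @btime. The sketch below assumes the result is preallocated with room for maxnnzC Float64 values and maxnnzC Int64 row indices, plus the usual column-pointer array; that layout is an assumption based on the SparseMatrixCSC{Float64,Int64} type, not something read out of _allocres.

```julia
# Back-of-the-envelope estimate; the buffer layout is an assumption, not taken from the source.
m, n = 1000, 100_000
maxnnz = m * n                              # 100000000, the value reported for maxnnzC
value_bytes = maxnnz * sizeof(Float64)      # nzval buffer, 8 bytes per entry
index_bytes = maxnnz * sizeof(Int64)        # rowval buffer, 8 bytes per entry
colptr_bytes = (n + 1) * sizeof(Int64)      # one column pointer per column, plus one
total_gib = (value_bytes + index_bytes + colptr_bytes) / 2^30
println(total_gib)                          # roughly 1.49
```

So the allocation is sized as if every entry of the 1000×100000 result could be non-zero, which would also explain why machines with less free RAM hit OutOfMemoryError.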


#7

Thank you, done!

Maybe you have more RAM than me… Increasing the number of columns will at some point throw the error, though.
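
In the meantime, one possible workaround is to scale only the stored entries rather than going through broadcast. This is a minimal sketch, not a fix for the underlying issue: scale_rows is a hypothetical helper name, it expects a plain vector (hence the vec call), and the using line assumes Julia 0.7 or later, whereas on 0.6 sparse and findnz are available from Base without it.

```julia
using SparseArrays  # Julia 0.7+; on 0.6 `sparse` and `findnz` live in Base

# Hypothetical helper: multiply row i of A by v[i], touching only the stored entries.
function scale_rows(A::SparseMatrixCSC, v::AbstractVector)
    size(A, 1) == length(v) || throw(DimensionMismatch("length(v) must equal size(A, 1)"))
    I, J, V = findnz(A)                     # row indices, column indices, stored values
    return sparse(I, J, V .* v[I], size(A, 1), size(A, 2))
end

A = sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000)
v = vec(rand(1000, 1))                      # same data as rand(1000, 1), as a vector
B = scale_rows(A, v)
```

The work and memory here scale with the number of stored entries and columns rather than with rows × columns, so adding more columns should not trigger the large allocation.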