Very sparse matrix but still out-of-memory?

On Julia 0.6.3, a very sparse matrix with 1000 rows, 100000 columns, and only 10 stored entries is multiplied elementwise by a column vector, broadcast across all columns:

sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)

This throws:

OutOfMemoryError()
Stacktrace:
[1] _allocres at .\sparse\higherorderfns.jl:168 [inlined]
[2] _diffshape_broadcast(::Base.#*, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:132
[3] broadcast(::Base.#*, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:121

I am surprised by this, since the resulting sparse matrix will also have only 10 non-zeros. Am I missing something?

Did you check with a smaller example that the result is indeed a SparseMatrixCSC and not a Matrix?

Yes:

@time sparse(rand(1:100, 10), rand(1:10000, 10), rand(10), 100, 10000) .* rand(100, 1)

0.004960 seconds (117 allocations: 15.500 MiB, 61.64% gc time)

100×10000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
[29 , 870] = 0.0375947
[98 , 962] = 0.0684546
[21 , 2043] = 0.00327209
[57 , 2677] = 0.280373
[39 , 5550] = 0.12388
[22 , 5820] = 0.132531
[6 , 6948] = 0.0959543
[94 , 7515] = 0.486234
[23 , 7848] = 0.0197313
[64 , 8926] = 0.148429

I can’t reproduce the error:

julia> z = sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)
1000×100000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
  [981   ,  19202]  =  0.0309323
  [188   ,  23036]  =  0.619155
  [13    ,  49876]  =  0.0645187
  [675   ,  53900]  =  0.127399
  [672   ,  58168]  =  0.760207
  [865   ,  59250]  =  0.313583
  [437   ,  60015]  =  0.0336611
  [454   ,  62628]  =  0.259509
  [650   ,  63567]  =  0.118465
  [91    ,  75675]  =  0.0690428

julia> whos()
                          Base               Module
                          Core               Module
                          Main               Module
                           ans    781 KB     1000×100000 SparseMatrixCSC{Float…
                             z    781 KB     1000×100000 SparseMatrixCSC{Float…

julia> versioninfo()
Julia Version 0.6.3
Commit d55cadc350 (2018-05-28 20:20 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin17.5.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

Looks like you’re using Windows, though, so perhaps that’s the difference here.

This works fine for me on Windows with Julia 0.6.3.

On 0.6.3, I don’t run out of memory, but memory consumption definitely seems way too high:

julia> using BenchmarkTools

julia> @btime sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1);
  272.095 ms (32 allocations: 1.49 GiB)

The allocations happen during the broadcast call. I’m getting the same behavior on the latest Julia nightly.

Edit: It appears that maxnnzC is calculated to be 100000000 in _diffshape_broadcast, which is 1000 × 100000, the number of entries in the full dense result. This seems like a bit of an overestimate. You should probably open an issue.
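
For what it's worth, that number lines up with the 1.49 GiB reported by @btime. A rough back-of-the-envelope check, assuming (I have not verified this against higherorderfns.jl) that the result buffers hold one Float64 value and one Int64 row index per potential entry:

```julia
# Dimensions from the example in the question
m, n = 1000, 100000

# Worst-case number of stored entries: every cell of the result
maxnnz = m * n

# Assumption: one Float64 value plus one Int64 row index per entry
bytes_per_entry = sizeof(Float64) + sizeof(Int64)

total_gib = maxnnz * bytes_per_entry / 2^30
println(total_gib)  # about 1.49
```

The column pointer array adds only about 100001 × 8 bytes, which is also roughly the 781 KB that whos() reports for the 10-entry result.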

Thank you, done!

Maybe you have more RAM than me… Increasing the number of columns will at some point throw the error, though.
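
One possible way to sidestep the large allocation is to skip the broadcast and scale only the stored entries. A minimal sketch, assuming the goal is to multiply row i of the matrix by v[i]; scale_rows is a hypothetical helper name, not something that ships with Julia:

```julia
using SparseArrays  # needed on Julia 0.7 and later; on 0.6.x these functions live in Base, so omit this line

# Hypothetical helper: returns a copy of A with row i scaled by v[i].
# Only the stored entries are visited, so memory use stays proportional to nnz(A).
function scale_rows(A::SparseMatrixCSC, v::AbstractVector)
    size(A, 1) == length(v) || throw(DimensionMismatch("length(v) must equal size(A, 1)"))
    B = copy(A)
    rows = rowvals(B)   # row index of each stored entry
    vals = nonzeros(B)  # value of each stored entry
    for k in 1:nnz(B)
        vals[k] *= v[rows[k]]
    end
    return B
end

# Small example: 3 rows, 10 columns, 3 stored entries
A = sparse([1, 3, 2], [2, 5, 7], [1.0, 2.0, 3.0], 3, 10)
v = [10.0, 20.0, 30.0]
B = scale_rows(A, v)
```

For the matrix in the question this would be called as scale_rows(A, vec(rand(1000, 1))), since the helper expects a vector rather than a 1000×1 matrix.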