On Julia 0.6.3, a very sparse matrix with 1000 rows, 100000 columns, and only 10 stored entries is broadcast-multiplied column-wise by a vector:
sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)
This throws:
OutOfMemoryError()
Stacktrace:
[1] _allocres at .\sparse\higherorderfns.jl:168 [inlined]
[2] _diffshape_broadcast(::Base.#, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:132
[3] broadcast(::Base.#, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at .\sparse\higherorderfns.jl:121
I am surprised by this, since the resulting sparse matrix will also have only 10 non-zeros. Am I missing something?
Did you check with a smaller example that the result is indeed a SparseMatrixCSC and not a Matrix?
Yes:
@time sparse(rand(1:100, 10), rand(1:10000, 10), rand(10), 100, 10000) .* rand(100, 1)
0.004960 seconds (117 allocations: 15.500 MiB, 61.64% gc time)
100×10000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
[29 , 870] = 0.0375947
[98 , 962] = 0.0684546
[21 , 2043] = 0.00327209
[57 , 2677] = 0.280373
[39 , 5550] = 0.12388
[22 , 5820] = 0.132531
[6 , 6948] = 0.0959543
[94 , 7515] = 0.486234
[23 , 7848] = 0.0197313
[64 , 8926] = 0.148429
MaximilianJHuber:
sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)
I can’t reproduce the error:
julia> z = sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1)
1000×100000 SparseMatrixCSC{Float64,Int64} with 10 stored entries:
[981 , 19202] = 0.0309323
[188 , 23036] = 0.619155
[13 , 49876] = 0.0645187
[675 , 53900] = 0.127399
[672 , 58168] = 0.760207
[865 , 59250] = 0.313583
[437 , 60015] = 0.0336611
[454 , 62628] = 0.259509
[650 , 63567] = 0.118465
[91 , 75675] = 0.0690428
julia> whos()
Base Module
Core Module
Main Module
ans 781 KB 1000×100000 SparseMatrixCSC{Float…
z 781 KB 1000×100000 SparseMatrixCSC{Float…
julia> versioninfo()
Julia Version 0.6.3
Commit d55cadc350 (2018-05-28 20:20 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin17.5.0)
CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)
Looks like you’re using Windows, though, so perhaps that’s the difference here.
This is working fine for me on Windows and Julia 0.6.3.
On 0.6.3, I don’t run out of memory, but memory consumption definitely seems way too high:
julia> using BenchmarkTools
julia> @btime sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000) .* rand(1000, 1);
272.095 ms (32 allocations: 1.49 GiB)
The allocations happen during the broadcast call. I’m getting the same behavior on latest Julia nightly.
Edit: it appears that maxnnzC is calculated to be 100000000 in _diffshape_broadcast. This seems like a bit of an overestimate. You should probably open an issue.
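That overestimate also accounts for the 1.49 GiB figure above. A back-of-the-envelope calculation (assuming the result buffers hold 8-byte Float64 values plus 8-byte Int64 row indices per potential entry):

```julia
# maxnnzC is apparently taken as the full product of the output dimensions,
# rather than anything close to the 10 entries the result can actually have:
maxnnzC = 1000 * 100000        # 100000000 potential entries
# value buffer (Float64) and row-index buffer (Int64), 8 bytes each:
bytes = maxnnzC * (8 + 8)      # 1600000000 bytes ≈ 1.49 GiB
```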
Thank you, done!
Maybe you have more RAM than I do… Increasing the number of columns will throw the error at some point, though.
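As a possible workaround until this is fixed, the column-wise scaling can be expressed as multiplication by a sparse diagonal matrix, so no dense-sized buffer is ever allocated. A sketch (not benchmarked here; on 0.6, spdiagm accepts a vector directly, whereas later versions use the spdiagm(0 => v) form):

```julia
A = sparse(rand(1:1000, 10), rand(1:100000, 10), rand(10), 1000, 100000)
v = rand(1000)
# Scale row i of A by v[i]. The sparse-sparse product stays sparse,
# and the result has at most the 10 stored entries of A.
B = spdiagm(v) * A
```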