Hello. It so happens that I need to normalize sparse (real) matrices from time to time. Moving to v0.7 (and later), I have come across some surprising behaviour involving Diagonal (sparse) matrices.
This example illustrates the point (of course, in real life, I do not multiply small, sparse identity matrices, but the performance drop is the same):
using SparseArrays
using LinearAlgebra
using Profile
function diag_times_sparse(nrows, pswitch)
spmat = sparse(1.0*I,nrows, nrows)
diagmat = Diagonal(diag(spmat))
if pswitch > 1
@profile resmat = diagmat*spmat*diagmat
Profile.print(C=true, sortedby=:count)
else
@time resmat = diagmat*spmat*diagmat
end
return resmat
end
function sparse_times_sparse(nrows)
spmat1 = sparse(1.0*I,nrows, nrows)
spmat2 = sparse(1.0*I,nrows, nrows)
@time resmat = spmat2*spmat1*spmat2
return resmat
end
diag_times_sparse(5,1)
sparse_times_sparse(5)
diag_times_sparse(1000,1)
#diag_times_sparse(1000,2) if one wants profile data
sparse_times_sparse(1000)
On my machine (julia 0.6.4, Linux x86_64, binary distribution), I get as output
0.000016 seconds (8 allocations: 864 bytes)
0.187978 seconds (35.96 k allocations: 1.816 MiB)
0.000032 seconds (8 allocations: 47.844 KiB)
0.000103 seconds (20 allocations: 158.656 KiB)
the first two lines being the calls to get the functions compiled.
With v0.7 (again, plain vanilla binary download), I get:
0.000019 seconds (12 allocations: 1.063 KiB)
0.146651 seconds (175.73 k allocations: 8.672 MiB)
1.675994 seconds (14 allocations: 48.172 KiB)
0.000080 seconds (18 allocations: 158.469 KiB),
and finally, 1.3rc2 (plain vanilla binary distribution):
0.000022 seconds (18 allocations: 1.313 KiB)
0.000598 seconds (16 allocations: 1.438 KiB)
3.462305 seconds (22 allocations: 48.531 KiB)
0.001336 seconds (16 allocations: 122.250 KiB)
In v0.7, the profiler shows that one spends oneās time in the interpreter, even during the second call.
A hand-rolled normalization function works just fine.
Hope this helps to put someone on the right track.
Keep the impressive work going!