Why is sparse * dense slower than dense * dense?
x = sprand(1000,1000,.1); fx = full(x); y = rand(1000,1000); @time x*y; 0.925036 seconds (31.94 k allocations: 9.071 MB) @time x*y; 0.885278 seconds (7 allocations: 7.630 MB) @time fx*y; 0.548416 seconds (398.70 k allocations: 21.551 MB, 11.83% gc time) @time fx*y; 0.086072 seconds (7 allocations: 7.630 MB, 59.55% gc time)
Reading #1831, I also tried a different sparse type
xa = Base.SparseArrays.CHOLMOD.Sparse(x); @time xa*y; 0.147444 seconds (51.25 k allocations: 17.378 MB) @time xa*y; 0.097434 seconds (5.47 k allocations: 15.518 MB, 5.12% gc time)
which is still not faster than dense * dense above (only for first call) but has vastly more allocations.
So, why is sparse * dense slower than dense * dense? Although I would also like to see an explanation, my primary interest is a convient way to exploit O(N^2) scaling of sparse multiplications over O(N^3) scaling for dense matrix multiplications.
To me this seems to be a highly important issue for many applications.