I have a simple problem:
julia> using SparseArrays, LinearAlgebra
julia> const DIM = 1000;
julia> const ORDER = 500;
julia> const matrix_reps = [sprand(DIM, DIM, 0.01) for _ in 1:ORDER];  # fixed: comprehension was closed with `;` instead of `]`
julia> const weights = sprandn(ORDER, 0.1);
I want to compute the weighted sum of `matrix_reps`, i.e. $\sum_i \text{weights}[i] \cdot \text{matrix\_reps}[i]$.
This is what I tried so far:
function w_sumFUSED(x::AbstractVector{T}, mreps::Vector{TT}) where {S<:Number, T<:Number, TT<:AbstractArray{S}}
    # Weighted sum of the matrices in `mreps`, accumulated in place with a
    # fused broadcast.  NOTE(review): `res .+= ...` on a sparse matrix rewrites
    # the stored structure of `res` on every iteration, which is the likely
    # reason this is slow despite allocating little.
    res = spzeros(S, size(first(mreps))...)
    # `findall(!iszero, x)` replaces `findn(x)`, which was removed in Julia 1.0;
    # it yields only the indices of nonzero weights, skipping zero terms.
    for i in findall(!iszero, x)
        res .+= S(x[i]) .* mreps[i]
    end
    return res
end
w_sumFUSED (generic function with 1 method)
function w_sumBLAS(x::AbstractVector{T}, mreps::Vector{TT}) where {S<:Number, T<:Number, TT<:AbstractArray{S}}
    # Weighted sum via in-place `axpy!` (res <- a*X + res).
    # `axpy!` comes from LinearAlgebra; `Base.LinAlg.axpy!` was renamed in Julia 1.0.
    # NOTE(review): for sparse arguments `axpy!` falls back to a generic
    # element-wise update of `res`, so low allocation does not imply speed here
    # — TODO confirm against the installed SparseArrays version.
    res = spzeros(S, size(first(mreps))...)
    # `findall(!iszero, x)` replaces the removed `findn(x)`.
    for i in findall(!iszero, x)
        axpy!(S(x[i]), mreps[i], res)
    end
    return res
end
w_sumBLAS (generic function with 1 method)
function w_sumSUM(x::AbstractVector{T}, mreps::Vector{TT}) where {S<:Number, T<:Number, TT<:AbstractArray{S}}
    # Weighted sum via pairwise `sum`: each term allocates a fresh sparse
    # matrix, but sparse-sparse `+` is an efficient merge of stored entries.
    # `findall(!iszero, x)` replaces `findn(x)`, removed in Julia 1.0.
    nz = findall(!iszero, x)
    # Guard: `sum` over an empty generator throws; with no nonzero weights the
    # weighted sum is the zero matrix.
    isempty(nz) && return spzeros(S, size(first(mreps))...)
    return sum(S(x[i]) * mreps[i] for i in nz)
end
w_sumSUM (generic function with 1 method)
timings:
function time_everything(vect, mreps)
    # Warm up each implementation on a small slice first so that the
    # subsequent @time measurement excludes JIT compilation.
    small_v = vect[1:10]
    small_m = mreps[1:10]
    results = Dict{String,Any}()
    for (label, impl) in (("SUM", w_sumSUM), ("BLAS", w_sumBLAS), ("fused", w_sumFUSED))
        impl(small_v, small_m)
        println(label, ":")
        @time results[label] = impl(vect, mreps)
    end
    # All three implementations must agree on the result.
    @assert results["fused"] == results["BLAS"]
    @assert results["fused"] == results["SUM"]
end
time_everything (generic function with 1 method)
julia> time_everything(weights, matrix_reps)
SUM:
0.145636 seconds (688 allocations: 151.844 MiB, 47.66% gc time)
BLAS:
14.278971 seconds (41 allocations: 10.010 MiB)
fused:
15.503572 seconds (41 allocations: 10.010 MiB)
Question(s):
- Why is BLAS so slow, despite so few allocations? (checked with `@code_warntype`)
- Why is FUSED so slow, despite so few allocations? (checked with `@code_warntype`)
- (general) What is the best method for performing arithmetic on sparse matrices?