I am trying to multiply every element a_ij of a sparse matrix A by x[i]*x[j], where x is a vector:
a_ij *= x[i]*x[j]
In other words, I want Diagonal(x)*A*Diagonal(x). Computing it that way is simple, but it takes roughly twice as long, because each of the two multiplications iterates over A. So I wrote the following function, which updates the values of the matrix in a single pass.
    using SparseArrays

    function _norm!(A, x)
        n = size(A, 2)
        rows = rowvals(A)
        vals = nonzeros(A)
        @inbounds for i = 1:n                # loop over columns
            @simd for j in nzrange(A, i)     # stored entries of column i
                vals[j] *= x[rows[j]] * x[i]
            end
        end
        nothing
    end
    # testing
    using SparseArrays, SharedArrays
    using Distributed

    n = 10^6
    A = sprand(n, n, 0.0001)
    x = randn(Float64, n)
    _norm!(A, x)          # first call, includes compilation
    @time _norm!(A, x)    # 0.591912 seconds (4 allocations: 160 bytes)
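As a sanity check, the single-pass update can be compared against the two-multiplication form on a small matrix. The sketch below is self-contained, so it repeats the loop under a hypothetical name (`scale_both!`, not part of the code above), and the 50×50 size and 0.1 density are arbitrary choices for illustration.

```julia
using SparseArrays, LinearAlgebra

# Hypothetical stand-in for the single-pass loop, repeated so the snippet runs on its own.
function scale_both!(A::SparseMatrixCSC, x::AbstractVector)
    rows = rowvals(A)      # row index of each stored entry
    vals = nonzeros(A)     # stored values, modified in place
    for col in 1:size(A, 2)
        xc = x[col]
        for k in nzrange(A, col)
            vals[k] *= x[rows[k]] * xc
        end
    end
    return A
end

A = sprand(50, 50, 0.1)
x = randn(50)

expected = Diagonal(x) * A * Diagonal(x)   # two-pass reference result
scale_both!(A, x)                          # single pass, in place

# The two results should agree up to floating-point rounding.
@assert A ≈ expected
```

The comparison uses `≈` rather than `==` because the two forms multiply the factors in a different order, so the results may differ in the last bits.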
I then tried to process the columns in parallel, but the function I wrote is slower than the single-process version.
    function _pnorm!(shared, A, x)
        m, n = size(A)
        rows = rowvals(A)
        @sync @distributed for i = 1:n
            @inbounds @simd for j in nzrange(A, i)
                shared[j] *= x[rows[j]] * x[i]
            end
        end
        nothing
    end
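    # testing
    addprocs(3)                                   # add workers before creating the SharedArray so it is mapped on them
    @everywhere using SparseArrays, SharedArrays
    y = SharedArray(A.nzval)
    _pnorm!(y, A, x)          # first call, includes compilation
    @time _pnorm!(y, A, x)    # 2.424550 seconds (626 allocations: 34.734 KiB)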
This is the first time I have tried the parallel features, so I would be very grateful for your help.