I created two SharedMatrix objects as follows:
sm1 = SharedMatrix{Float64}( rand(1000, 1000) )
sm2 = SharedMatrix{Float64}( rand(1000, 1000) )
sm3 = sm1 * sm2    # (1)
However, sm3 is of type Matrix{Float64}, not SharedMatrix{Float64}.
So I tried the following:
sm3 = SharedMatrix{Float64}(sm1*sm2)    # (2)
But this has a performance problem: (2) is much slower than (1).
I am using Julia version 0.6.2.
Can anyone help me? I want a way to get a result of type SharedMatrix that is as fast as (1).
The conversion from SharedArray to Array happens because of the call to BLAS, but I think it should be converted back before returning. However, performance-wise, the following two functions are very similar:
julia> using BenchmarkTools
julia> function f1(a::SharedMatrix{T}, b::SharedMatrix{S}) where {T, S}
           c = a*b
           return SharedMatrix{promote_type(T,S)}(c)
       end
f1 (generic function with 1 method)
julia> function f2(a::SharedMatrix{T}, b::SharedMatrix{S}) where {T, S}
           c = a*b
           return c
       end
f2 (generic function with 1 method)
julia> sm1 = SharedMatrix{Float64}(rand(1000,1000));
julia> sm2 = SharedMatrix{Float64}(rand(1000,1000));
julia> @btime sm3 = f1($sm1, $sm2);
86.727 ms (835 allocations: 7.66 MiB)
julia> @btime sm4 = f2($sm1, $sm2);
85.523 ms (2 allocations: 7.63 MiB)
julia> sm1 = SharedMatrix{Float64}(rand(10000,10000));
julia> sm2 = SharedMatrix{Float64}(rand(10000,10000));
julia> @time sm3 = f1(sm1, sm2);
51.399829 seconds (813 allocations: 762.967 MiB, 0.14% gc time)
julia> @time sm4 = f2(sm1, sm2);
50.057509 seconds (6 allocations: 762.940 MiB, 0.12% gc time)
Edit: when you convert, the array is actually copied, so yes, I agree it might be a performance problem. I hope someone more knowledgeable can give you a better answer.
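One possible way to avoid that copy is to allocate the result as a shared array first and multiply in place. The following is only a minimal sketch, not something benchmarked in this thread. It assumes Julia 0.7 or later, where mul! lives in LinearAlgebra and SharedArrays is a standard library; on 0.6 the equivalent in-place call is A_mul_B!(c, a, b). The helper name shared_mul is hypothetical.

```julia
using SharedArrays, LinearAlgebra   # standard libraries on Julia 0.7 and later

# Hypothetical helper (not from the thread): multiply two shared matrices and
# write the product directly into a newly allocated shared array.
function shared_mul(a::SharedMatrix{T}, b::SharedMatrix{S}) where {T, S}
    R = promote_type(T, S)
    # Uninitialized shared buffer with the shape of the product.
    c = SharedArray{R}((size(a, 1), size(b, 2)))
    # In-place product: c is overwritten with a*b.
    mul!(c, a, b)
    return c
end

sm1 = SharedMatrix{Float64}(rand(200, 200))
sm2 = SharedMatrix{Float64}(rand(200, 200))
sm3 = shared_mul(sm1, sm2)   # result is a SharedMatrix{Float64}
```

Whether this actually matches the speed of (1) is worth checking with @btime on your own machine, since it depends on which multiplication method gets dispatched for the shared arrays.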