Hi,
I am doing a matrix-matrix multiplication A (12870×11440) * B (11440×11440) using the mkl_dgemm library. In C, the whole calculation takes about 20 sec, but if I call mkl_dgemm through Julia's ccall it takes 42.274859 sec. The built-in mul!(c,a,b) does the same calculation in 21.586031 seconds. I can't understand where these extra 20 seconds come from or how to get rid of them. I am attaching the relevant part of my code.
julia> @time mul!(c,a,b);
 21.586031 seconds

julia> @time ccall(("dgemm", libmkl_rt), Cvoid,
           (Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ref{BlasInt},
            Ref{BlasInt}, Ref{Float64}, Ptr{Float64}, Ref{BlasInt},
            Ptr{Float64}, Ref{BlasInt}, Ref{Float64}, Ptr{Float64},
            Ref{BlasInt}),
           'N', 'N', 12870, 11440, 11440, 1.0, a, 12870, b, 11440, 0.0, c, 12870)
 42.274859 seconds (3 allocations: 48 bytes)
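
In case it helps anyone reproduce the comparison, here is a minimal sketch of the same kind of timing at a much smaller size. It is not my actual setup. It assumes the BLAS library that Julia has already loaded and goes through the standard BLAS.gemm! wrapper instead of the direct ccall into libmkl_rt. The sizes m, n, k and the names c_mul and c_gemm are made up for illustration. Each routine is called once before timing so that compilation is not counted, and the thread count is printed because the two code paths may not be using the same number of threads.

```julia
using LinearAlgebra

# Hypothetical small sizes so the sketch runs quickly
# (the post above uses m = 12870, n = 11440, k = 11440).
m, n, k = 300, 200, 200
a = rand(m, k)
b = rand(k, n)
c_mul  = zeros(m, n)
c_gemm = zeros(m, n)

# Warm-up calls so that compilation time is not included in the timings.
mul!(c_mul, a, b)
BLAS.gemm!('N', 'N', 1.0, a, b, 0.0, c_gemm)

# Time each path once after the warm-up.
t_mul  = @elapsed mul!(c_mul, a, b)
t_gemm = @elapsed BLAS.gemm!('N', 'N', 1.0, a, b, 0.0, c_gemm)

println("BLAS threads:  ", BLAS.get_num_threads())
println("mul!:          ", t_mul, " s")
println("BLAS.gemm!:    ", t_gemm, " s")
println("results agree: ", c_mul ≈ c_gemm)
```

This only checks the library Julia itself is using, so it says nothing about libmkl_rt directly, but it gives a baseline for what a second, already-compiled call costs.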