Call Intel MKL as external library

Tecnezio · February 14, 2017, 1:30pm

I wrote a script to test a call to Intel MKL a external library but it doesn’t work.
The concept: Solve a linear system using MKL “dgetrs” (Documentation Library)

# Test MKL solve linear systems

N = 10;
A = rand(N,N);
AA = copy(A);
b = rand(N,1);
c = copy(b);

function juliaSol(A,b)
 \(A,b);
end
sol = sol = juliaSol(A,b);  #Solution of the linear system

# Test MKL
const global librt = Libdl.find_library(["libmkl_rt"], ["/opt/intel/mkl/lib"])

# Open librt
Libdl.dlopen(librt)

# 101  = LAPACK_ROW_MAJOR
# 102 = LAPACK_COL_MAJOR
function linSOLVE(A::StridedMatrix{Float64}, b::StridedMatrix{Float64})
  m, n = size(A)
  lda  = max(1,stride(A, 2))
  ipiv = similar(A, Int64, min(m,n))

  #lapack_int LAPACKE_dgetrs (int matrix_layout , char trans , lapack_int n ,
  # lapack_int nrhs , const double * a , lapack_int lda ,
  # const lapack_int * ipiv , double * b , lapack_int ldb );
  dd = ccall(("LAPACKE_dgetrs", librt),
              Cint, # Return type
                (Int64, Cuchar, Int64,
                Int64,  Ptr{Float64}, Int64,
                Ptr{Int64},  Ptr{Float64}, Int64),
                102, 'N', m,
                1, A, lda,
                ipiv, b, m
                )
  dd, b, ipiv # dd = 0 if calculation is done, b = output
end

dd, sol2, piv = linSOLVE(AA,c)

The script run with an error: Intel MKL ERROR: Parameter 6 was incorrect on entry to DLASWP
and the solution of the linear system is completely wrong.

An example of ccal of a BLAS can be seen here: https://github.com/JuliaLang/julia/blob/master/base/linalg/lapack.jl#L528

There is any expert (or developer) that can help me to use correctly the MKL ccall ?

tkelman · February 14, 2017, 1:35pm

Do you have MKL set to use 64 bit integers (ILP64)?

Tecnezio · February 14, 2017, 2:04pm

No, i haven’t set anything…

But i loaded the wrong version of the script… and now it work!
Here the working version:

# Test MKL solve linear systems

N = 10000;
A = rand(N,N);
AA = copy(A);
b = rand(N,1);
c = copy(b);

function juliaSol(A,b)
 \(A,b);
end
@time sol = juliaSol(A,b);

# Test MKL
const global librt = Libdl.find_library(["libmkl_rt"], ["/opt/intel/mkl/lib"])

# Open librt
Libdl.dlopen(librt)
# LU FACORIZATION
function luFactMKL(A::StridedMatrix{Float64})
  m, n = size(A)
  lda  = max(1,stride(A, 2))
  ipiv = similar(A, Int64, min(m,n))

# lapack_int LAPACKE_dgetrf (int matrix_layout , lapack_int m , lapack_int n ,
# double * a , lapack_int lda , lapack_int * ipiv );
  dd = ccall(("LAPACKE_dgetrf",  librt), Cint,
            (Int64, Int64, Int64,
            Ptr{Float64}, Int64, Ptr{Int64}),
            102, m, n,
            A, lda, ipiv
  )
  A, ipiv
end

@time AA,pivot = luFactMKL(AA)

# 101  = LAPACK_ROW_MAJOR
# 102 = LAPACK_COL_MAJOR
function linSOLVE(A::StridedMatrix{Float64}, b::StridedMatrix{Float64}, ipiv::StridedVector{Int64})
  m, n = size(A)
  lda  = max(1,stride(A, 2))
  #lapack_int LAPACKE_dgetrs (int matrix_layout , char trans , lapack_int n ,
  # lapack_int nrhs , const double * a , lapack_int lda ,
  # const lapack_int * ipiv , double * b , lapack_int ldb );
  dd = ccall(("LAPACKE_dgetrs", librt),
              Cint, # Return type
                (Int64, Cuchar, Int64,
                Int64,  Ptr{Float64}, Int64,
                Ptr{Int64},  Ptr{Float64}, Int64),
                102, 'N', m,
                1, A, lda,
                ipiv, b, m
                )
  dd, b, ipiv # dd = 0 if calculation is done, b = output
end

@time dd, sol2, piv = linSOLVE(AA,c, pivot)

Using Intel MKL dgetrf + dgtrs julia is as fast as Matlab !

On my PC: with A = rand(10000,10000);

Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
  System: NT (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, broadwell)

Mkl "dgetrf" Factorization time 
 12.180575 seconds (4.16 k allocations: 261.296 KB)

JULIA LAPACK factorization time 
 38.910369 seconds (10 allocations: 78.469 KB)

LAPACK is very memory efficient… but MKL is 3 times faster (and use 3 times more memory)

antoine-levitt · February 14, 2017, 3:03pm

Interesting. Is this with threading on in both OpenBLAS and MKL?

Also, is it possible to use this mechanism to replace dynamically all LAPACK calls from OpenBLAS to MKL?

Tecnezio · February 14, 2017, 3:09pm

I’ve setted anything on both OpenBLAS or MKL. OpenBLAS is as well as it is after a standard installation of Julia, MKL is as well as it is after the MKL intallation on the OS.

OpenBLAS:
ccall((:openblas_get_num_threads64_, Base.libblas_name), Cint, ())
4 # Thread

MKL: MKL automatically tune the number of threads

If you want use MKL for every BLAS function call, the more convenient way is build julia with MKL. In this way every BLAS function will be handled by MKL.

photor · July 17, 2020, 1:30pm

I think calling MKL as external library is safer and more flexible than rebuilding Julia with MKL. I tried to rebuild Julia with MKL once, but it had serious bugs and broke my whole Julia. Does anybody have experience about calling MKL as external library in Julia (>=1.3)?

Elrod · July 17, 2020, 1:52pm

MKL_jll makes it very easy to install and for a project to depend on it.

I’ve spent too much time mucking around with random libraries like ARPACK not working when you build Julia with MKL, so I also recommend sticking with OpenBLAS as the default.

Topic		Replies	Views
Calling Intel MKL as an external library on Mac General Usage ccall , mkl	2	1242	January 24, 2020
Problem when loading Intel MKL shared library to use with C written shared library General Usage ccall , mkl	9	3026	October 12, 2021
Segmentation fault when using ccall to call a BLAS-like extension function from Intel MKL General Usage ccall , mkl , segfault	7	114	December 3, 2024
Ccall performance Performance ccall , mkl	16	1066	October 24, 2020
Intel Mkl Windows building Internals & Design windows , blas , mkl , build	33	5931	October 7, 2019

Call Intel MKL as external library

Related topics