Ccall mkl shared library helmholtz solver : problems with gc() safety

ccall

#1

Hello julia users,

I am trying to ccall the 2D helmholtz solver from the Intel MKL shared library “mkl_rt” (I think it is fortran code) and need some help.
The Intel MKL solver is separated into four functions, which are to be called subsequently:

  1. void d_init_helmholtz_2d --------------> got that one working
    • Input: Ints, Floats and a String
    • Output: two Arrays: ipar, dpar
  2. void d_commit_helmholtz_2d ---------> having problems with gc() afterwards
    • Input: Some Arrays (including “f”) plus the two Arrays ipar and dpar from 1.
    • Output: ipar (altered), dpar (altered), f (altered), xhandle (unknown structure)
  3. void d_helmholtz_2d -------------------> dealing with this, when 2. works
  4. void free_helmholtz_2d ----------------> dealing with this, when 2. works

The commit function has a peculiarity currently not explained in the julia documentation for calling C and fortran code :
It returns an opaque pointer, not as the “main return variable” as described here but as a modifying input like it is described for float valued variables here. So it is like a combination of both.

With my best knowledge, I wrote some julia wrappers for the mentioned calls. They are pasted below and can be found on github aswell. When I paste the contents of “test.jl” in the REPL the “commit” call is processed without an error. But when I issue the garbage collector with gc() afterwards, julia crashes.

Can you tell me, what is wrong?

Thank you

Lars

Julia version:
julia> versioninfo()
Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
System: NT (x86_64-w64-mingw32)
CPU: Intel® Core™ i7-3770 CPU @ 3.40GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.7.1 (ORCJIT, ivybridge)

init_mkl.bat

REM initialize Intel MKL variables
REM otherwise an error arises: cannot find mkl_intel_thread.dll, even if it is in the PATH
REM adding run(`mklvars.bat intel64 lp64`) to the julia script does not work
@echo off
mklvars.bat intel64 lp64
exit 0

test.jl

# testcase for the helmholtz solver for the poisson case (q=0)

nx = 100;
ny = 126;

ax = 0.0;
bx = 1.0;
ay = 0.0;
by = 2.5;

bd_ax  = zeros(ny);
bd_bx  = zeros(ny);
bd_ay  = zeros(nx);
bd_by  = zeros(nx);

q      = 0.0;
BCtype = "DNNN";

# create vector for right hand side
f      = Array{Cdouble}(Int32((nx+1)*(ny+1)));
f[:]   = 0.0;

include("mkl_2d_helmholtz_solver.jl")

# INIT CALL
(ipar, dpar)            = d_init_helmholtz_2d(ax,bx,ay,by,Int32(nx),Int32(ny),BCtype,q);

# no problem with the garbace collector at this point
gc()

# COMMIT CALL
(ipar,dpar,f,xhandle)   = d_commit_helmholtz_2d(f,bd_ax,bd_bx,bd_ay,bd_by,ipar,dpar);

# run the garbace collector, this creates a crash right now
gc()

# SOLVE CALL
# the content of "f" is replaced by the desired solution
# get ipar and xhandle returned, because the memory is freed
# by a separate function in the shared library
(ipar,f,xhandle)   = d_helmholtz_2d(f,bd_ax,bd_bx,bd_ay,bd_by,ipar,dpar);

# FREE CALL
free_helmholtz_2d(xhandle,ipar);

mkl_2d_helmholtz_solver.jl

# MKL uses the integer type "MKL_INT", which can either be Int32 or Int64. Choose the appropriate here
MKL_INT = Int32;

# the commit routine returns an opaque pointer "xhandle" of type "DFTI_DESCRIPTOR_HANDLE".
# create a type definition for the type declaration in the ccall as described here: http://bit.ly/2pge11J
type DFTI_DESCRIPTOR_HANDLE
end

function d_init_helmholtz_2d(ax::Float64, 
                             bx::Float64,
                             ay::Float64,
                             by::Float64,
                             nx::MKL_INT,
                             ny::MKL_INT,
                             BCtype::String,
                             q::Float64)

    # convert from type "String" to a UInt8 Array
    in_BCtype    = Array{UInt8}(4);
    in_BCtype[1] = UInt8(BCtype[1]);
    in_BCtype[2] = UInt8(BCtype[2]);
    in_BCtype[3] = UInt8(BCtype[3]);
    in_BCtype[4] = UInt8(BCtype[4]);

    # stat is a variable, which value is changed by the fortran routine.
    # therefore, initialize it as a reference, as described in the julia docs:
    # http://bit.ly/2os7VpX
    stat         = Ref{MKL_INT}(0);

    # init integer array "ipar" and float array "dpar", which are used by 
    # subsequent routines and therefor returned by this function.
    ipar         = Array{MKL_INT}(128);
    ipar[:]      = 0;
    dpar         = Array{Float64}(Int32(5*nx/2+7));
    dpar[:]      = 0.0;

    # Everything is passed by reference, because otherwise it throws
    # invalid memory access exceptions or returns wrong results. 
    # The shared library seems to be written in fortran and not C.
    ccall((:D_INIT_HELMHOLTZ_2D, "mkl_rt"),  # function and library
        Ptr{Void},      # ReturnType Void --> no return value
        (Ref{Float64},  #ax
        Ref{Float64},   #bx
        Ref{Float64},   #ay      
        Ref{Float64},   #by      
        Ref{MKL_INT},   #nx      
        Ref{MKL_INT},   #ny      
        Ref{UInt8},     #BCtype
        Ref{Float64},   #q       
        Ref{MKL_INT},   #ipar
        Ref{Float64},   #dpar
        Ref{MKL_INT},   #stat
        ),
        ax,
        bx,
        ay,
        by,
        nx,
        ny,
        in_BCtype,
        q,
        ipar,
        dpar,
        stat,
        );

    # error handling of the returned status variable "stat"
    if stat[] != 0
        if stat[] == -99999
            error("INIT HELMHOLTZ 2D ERROR: The routine failed to complete the task because of a fatal error.");
        else
            error("INIT HELMHOLTZ 2D ERROR: Unknown error. stat=$(stat[]).");
        end
    end

    return (ipar, dpar)
end



function d_commit_helmholtz_2d(f::Array{Cdouble},
                               bd_ax::Array{Float64},
                               bd_bx::Array{Float64},
                               bd_ay::Array{Float64},
                               bd_by::Array{Float64},
                               ipar::Array{MKL_INT},
                               dpar::Array{Float64})

    stat       = Ref{MKL_INT}(0);
    xhandle    = DFTI_DESCRIPTOR_HANDLE();

    ccall((:D_COMMIT_HELMHOLTZ_2D, "mkl_rt"),
        Ptr{Void},                      # ReturnType Void --> no return value
        (Ref{Cdouble},                  #f
        Ref{Float64},                   #bd_ax
        Ref{Float64},                   #bd_bx
        Ref{Float64},                   #bd_ay
        Ref{Float64},                   #bd_by
        Ref{DFTI_DESCRIPTOR_HANDLE},    #xhandle    # unknown pointer to a dynamically allocated address
        Ref{MKL_INT},                   #ipar
        Ref{Float64},                   #dpar
        Ref{MKL_INT},                   #stat
        ),
        f,
        bd_ax,
        bd_bx,
        bd_ay,
        bd_by,
        xhandle,
        ipar,
        dpar,
        stat,
        );

    if stat[] != 0
        if stat[] == 1
            warning("COMMIT HELMHOLTZ 2D WARNING: Some warnings exist. See stdout.");
        elseif stat[] == -100
            error("COMMIT HELMHOLTZ 2D ERROR: The routine stopped because an error in the input data was found or the data in the dpar, spar, or ipar array was altered by mistake.");
        elseif stat[] == -1000
            error("COMMIT HELMHOLTZ 2D ERROR: The routine stopped because of the Intel MKL FFT or TT interface error.");
        elseif stat[] == -10000
            error("COMMIT HELMHOLTZ 2D ERROR: The routine stopped because the initialization failed to complete or the parameter ipar[0] was altered by mistake.");
        elseif stat[] == -99999
            error("COMMIT HELMHOLTZ 2D ERROR: The routine failed to complete the task because of a fatal error.");
        else
            error("COMMIT HELMHOLTZ 2D ERROR: Unknown error. stat=$(stat[]).");
        end
    end

    return (ipar,dpar,f,xhandle);
end


function d_helmholtz_2d(f::Array{Cdouble},
                        bd_ax::Array{Float64},
                        bd_bx::Array{Float64},
                        bd_ay::Array{Float64},
                        bd_by::Array{Float64},
                        xhandle::DFTI_DESCRIPTOR_HANDLE,
                        ipar::Array{MKL_INT},
                        dpar::Array{Float64})

    stat = Ref{MKL_INT}(0);

    ccall((:D_HELMHOLTZ_2D, "mkl_rt"),
        Ptr{Void},                      # ReturnType Void --> no return value
        (Ref{Cdouble},                  #f
        Ref{Float64},                   #bd_ax
        Ref{Float64},                   #bd_bx
        Ref{Float64},                   #bd_ay
        Ref{Float64},                   #bd_by
        Ref{DFTI_DESCRIPTOR_HANDLE},    #xhandle
        Ref{MKL_INT},                   #ipar
        Ref{Float64},                   #dpar
        Ref{MKL_INT},                   #stat
        ),
        f,
        bd_ax,
        bd_bx,
        bd_ay,
        bd_by,
        xhandle,
        ipar,
        dpar,
        stat,
        );

    if stat[] != 0
        if stat[] == 1
            warning("SOLVE HELMHOLTZ 2D WARNING: Some warnings exist. See stdout.");
        elseif stat[] == -2
            error("SOLVE HELMHOLTZ 2D ERROR: The routine stopped because division by zero occurred. It usually happens if the data in the dpar or spar array was altered by mistake.");
        elseif stat[] == -3
            error("SOLVE HELMHOLTZ 2D ERROR: The routine stopped because the sufficient memory was unavailable for the computations.");
        elseif stat[] == -100
            error("SOLVE HELMHOLTZ 2D ERROR: The routine stopped because an error in the input data was found or the data in the dpar, spar, or ipar array was altered by mistake.");
        elseif stat[] == -1000
            error("SOLVE HELMHOLTZ 2D ERROR: The routine stopped because of the Intel MKL FFT or TT interface error.");
        elseif stat[] == -10000
            error("SOLVE HELMHOLTZ 2D ERROR: The routine stopped because the initialization failed to complete or the parameter ipar[0] was altered by mistake.");
        elseif stat[] == -99999
            error("SOLVE HELMHOLTZ 2D ERROR: The routine failed to complete the task because of a fatal error.");
        else
            error("SOLVE HELMHOLTZ 2D ERROR: Unknown error. stat=$(stat[]).");
        end
    end

    return (ipar,f,xhandle);
end

function free_helmholtz_2d(xhandle::DFTI_DESCRIPTOR_HANDLE,ipar::Array{MKL_INT})

    stat = Ref{MKL_INT}(0);

    ccall((:FREE_HELMHOLTZ_2D, "mkl_rt"),
        Ptr{Void},                      # ReturnType Void --> no return value
        (Ref{DFTI_DESCRIPTOR_HANDLE},   #xhandle
        Ref{MKL_INT},                   #ipar
        Ref{MKL_INT},                   #stat
        ),
        xhandle,
        ipar,
        stat,
        );

    if stat[] != 0
        if stat[] == -1000
            error("FREE HELMHOLTZ 2D ERROR: The routine stopped because of an Intel MKL FFT or TT interface error.");
        elseif stat[] == -99999
            error("FREE HELMHOLTZ 2D ERROR: The routine failed to complete the task because of a fatal error.");
        else
            error("FREE HELMHOLTZ 2D ERROR: Unknown error. stat=$(stat[]).");
        end
    end
end

#2

I think the handle is getting lost, but you’re probably close. Treat xhandle as a referend (like stat) in the “commit” function. That is,

rxh = Ref{DFTI_DESCRIPTOR_HANDLE}(0)
ccall( (:D_COMMIT_...),
  ...
  rxh,
  ...
)
...
xhandle = rxh[]

and you may also need to change to something like

const DFTI_DESCRIPTOR_HANDLE = Int64

(based on the typical C interface for MKL routines). Also, check the size of dpar - the docs for the version of MKL I have call for 13 nx / 2 + 7 elements.


#3

Thank you very much!

Your solution solves the mentioned issue. The wrappers are not crashing anymore.

The solutions I get now are incorrect, because I got mixed up with the 1- and 0-based indexing. Once everything is validated, I will update the files on github and paste them here for reference.

Cheers!