Segfault using ccall on initialization function

I am trying to wrap the TBLIS library (GitHub: devinamatthews/tblis, a library and framework for performing tensor operations, especially tensor contraction, using efficient native algorithms). I have successfully worked with the tblis_matrix types, but I am getting segmentation faults when trying to initialize tblis_tensor types. Here is a self-contained MRE, except for libtblis.so, which you can download for x86_64 Linux from TBLIS.jl (GitHub: mdav2/TBLIS.jl, a Julia wrapper for the TBLIS tensor contraction library).

using Libdl
using LinearAlgebra
    
global tblis = dlopen("libtblis.so")
    
struct tblis_tensor{T}                                              
    type::Int32
    conj::Int32                                                     
    attr::ComplexF64                                                
    data::Array{T}
    ndim::UInt32
    len::Vector{Int}                                                
    stride::Vector{Int}                                             
end 

function tblis_tensor{T}(D) where T <: AbstractFloat                
    strides = [1]
    lens = collect(size(D)) .+ 1                                    
    for (i,v) in enumerate(lens[1:end-1])                           
        push!(strides,v*strides[i])
    end          
    n::UInt32 = length(size(D))                                     
    M = tblis_tensor{T}(zero(Int32),zero(Int32),                    
                        0.0 + 0.0im,                                
                        D,                                          
                        n,
                        Vector(lens),                               
                        Vector(strides))
    if T == Float32                                                 
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_s)       
    elseif T == Float64                                             
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_d)
    else
        error("Type $T is not supported by TBLIS :(")
    end  
    ccall(tblis_init_tensor,Cvoid,(Ref{tblis_tensor{T}},Cuint,Ref{Int},Ref{T},
                                   Ref{Int}),                       
          Ref(M),n,lens,D,strides)                                  
                                                                    
    return M
end 
    
mytensor = tblis_tensor{Float64}(rand(2,2))
println(mytensor)

I will note that the segfault does not actually occur in the ccall itself, but at the next GC step or if you try to print the tblis_tensor object.

It’s not easy for me to search the C code right now, but this declaration is almost certainly wrong unless all these fields are just what you use in Julia code and are not part of the C struct. The C code has no idea how to operate on Julia Vector objects.

So Vector → Array?

Here’s the C struct from tblis/src/util/basic_types.h:

typedef struct tblis_tensor
{
    type_t type;
    int conj;
    tblis_scalar scalar;
    void* data;
    unsigned ndim;
    len_type* len;
    stride_type* stride;
[bunch of methods]
}

No, those fields are pointers, and you need to keep them as pointers (Ptr) when they are passed to C.
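
To make the difference concrete, here is a minimal sketch. The two struct names are made up for illustration and neither is the real tblis_tensor; the point is only that a struct with Array fields stores Julia object references, while one with Ptr fields is a plain-bits type whose layout can match a C struct.

```julia
# Hypothetical structs for illustration; neither is the real tblis_tensor.
struct WithArrays
    ndim::UInt32
    len::Vector{Int}   # a Julia object reference, not a raw len_type*
end

struct WithPointers
    ndim::UInt32
    len::Ptr{Int}      # a plain machine address, like a C pointer field
end

# Only the Ptr version is a plain-bits type.
println(isbitstype(WithArrays))
println(isbitstype(WithPointers))

lens = [2, 2]
GC.@preserve lens begin
    s = WithPointers(UInt32(length(lens)), pointer(lens))
    # Reading through the raw pointer is only valid while `lens` is preserved.
    println(unsafe_load(s.len, 2))
end
```

Note that the Ptr field does nothing to keep `lens` alive, which is why the read is wrapped in GC.@preserve.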

EDIT: Where are my manners, thank you!

Here’s my final (working?) code. It seems to work, but I’m wary of using pointer …

using Libdl
using LinearAlgebra
    
global tblis = dlopen("libtblis.so")
    
struct tblis_tensor{T}                                              
    type::Int32
    conj::Int32                                                     
    attr::ComplexF64                                                
    data::Ptr{T}
    ndim::UInt32
    len::Ptr{Int}
    stride::Ptr{Int}
end 

function tblis_tensor{T}(D) where T <: AbstractFloat                
    strides = [1]
    lens = collect(size(D)) .+ 1                                    
    for (i,v) in enumerate(lens[1:end-1])                           
        push!(strides,v*strides[i])
    end          
    n::UInt32 = length(size(D))                                     
    M = tblis_tensor{T}(zero(Int32),zero(Int32),                    
                        0.0 + 0.0im,                                
                        pointer(D),                                 
                        n,
                        pointer(lens),
                        pointer(strides))
    if T == Float32                                                 
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_s)       
    elseif T == Float64                                             
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_d)
    else
        error("Type $T is not supported by TBLIS :(")
    end  
    ccall(tblis_init_tensor,Cvoid,(Ref{tblis_tensor{T}},Cuint,Ref{Int},Ref{T},
                                   Ref{Int}),                       
          Ref(M),n,lens,D,strides)                                  
                                                                    
    return M
end 
    
mytensor = tblis_tensor{Float64}(rand(2,2))
println(mytensor)

You must also keep all the values you take the pointers from valid during the call, either by using appropriately defined cconvert/unsafe_convert methods or by using GC.@preserve. You also need to either add extra fields after the C members or just construct the C struct on the fly, since the object you return to the user must hold references to the actual objects.
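
As a rough sketch of the GC.@preserve part (the helper name sum_via_pointer is hypothetical and has nothing to do with TBLIS), the objects to keep alive are listed before the block, and the raw pointer is only used inside it:

```julia
# Hypothetical helper, not part of TBLIS or the code in this thread.
function sum_via_pointer(v::Vector{Float64})
    total = 0.0
    # `pointer(v)` is just an address; it does not keep `v` alive by itself.
    GC.@preserve v begin
        p = pointer(v)
        for i in eachindex(v)
            total += unsafe_load(p, i)
        end
    end
    return total
end

println(sum_via_pointer([1.0, 2.0, 3.0]))
```

The same pattern applies when the pointer is stored in a struct field: whatever array the pointer came from has to be preserved for as long as the pointer is dereferenced.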

Hmm. I’m not sure I understand. Do you mean something like this? And where do I put the GC.@preserve?

using Libdl
using LinearAlgebra

global protecc = Array{Ref,1}(undef,0)

global tblis = dlopen("libtblis.so")

struct tblis_tensor{T}
    type::Int32
    conj::Int32
    attr::ComplexF64
    _data::Ptr{T}
    ndim::UInt32
    _len::Ptr{Int}
    _stride::Ptr{Int}
    data::Array{T}
    len::Array{Int}
    stride::Array{Int}

end

function tblis_tensor{T}(D) where T <: AbstractFloat
    strides = [1]
    lens = collect(size(D))
    for (i,v) in enumerate(lens[1:end-1])
        push!(strides,v*strides[i])
    end
    n::UInt32 = length(size(D))
    GC.@preserve begin
        M = tblis_tensor{T}(zero(Int32),zero(Int32),
                            0.0 + 0.0im,
                            pointer(D),
                            n,
                            pointer(lens),
                            pointer(strides),
                            D,
                            lens,
                            strides)
        if T == Float32
            tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_s)
        elseif T == Float64
            tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_d)
        else
            error("Type $T is not supported by TBLIS :(")
        end
        ccall(tblis_init_tensor,Cvoid,(Ref{tblis_tensor{T}},Cuint,Ref{Cptrdiff_t},Ptr{T},
                                       Ref{Cptrdiff_t}),
              Ref(M),n,lens,D,strides)
    end

    return M
end

A = tblis_tensor{Float32}(rand(Float32,2,2,2))
B = tblis_tensor{Float32}(rand(Float32,2,2,2))
C = tblis_tensor{Float32}(rand(Float32,2,2,2,2))
println(A)

See the documentation of the macro: you need to actually preserve something. The objects you need to preserve are the arrays, so just preserving those is enough. Preserving M is exactly equivalent in this case and would be shorter to type.

Note that this Ref (the Ref(M) in the ccall) has no effect.
Also note that while the memory behind the pointers you stored in M, i.e. the backing arrays, will be written to, none of the fields of M itself will have changed after the ccall.

Thanks very much for all the help. Do you mean it should look like this? I’ve also created a wrapper struct, TTensor, to maintain a reference to the data in tblis_tensor:

struct tblis_tensor{T}
    type::Int32
    conj::Int32
    attr::ComplexF64
    _data::Ptr{T}
    ndim::UInt32
    _len::Ptr{Int}
    _stride::Ptr{Int}
    #data::Array{T}
    #len::Array{Int}
    #stride::Array{Int}
end
mutable struct TTensor{T}
    tensor::tblis_tensor{T}
    data::Array{T}
    len::Array{Int}
    stride::Array{Int}
end

function TTensor{T}(D) where T <: AbstractFloat
    strides = [1]
    lens = collect(size(D))
    for (i,v) in enumerate(lens[1:end-1])
        push!(strides,v*strides[i])
    end
    #push!(protecc,Ref(lens))
    #push!(protecc,Ref(strides))
    #push!(protecc,Ref(D))
    n::UInt32 = length(size(D))
    GC.@preserve D lens strides begin
        M = tblis_tensor{T}(zero(Int32),zero(Int32),
                            0.0 + 0.0im,
                            pointer(D),
                            n,
                            pointer(lens),
                            pointer(strides))
                            #D,
                            #lens,
                            #strides)
        if T == Float32
            tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_s)
        elseif T == Float64
            tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_d)
        else
            error("Type $T is not supported by TBLIS :(")
        end
        ccall(tblis_init_tensor,Cvoid,(Ref{tblis_tensor{T}},Cuint,Ptr{Int},Ptr{T},
                                       Ptr{Int}),
              M,n,lens,D,strides)
        _M = TTensor{T}(M,D,lens,strides)
    end

    return _M
end

With that construction, I am able to create and modify these structs, thank you. However, I am now receiving a segfault when trying to do basic addition operations with them. Here’s my wrapper for tblis_tensor_add, do you see anything wrong with it? I’ve also attached the C function.

function _add(A::T,B::T,idx_A::String,idx_B::String) where T <: TTensor{T2} where T2 <: AbstractFloat
    tblis_tadd = dlsym(tblis,:tblis_tensor_add)
    ccall(tblis_tadd,Cvoid,(Ptr{Nothing},Ptr{Nothing},
                            Ptr{tblis_tensor{T2}},Cstring,
                            Ptr{tblis_tensor{T2}},Cstring),
          C_NULL,C_NULL,
          A.tensor,idx_A,
          B.tensor,idx_B)
end

void tblis_tensor_add(const tblis_comm* comm, const tblis_config* cfg,
                      const tblis_tensor* A, const label_type* idx_A_,
                            tblis_tensor* B, const label_type* idx_B_);

Again, thank you so much for your help. I think I might be starting to get this, but I do need a little more assistance, haha.

M will not get modified by the ccall. If there’s any mutation that you want to capture, you have to access something that’s NOT an immutable object. With minimal changes to this version of the code, you can do

rM = Ref(M)
ccall(..., (...,), rM, n, ...)
_M = TTensor{T}(rM[], ...)

Note that, as I said, the Ref does NOT change what the ccall does, and M won’t be mutated in either case. However, the Ref(M) is mutated in both cases, and you need to access it after the ccall to see the updated value. (Again, this is only needed if the struct is actually being written to by the C code.)
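
A small self-contained sketch of that behaviour, under the assumption that libc's memset is acceptable as a stand-in for a C function that writes to the struct (the Counts struct and the function name are hypothetical):

```julia
struct Counts          # hypothetical immutable struct, not from TBLIS
    a::Int32
    b::Int32
end

function zero_through_ref()
    m = Counts(1, 2)
    rm = Ref(m)        # mutable box holding a copy of `m`
    # memset (from libc) stands in for a C function that writes to the struct.
    ccall(:memset, Ptr{Cvoid}, (Ref{Counts}, Cint, Csize_t), rm, 0, sizeof(Counts))
    return m, rm[]     # `m` is untouched; the update is only visible via `rm[]`
end

m, updated = zero_through_ref()
println(m)
println(updated)
```

The original `m` keeps its values, and only the contents of the Ref reflect what the C side wrote.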

It can be simplified, though. You don’t really need the Ref, since you already have a perfectly fine mutable structure declared to hold the C structure, i.e. TTensor; all you need is to pass the pointer to the C struct within the TTensor to C. The pointer is compatible in this case, so you just need to pass the TTensor to C. This won’t make much difference here, but the same pattern should be used to avoid any copying when you use this later. By passing the TTensor to C directly, you also get the builtin mechanism for keeping the TTensor, and therefore all its fields, alive, so you don’t need to define your own anymore. In all, it can be simplified to (untested):

function TTensor{T}(D) where T <: AbstractFloat
    strides = [1]
    lens = collect(size(D))
    for (i,v) in enumerate(lens[1:end-1])
        push!(strides,v*strides[i])
    end
    n::UInt32 = length(size(D))
    M = TTensor{T}(tblis_tensor{T}(zero(Int32),zero(Int32),
                                   0.0 + 0.0im,
                                   pointer(D),
                                   n,
                                   pointer(lens),
                                   pointer(strides)), D, lens, strides)
    if T == Float32
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_s)
    elseif T == Float64
        tblis_init_tensor = dlsym(tblis,:tblis_init_tensor_d)
    else
        error("Type $T is not supported by TBLIS :(")
    end
    ccall(tblis_init_tensor, Cvoid, (Ref{TTensor{T}},Cuint,Ptr{Int},Ptr{T}, Ptr{Int}),
          M, n, lens, D, strides)
    return M
end

Note that there’s no GC.@preserve needed because the only place that uses the return values of the pointer is in the ccall. If you access the memory pointed to by, say, M.tensor._data anywhere in this function (including before the ccall), it must be enclosed in GC.@preserve M (or just for the corresponding array of course).
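
To illustrate why the pointer to the mutable wrapper is compatible with a pointer to the C struct, here is a sketch with made-up stand-in types (Inner and Wrapper are hypothetical, and libc's memset again plays the role of the C code). The first field of the mutable struct is stored inline at offset 0, so C code writing through the wrapper's pointer lands in that field:

```julia
struct Inner                 # hypothetical stand-in for tblis_tensor{T}
    a::Int32
    b::Int32
end

mutable struct Wrapper       # hypothetical stand-in for TTensor{T}
    inner::Inner             # first field, stored inline at offset 0
    keepalive::Vector{Float64}
end

w = Wrapper(Inner(1, 2), [1.0, 2.0])
println(fieldoffset(Wrapper, 1))

# memset (from libc) stands in for C code that writes to the leading struct.
# Only sizeof(Inner) bytes are touched, so `keepalive` is left alone.
ccall(:memset, Ptr{Cvoid}, (Ref{Wrapper}, Cint, Csize_t), w, 0, sizeof(Inner))
println(w.inner)
```

Because `w` itself is the ccall argument, it and the array it references stay alive for the duration of the call without an explicit GC.@preserve.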


A few other notes,

Unless there’s good reason you have to use dlsym, you should just do a branch and have two different ccalls. If you have to use dlsym, you should cache the pointer. See PyCall for example of that.

This is gone in the new version but you can just write return TTensor{T}(M,D,lens,strides) (or even omit the return if that’s your style). Returning from a GC.@preserve block is totally safe.
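
For the dlsym note, a sketch of the "branch with two different ccalls" shape. Since libtblis.so is not assumed to be available here, the libc functions abs and llabs stand in for tblis_init_tensor_s and tblis_init_tensor_d, and c_abs is a hypothetical name; only the structure carries over:

```julia
# Stand-ins: libc's `abs` (int) and `llabs` (long long) play the role of the
# `_s` / `_d` variants. The function name `c_abs` is hypothetical.
function c_abs(x::T) where {T<:Integer}
    if T == Int32
        # Symbol is written literally, so no dlsym lookup is needed here.
        return ccall(:abs, Cint, (Cint,), x)
    elseif T == Int64
        return ccall(:llabs, Clonglong, (Clonglong,), x)
    else
        error("Type $T is not supported")
    end
end

println(c_abs(Int32(-3)))
println(c_abs(Int64(-5)))
```

For TBLIS the literal symbols would be given together with the library, in the usual ccall((:name, "lib"), ...) form.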


I don’t think you are actually modifying them (the type, conj, attr and ndim fields for example) so that could be part of it.

I don’t think this will work, in that it should throw an error before you make the ccall. A Ref{tblis_tensor{T2}} should suppress the error but isn’t the right thing to do either. As I said, you want to pass a pointer to the C struct from your TTensor. A.tensor will not do that. The immutable object returned from A.tensor has absolutely nothing to do with the memory of A from that point on. You need a pointer to the field and, again, in this case it’s simply the pointer to the TTensor itself, so you just need:

    ccall(tblis_tadd,Cvoid,(Ptr{Nothing},Ptr{Nothing},
                            Ref{TTensor{T2}},Cstring,
                            Ref{TTensor{T2}},Cstring),
          C_NULL,C_NULL,
          A,idx_A,
          B,idx_B)

Also note that passing A.tensor does not keep either A or the arrays stored in its fields alive. If you are going this route (A is a const pointer, so I assume the C code isn’t mutating it and passing in a “copy” could be fine), you must manually preserve A during the ccall or for whatever period you are using those pointers.

Thank you very much for the complete response. I believe I understand it now. I’ll try out your suggestions tomorrow.