Segfault calling C function, any advice?

Hey all,

I’m trying to call the libsais_int function from libsais, with signature:
int32_t libsais_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs);

I wrote the following (and also tried using pointer()), but I cannot seem to get it to work:

const LIBSAIS = "libsais.so.2"
function create_suffix_array(in_vector::Vector{Int32}, free_space::Int)
    out_vector = zeros(Int32, length(in_vector) + free_space)
    n = length(in_vector)
    k = length(Set(in_vector))
    res = ccall((:libsais_int,LIBSAIS), Cint, (Ref{Vector{Int32}}, Ref{Vector{Int32}}, Cint, Cint, Cint), in_vector, out_vector, n, k, free_space) 
    return out_vector
end
println(create_suffix_array(Int32[2,1,1,1,10], 10000)) 

This will: signal (11): Segmentation fault

Any tips :slight_smile:

You probably need to GC.@preserve in_vector. Otherwise it can get freed by the garbage collector.

I tried this before:

function create_suffix_array(in_vector::Vector{Int32}, free_space::Int)
    out_vector = zeros(Int32, length(in_vector) + free_space)
    GC.@preserve in_vector out_vector begin
        n = length(in_vector)
        k = length(Set(in_vector))
        res = ccall((:libsais_int,LIBSAIS), Cint, (Ref{Vector{Int32}}, Ref{Vector{Int32}}, Cint, Cint, Cint), Ref(in_vector), Ref(out_vector), n, k, free_space) 
    end
    return out_vector
end

That also segfaults, or should I put the GC preserve there in a different way?

Try

@ccall LIBSAIS.libsais_int(in_vector::Ptr{Int32}, out_vector::Ptr{Int32}, n::Int32, k::Int32, free_space::Int32)::Int32
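
A small sketch of why the Ptr{Int32} declaration behaves differently from Ref{Vector{Int32}}: as far as I understand it, a Vector{Int32} passed for a Ptr{Int32} argument is converted to a pointer to its element data, and the call keeps the array alive while the C function runs, so no GC.@preserve is needed. To keep the snippet runnable without libsais, it uses libc's memcpy as a stand-in (an assumption on my side), and copy_with_c is a hypothetical helper name.

```julia
# Hypothetical helper: copy `src` into a fresh vector using libc's memcpy.
# memcpy is only a stand-in for libsais_int so this runs without libsais.
function copy_with_c(src::Vector{Int32})
    dst = zeros(Int32, length(src))
    nbytes = sizeof(src)  # size of the element data in bytes
    # Declaring the arguments as Ptr{Int32} passes the address of the element data.
    @ccall memcpy(dst::Ptr{Int32}, src::Ptr{Int32}, nbytes::Csize_t)::Ptr{Cvoid}
    return dst
end

copy_with_c(Int32[2, 1, 1, 1, 10])
```

If the copy comes back equal to the input, the C side saw the actual Int32 data, which is what libsais_int needs for T and SA.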

That seems to work. I do get a segfault with other input data, but maybe that relates to the library (although the docs say it should be possible):

const LIBSAIS = "/home/rickb/tools/suffix/libsais/libsais.so.2"
function create_suffix_array(in_vector::Vector{Int32}, free_space::Int)
    out_vector = zeros(Int32, length(in_vector) + free_space)
    n = length(in_vector)
    k = length(Set(in_vector))
    @ccall LIBSAIS.libsais_int(in_vector::Ptr{Int32}, out_vector::Ptr{Int32}, n::Int32, k::Int32, free_space::Int32)::Int32
    return out_vector
end

println(create_suffix_array(Int32[2,1,1,1,1458521], 10))       # works
println(create_suffix_array(Int32[2,1,1,1,1458521], 1000))   # segfaults

From the libsais docs:

* @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance).

You think it’s related to the Julia call?

Hard to tell. You can try with an equivalent C code and see if that works. When it doesn’t crash, do you get the results you’d expect?


Yeah, the call seems fine now (the first result is what I expect, thanks!), but now the second confuses me haha. Let's see if I can code C haha

I think it should be:

k = maximum(in_vector)+1

The parameter is the alphabet size, which includes missing letters, so it is essentially the maximum value in the input array plus one.
Specifically, if the input is [1,5,2,2], it should be at least 5+1 = 6 and not 3, which is length(Set([1,5,2,2])). The +1 on the maximum is because 0 is an alphabet value too.
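
To make the difference concrete, here is a minimal sketch comparing the two ways of computing k. alphabet_size is a hypothetical helper name, not something from libsais.

```julia
# Hypothetical helper: smallest k that covers every symbol in 0:maximum(v).
alphabet_size(v::Vector{Int32}) = Int(maximum(v)) + 1

v = Int32[1, 5, 2, 2]
length(Set(v))     # 3: counts only the distinct symbols present
alphabet_size(v)   # 6: covers symbols 0 through 5
```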

As the maximum value in the example is rather big, it might be productive to first reduce the maximum alphabet value with a lookup table as follows:

v = Int32[2,1,1,1,1458521]
d = Dict(Iterators.map(reverse,pairs(sort(unique(v)))))
newv = Int32.(get.(Ref(d), v, 0))

Now maximum(newv)+1 should be minimal for the input vector, and newv is ready to be shipped to libsais_int. Since the ordering of the inputs is preserved, the output suffix array is also identical.
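
The same remapping can be wrapped in a function and checked without libsais. This is only a sketch: compress_alphabet and naive_suffix_array are hypothetical helper names, and the naive version just sorts the suffixes directly, so it is only meant for small test inputs.

```julia
# Hypothetical helper: replace each symbol by its rank among the distinct symbols.
function compress_alphabet(v::Vector{Int32})
    symbols = sort(unique(v))
    ranks = Dict(s => Int32(i) for (i, s) in enumerate(symbols))
    return Int32[ranks[x] for x in v]
end

# Naive 0-based suffix array: sort the suffixes lexicographically.
# Only for checking small inputs, not a replacement for libsais.
naive_suffix_array(v) = sortperm([v[i:end] for i in eachindex(v)]) .- 1

v = Int32[2, 1, 1, 1, 1458521]
newv = compress_alphabet(v)                         # Int32[2, 1, 1, 1, 3]
naive_suffix_array(v) == naive_suffix_array(newv)   # true
```

With the compressed input, k = maximum(newv)+1 is 4 instead of 1458522.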
