Convert Vector{String} to Ptr{Ptr{UInt8}}?

I’m aware that Julia strings are Unicode, but suppose I make sure that the strings are ASCII clean. I can figure out how to convert Vector{String} → Vector{Vector{UInt8}}, but not how to get from there to Ptr{Ptr{UInt8}} …

In [1]: a = ["hi", "there"]
In [2]: b = map(x -> Vector{UInt8}(x), a)
In [3]: typeof(b)
Out[3]: Vector{Vector{UInt8}} (alias for Array{Array{UInt8, 1}, 1})

The function that I am trying to call:

frylock@laptop:~/Projects/Julia_C/$ nm -D --defined-only --demangle libstuff.so
0000000000001139 T stuff(char**)

The signature alone is not enough: you have to know whether C expects an array of pointers or a pointer to an array of chars (both of which are char** in C). Either way you will have to create a new array; you cannot cleanly convert a Vector{String} (or a Vector{Vector{UInt8}}) to a char**, because the two representations aren’t a 1:1 mapping (not even at the bit level).
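To make the ambiguity concrete, here is a minimal sketch of two functions that share the char** parameter type but read the memory differently. The names total_length_of_array and length_of_single are hypothetical and only serve as illustration; neither comes from libstuff.so.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reading 1: `strings` points to the first element of an array of `count`
// char* values, each pointing to its own NUL-terminated string.
std::size_t total_length_of_array(char** strings, std::size_t count) {
    assert(strings != nullptr || count == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += std::strlen(strings[i]);
    }
    return total;
}

// Reading 2: `handle` points to exactly one char*, i.e. it is a pointer
// to a single string pointer rather than to a list of them.
std::size_t length_of_single(char** handle) {
    assert(handle != nullptr && *handle != nullptr);
    return std::strlen(*handle);
}
```

Both functions compile against the same argument type, so only the library's documentation or source tells you which layout to build on the Julia side. Note that the first reading also needs the element count (or a terminating null pointer) from somewhere.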

The function with the stuff(char **) signature is supposed to be a wrapper for a library function that expects std::vector<std::string>. Perhaps I should have mentioned that in case I was proposing an X Y problem.

So the larger issue is that I need to get a list of strings from Julia-land to a C++ function that wants std::vector<std::string>.

Probably you want something like:

function stuff(strings::AbstractVector{String})
    GC.@preserve strings begin
        ptrs = Cstring[Base.unsafe_convert(Cstring, s) for s in strings] # array of pointers to char
        @ccall stuff(ptrs::Ptr{Cstring})::Cvoid # or whatever your type signature is
    end
end

This allocates an array of Cstring (equivalent to an array of char*, and ensured to be NUL-terminated for C string compatibility) to pass to your C function (Ptr{Cstring} acts like char**). GC.@preserve is used to ensure that the strings array is not garbage collected (is “rooted”) while the ccall executes.

ASCII cleanliness is usually not a concern. Unicode can still be passed as a char* (e.g. as data for std::string); it is just UTF-8 encoded. To the extent that your application actually cares about the contents of the strings (as opposed to just treating strings as atoms, e.g. filenames), it may of course need to be UTF-8-aware (but not always, e.g. if it is just looking for ASCII substrings).
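As a small illustration of that point, the sketch below stores UTF-8 bytes in a std::string and runs a byte-wise ASCII suffix check on them. The helpers byte_length and has_suffix are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes (not characters) in a NUL-terminated, UTF-8 encoded string.
std::size_t byte_length(const char* s) {
    assert(s != nullptr);
    return std::string(s).size();
}

// True if `name` ends with `suffix`, comparing raw bytes. For an ASCII
// suffix this also works on UTF-8 input, because the bytes of a multi-byte
// UTF-8 sequence are never in the ASCII range.
bool has_suffix(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

For a file name such as "café.txt", the "é" takes two bytes, so the byte length is 9 rather than 8, yet looking for the ".txt" suffix needs no Unicode handling at all.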

To call a C++ function that takes std::vector<std::string>, you usually need some kind of C wrapper, either written manually or à la CxxWrap.jl.
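A hand-written wrapper could look roughly like the sketch below. It assumes the Julia side passes both the pointer array and its length. library_function is a hypothetical stand-in for the real library call (the thread does not show it), and to_string_vector and stuff_wrapper are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the real C++ library function. Here it just
// returns the total number of bytes so that the result is observable.
std::size_t library_function(const std::vector<std::string>& items) {
    std::size_t total = 0;
    for (const std::string& s : items) {
        total += s.size();
    }
    return total;
}

// Copy `count` NUL-terminated C strings into owned std::string objects.
// After this returns, the C++ side no longer depends on the caller's memory.
std::vector<std::string> to_string_vector(const char* const* data,
                                          std::size_t count) {
    assert(data != nullptr || count == 0);
    std::vector<std::string> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.emplace_back(data[i]);
    }
    return out;
}

// C-callable entry point: unmangled name, plain C types in the signature.
extern "C" std::size_t stuff_wrapper(const char* const* data,
                                     std::size_t count) {
    return library_function(to_string_vector(data, count));
}
```

Under these assumptions the Julia call would be along the lines of @ccall "./libstuff.so".stuff_wrapper(ptrs::Ptr{Cstring}, length(ptrs)::Csize_t)::Csize_t, with ptrs built as in the Cstring example above. Because the wrapper copies the strings, the Julia data only has to stay alive for the duration of the call. A production wrapper would also want to catch C++ exceptions before they cross the extern "C" boundary.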


And in case this is useful for someone in the future:

My test C++ code
#include <cstddef>   // size_t
#include <iostream>  // std::cout

typedef struct {
    size_t size;
    char **data;
} _StringList;


#ifdef __cplusplus
extern "C" {
#endif

void stuff(_StringList sl) {
    for (size_t i = 0; i < sl.size; ++i)
        std::cout << sl.data[i] << std::endl;
}

#ifdef __cplusplus
} // extern "C"
#endif

EDIT: applied @stevengj's suggested edit.

My test Julia script
struct StringList
    size::Csize_t
    data::Ptr{Cstring}
end

a = ["hi", "there"]

GC.@preserve a begin
    t1 = Cstring[Base.unsafe_convert(Cstring, s) for s in a]
    GC.@preserve a begin
        t2 = Base.unsafe_convert(Ptr{Cstring}, t1)
        s = StringList(length(a), t2)
        @show @ccall "./libstuff.so".stuff(s::StringList)::Cvoid
    end
end

This is unsafe: the inner GC.@preserve protects a a second time rather than t1, so t1 could get garbage collected before you are done with it. You need something like:

GC.@preserve a begin
    t1 = Cstring[Base.unsafe_convert(Cstring, s) for s in a]
    GC.@preserve t1 begin
        s = StringList(length(a), pointer(t1))
        @show @ccall "./libstuff.so".stuff(s::StringList)::Cvoid
    end
end