Passing an array of structures through `ccall`

question

#1

I’m trying to pass an array of structures (actually, a pointer to the chunk of memory containing the first structure) from Julia to C through ccall.

An example of what I’m trying to do is shown below, using just C:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

//Define a test structure
struct st_test {
    int i;
    double d;
    char s[10];
    double a[2];
    int b[2];
};


//Print the contents of an array of structure, and modify their values
//upon return.
int printAndModify(struct st_test* pointer, int size) {
    int i;
    printf("\nin C:\n");

    //Loop through elements in the array
    for (i=0; i<size; i++) {

        //Pointer to current element
        struct st_test* cur = &(pointer[i]);

        //Print structure contents
        printf("index %d/%d, memory address %p\n", i+1, size, cur);
        printf("  i=%d\n  d=%f\n  s=%10s\n  a=%f %f\n  b=%d %d\n",
               cur->i, cur->d, cur->s,
               cur->a[0], cur->a[1], cur->b[0], cur->b[1]);

        //Modify structure contents
        cur->i = 100 + i;
        cur->d += 100;
        strcpy(cur->s, "from C   ");
        cur->a[0] *= 3.;
        cur->a[1] *= 4.;
        cur->b[0] *= 5;  //b holds ints, so use integer factors
        cur->b[1] *= 6;
    }
    return 1;
}


void test_c() {
    //Allocate memory for the array of structures. NOTE: all
    //structures are stored in CONTIGUOUS chunks!
    const int size = 2;
    struct st_test* array = malloc(sizeof(struct st_test) * size);

    //Fill values
    int i;
    for (i=0; i<size; i++) {
        //Pointer to current structure
        struct st_test* cur = &(array[i]);

        cur->i = 1;
        cur->d = 2;
        strcpy(cur->s, "in test_c");  //9 chars + NUL fit in s[10]; "from test_c" would overflow
        cur->a[0] = 3;
        cur->a[1] = 4;
        cur->b[0] = 5;
        cur->b[1] = 6;
    }

    printAndModify(array, size);
    free(array);
}

After compiling the above code as a shared library, I wanted to implement the test_c routine in Julia, as follows:

#Define a julia structure corresponding to the C one
mutable struct st_test
  i::Cint
  d::Cdouble
  s::NTuple{10, Cchar}
  a::NTuple{2, Cdouble}
  b::NTuple{2, Cint}
end

#Simple constructor for the structure 
st_test() = st_test(1, 2., NTuple{10,Cchar}("from Julia"), (3.,4.), (5, 6))

#Create an array of structures
array = [st_test() for i=1:2]

#Size of the array
size = Cint(length(array))

ret = ccall((:printAndModify, "./libarrayIssue.so"), Cint, (Ptr{st_test}, Cint), 
            array, size)

#Print structure contents
@printf("\nin Julia:\n");
for i in 1:size
    cur = array[i]
    @printf("index %d/%d\n", i, size);
    @printf("  i=%d\n  d=%f\n  s=%10s\n  a=%f %f\n  b=%d %d\n",
		   cur.i, cur.d, join(Char.(cur.s), ""),
		   cur.a[1], cur.a[2], cur.b[1], cur.b[2]);
end

However, this Julia code ends up in a segfault since (as far as I understood) the structures passed through ccall are not allocated in contiguous chunks of memory, while in C they were one next to the other.
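To make the layout difference concrete, here is a minimal C sketch. The helper name `pointers_are_contiguous` is made up for illustration and is not part of any library. It checks whether a list of pointers actually refers to consecutive elements of one block, which is what `printAndModify` silently assumes when it indexes `pointer[i]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the st_test structure above. */
struct st_test {
    int i;
    double d;
    char s[10];
    double a[2];
    int b[2];
};

/* Hypothetical helper: returns 1 if ptrs[0..n-1] point at consecutive
   elements of a single contiguous block, i.e. element k starts exactly
   k * sizeof(struct st_test) bytes after element 0. Returns 0 otherwise. */
int pointers_are_contiguous(struct st_test *const *ptrs, size_t n) {
    size_t k;
    assert(ptrs != NULL);
    for (k = 1; k < n; k++) {
        /* Compare as integers so we never form an out-of-bounds pointer. */
        if ((uintptr_t)ptrs[k] !=
            (uintptr_t)ptrs[0] + k * sizeof(struct st_test))
            return 0;
    }
    return 1;
}
```

For pointers taken from a single `malloc`'d array, as in `test_c`, the check passes. For individually allocated structures there is no reason to expect it to, and stepping from one element to the next by `sizeof(struct st_test)` then lands in unrelated memory.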

The only way I found to solve the problem is to create a new version of the printAndModify routine, called printAndModify_new:

int printAndModify_new(struct st_test** pointer, int size) {
    //                   CHANGED HERE ^
    int i;
    printf("\nin C:\n");

    //Loop through elements in the array
    for (i=0; i<size; i++) {

        //Pointer to current element
        struct st_test* cur = pointer[i]; // <-- CHANGED HERE

        //Print structure contents
        printf("index %d/%d, memory address %p\n", i+1, size, cur);
        printf("  i=%d\n  d=%f\n  s=%10s\n  a=%f %f\n  b=%d %d\n",
               cur->i, cur->d, cur->s,
               cur->a[0], cur->a[1], cur->b[0], cur->b[1]);

        //Modify structure contents
        cur->i = 100 + i;
        cur->d += 100;
        strcpy(cur->s, "from C   ");
        cur->a[0] *= 3.;
        cur->a[1] *= 4.;
        cur->b[0] *= 5;  //b holds ints, so use integer factors
        cur->b[1] *= 6;
    }
    return 1;
}

which I call from Julia with:

ret = ccall((:printAndModify_new, "./libarrayIssue.so"), Cint, (Ptr{st_test}, Cint), 
            array, size)

Notice the change in the input parameter: now it is a st_test**.

This is a significantly simplified version of a problem I’m having in calling a C library, and I’m not going to change all the st_test* to st_test** in the code.

Is there any way I can pass an Array{st_test,1} through ccall while retaining the original function definition with st_test*, i.e. by having the structures stored in contiguous chunks of memory?


#2

I think you can get what you want if st_test is an immutable struct in Julia.


#3

As someone took my answer 5 seconds before I was gonna post it :P, just want to confirm that using a normal struct works:

index 1/2
  i=100
  d=102.000000
  s= from C   
  a=9.000000 16.000000
  b=25 36
index 2/2
  i=101
  d=102.000000
  s= from C   
  a=9.000000 16.000000
  b=25 36

#4

Sorry, I forgot to mention… I need st_test to be mutable, i.e. I need to do something like array[1].i = 4.
So the (updated) question is now:

Is there any way I can pass an Array{st_test,1} through ccall while retaining the original function definition with st_test* (i.e. by having the structures stored in contiguous chunks of memory) and having st_test declared as mutable?


#5

I don’t believe it is possible to have st_test mutable and have it stored contiguously in memory in an array.

But is mutability of the array elements actually important? Or is it ok to do something like the following when you need to update an element of the list:

old_element = array[1]
new_element = st_test(4, old_element.d, old_element.s, old_element.a, old_element.b)
array[1] = new_element

You can hide this messiness behind a helper function for now, but I believe there is syntax on the horizon that will make this less annoying:

https://github.com/JuliaLang/julia/pull/21912


#6

Well, this is going to be quite tricky, though the new syntax may well be a solution.

Until the question is settled, I will resort to simply copying the data into contiguous chunks, and then back to Julia’s array, using the following C code:

void* copyArrayData(void** src, size_t size, size_t len) {
    //Use char* for byte-wise pointer arithmetic
    //(arithmetic on void* is a GNU extension, not standard C)
    char* out = malloc(size * len);
    char* dst = out;
    size_t i;

    if (out == NULL) return NULL;
    for (i=0; i<len; i++) {
        printf("Copying data: %p -> %p\n", src[i], (void*) dst);
        memcpy(dst, src[i], size);
        dst += size;
    }

    return out;
}

void restoreArrayData(void* src, void** dst, size_t size, size_t len) {
    char* cur = src;
    size_t i;
    for (i=0; i<len; i++) {
        printf("Copying data: %p <- %p\n", dst[i], (void*) cur);
        memcpy(dst[i], cur, size);
        cur += size;
    }
    free(src);
}

The Julia code will be:

tmp = ccall((:copyArrayData, "./libarrayIssue.so"), Ptr{Void}, (Ptr{Void}, Csize_t, Csize_t), 
            array, sizeof(st_test), size)

ret = ccall((:printAndModify, "./libarrayIssue.so"), Cint, (Ptr{st_test}, Cint), 
            tmp, size)

ccall((:restoreArrayData, "./libarrayIssue.so"), Void, (Ptr{Void}, Ptr{Void}, Csize_t, Csize_t), 
      tmp, array, sizeof(st_test), size)

This is far from elegant and performs poorly, but it is currently the best solution (IMHO).
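The gather, call, scatter pattern described above can be exercised entirely in C. The sketch below is only an illustration under stated assumptions: `gather`, `scatter_and_free`, `add_one` and `round_trip_demo` are hypothetical names, `add_one` stands in for a library routine that expects a contiguous array, and plain `int` elements replace `st_test` to keep it short. It works on ordinary C objects only and does not address whether copying into Julia-owned objects this way is safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library routine that needs contiguous data. */
static void add_one(int *values, size_t len) {
    size_t i;
    for (i = 0; i < len; i++) values[i] += 1;
}

/* Copy len separately stored elements of elem_size bytes each into one
   freshly allocated contiguous buffer. Returns NULL if malloc fails. */
void *gather(void *const *src, size_t elem_size, size_t len) {
    char *out = malloc(elem_size * len);
    size_t i;
    assert(src != NULL);
    if (out == NULL) return NULL;
    for (i = 0; i < len; i++)
        memcpy(out + i * elem_size, src[i], elem_size);
    return out;
}

/* Copy the (possibly modified) elements back to their original
   locations, then release the temporary buffer. */
void scatter_and_free(void *buf, void *const *dst, size_t elem_size, size_t len) {
    const char *cur = buf;
    size_t i;
    for (i = 0; i < len; i++)
        memcpy(dst[i], cur + i * elem_size, elem_size);
    free(buf);
}

/* Round trip over three scattered ints; returns their sum afterwards,
   or -1 if the temporary buffer could not be allocated. */
int round_trip_demo(void) {
    int a = 1, b = 2, c = 3;
    void *ptrs[3] = { &a, &b, &c };
    int *buf = gather(ptrs, sizeof(int), 3);
    if (buf == NULL) return -1;
    add_one(buf, 3);                       /* operates on contiguous copy */
    scatter_and_free(buf, ptrs, sizeof(int), 3);
    return a + b + c;
}
```

Every call pays for two full copies of the data plus an allocation, which is the performance cost mentioned above.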

Do you think these routines should be generally available in Julia?
I believe my problem is quite common among people who wish to use C libraries from Julia, and having them well documented could save hours of struggling…


#7

No. The correct solution should.


#8

Agree!
…but what is the correct solution ? :thinking:


#9

#10

Sorry, I’m a bit new to Julia and maybe I miss the point…

The following code ultimately results in some sort of copy between memory locations, doesn’t it?

Also the new syntax (immutable_struct@field = value) is just syntactic sugar to perform the copy proposed by @Michael_Eastwood, right?

Moreover those methods are either tricky (Julia is supposed to be a high level language…) or not yet implemented. Then why are these approaches better than the one based on C’s memcpy?

Clearly there are performance tradeoffs related to how many times the array is modified and how large the structure is, which make one of the above methods preferable over the others, but there is no clear winner a priori.

Moreover the performance flaw comes from our wish to call C code which (at design time) was not supposed to be called by Julia, not from Julia itself.

Finally, the copyArrayData / restoreArrayData method implemented above is:

  • completely generic (based on void* pointers);
  • very easy to implement and to discuss in the documentation;
  • specifically targeted to solve the problem of calling C/Fortran code which requires the structures to be stored contiguously in memory.

One final question: are we sure that in all future Julia implementations an Array of immutable structures will always be stored as contiguous chunks in memory?
If not, the memcpy approach proposed here is the only way to go…


#11

Yes

No it mustn’t.

I don’t see why not yet implemented means worse. I didn’t say you couldn’t use the less efficient one, just that we won’t have it in Base and will never recommend it.

No, it can miss write barriers for the really generic case.

Sure, doesn’t make it good.

And this doesn’t make it the correct solution either.

No, that’s exactly the case where using memcpy is wrong.


#12

Wow, it looks like I have a lot to learn… :wink:

By the way:

Well, this depends on the point of view: you (I assume) are a Julia developer, I’m a user.

Currently any user can easily produce strange or unwanted behaviors, even segfaults, using ccall. Users know it is something to be used with great care.

However what you mentioned is exactly the reason why a Julia implementation would be desirable: it could help users avoid doing the wrong thing.

I’ll consider the question closed, the answer being three-fold:

  • implement the solution kindly suggested by @Michael_Eastwood;
  • wait for the new syntax to be implemented;
  • implement a simple C code which does the hard work and take your risks.

I’ll go for the third… :wink:


#13

I mean, that certainly means you can’t use it for now, but it doesn’t make it a worse long-term solution in Base.

No, not when used correctly.

That’s why we want a correct Julia implementation. FWIW, for non-inlined immutable, it is impossible to implement a version without the C code being aware of Julia.


#14

A small update:

I implemented a (hopefully useful and general enough) solution following @Michael_Eastwood’s suggestion, which works as follows: the user defines a mutable structure, then calls a function (named defineImmutableStruct, see below) which automatically defines:

  • a similar (same fields and types) immutable structure, with name prefix immutable_;
  • an inner constructor for the immutable structure accepting as input an instance of the mutable one;
  • an outer constructor for the mutable structure accepting as input an instance of the immutable one;

The code is:

function defineImmutableStruct(stype)
    fsymb = fieldnames(stype)
    fname = convert.(String, fsymb)
    ftype = Vector{String}()
    map(fieldtype.(stype, fsymb)) do x
        push!(ftype, @sprintf "%s" x)
    end

    name = @sprintf "%s" stype

    # Get rid of module prefix (if any)
    name = split(name, ".")
    name = String(name[end])

    out = fname .* "::" .* ftype
    out = "struct immutable_" * name * "\n" * join("  " .* fname .* "::" .* ftype, "\n")
    out *= "\n\n"
    out *= "  immutable_" * name * "(p::" * name * ") = new(" * join("p." .* fname, ", ") * ")"
    out *= "\n"
    out *= "end"
    out *= "\n"
    #print(out)
    eval(parse(out))

    out = name * "(p::immutable_" * name * ") = " * name * "(" * join("p." .* fname, ", ") * ")\n"
    #print(out)
    eval(parse(out))
end

# Define a mutable structure
mutable struct st_test
  i::Cint
  d::Cdouble
  s::NTuple{10, Cchar}
  a::NTuple{2, Cdouble}
  b::NTuple{2, Cint}
end

#Simple constructor for the structure
st_test() = st_test(1, 2., NTuple{10,Cchar}("from Julia"), (3.,4.), (5, 6))

# Define the corresponding immutable structure and the constructors to switch back and forth:
defineImmutableStruct(st_test)

A call to whos() shows that we have two structure definitions (clearly with the same size):

    immutable_st_test    160 bytes  DataType
              st_test    160 bytes  DataType

Now let’s create an array of mutable structures, fill the values and convert it to an array of immutable ones:

array = [st_test() for i=1:2]
array[1].i = 10
array[2].i = 20
immutable_array = map(immutable_st_test, array)

We can safely pass the immutable_array through ccall since the structures are now stored in contiguous chunks. Upon return we can copy the (possibly modified) data from the immutable_array into an array of mutable structures:

new_array = map(st_test, immutable_array)

This solution is still not as elegant as I hoped, but it is quick and effective, and it does not require C code. Also, its performance is similar to that of the copyArrayData / restoreArrayData approach described above.

For a better solution we should wait for the (immutable_struct@field = value) syntax…
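Passing the immutable array is only safe if both sides agree on the structure layout. A small C sketch like the one below can help check that; `print_layout` and `layout_matches` are hypothetical helper names, not part of the library above. The idea, assuming nothing beyond standard `sizeof` and `offsetof`, is to print what the C compiler chose and compare it by hand with what Julia reports (for example via `sizeof(immutable_st_test)` and `fieldoffset`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same fields as the st_test structure from the question. */
struct st_test {
    int i;
    double d;
    char s[10];
    double a[2];
    int b[2];
};

/* Print the total size and the byte offset of every field as laid out
   by this C compiler on this platform. */
void print_layout(void) {
    printf("sizeof = %zu\n", sizeof(struct st_test));
    printf("i @ %zu, d @ %zu, s @ %zu, a @ %zu, b @ %zu\n",
           offsetof(struct st_test, i), offsetof(struct st_test, d),
           offsetof(struct st_test, s), offsetof(struct st_test, a),
           offsetof(struct st_test, b));
}

/* Returns 1 if a size and five field offsets obtained elsewhere (for
   example, typed in from the Julia side) match this compiler's layout,
   0 otherwise. offsets must list i, d, s, a, b in that order. */
int layout_matches(size_t size, const size_t offsets[5]) {
    assert(offsets != NULL);
    return size == sizeof(struct st_test)
        && offsets[0] == offsetof(struct st_test, i)
        && offsets[1] == offsetof(struct st_test, d)
        && offsets[2] == offsetof(struct st_test, s)
        && offsets[3] == offsetof(struct st_test, a)
        && offsets[4] == offsetof(struct st_test, b);
}
```

The exact numbers depend on the platform's alignment rules, so the sketch deliberately hard-codes none of them.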


#15

Oo, this is the perfect opportunity to bust out a macro. The following macro defines a mutable struct to pair with each of your immutable structs. I think this solution is a little cleaner:

julia> macro define_mutable(immutable_struct)
           immutable_name = immutable_struct.args[2]
           mutable_name = Symbol(immutable_name, "_mutable")

           mutable_struct = copy(immutable_struct)
           mutable_struct.args[1] = true # set the mutability to true
           mutable_struct.args[2] = mutable_name

           constructors = quote
               function $mutable_name(x::$immutable_name)
                   $mutable_name(ntuple(i->getfield(x, i), nfields($immutable_name))...)
               end
               function $immutable_name(x::$mutable_name)
                   $immutable_name(ntuple(i->getfield(x, i), nfields($mutable_name))...)
               end
           end

           output = Expr(:block, immutable_struct, mutable_struct, constructors)
           esc(output)
       end
@define_mutable (macro with 1 method)

julia> @define_mutable struct Foo
           bar::Int
           baz::Float64
           barbaz
       end
Foo

julia> foo = Foo(1, 1.0, "hello")
Foo(1, 1.0, "hello")

julia> foo_mutable = Foo_mutable(foo)
Foo_mutable(1, 1.0, "hello")

julia> foo_mutable.barbaz = π
π = 3.1415926535897...

julia> Foo(foo_mutable)
Foo(1, 1.0, π = 3.1415926535897...)

#16

Great!!
This is much better, thank you very much!


#17

I should perhaps point out why incomplete but usable implementations of useful things cannot generally go into the language’s standard library: if we have it, then it must work in all circumstances. Once we’ve added a feature – especially once 1.0 comes out, which we’re working day and night towards – then it must be supported for the duration (which may be 5-10 years). If a user-defined work-around (hack, macro, whatever) happens to work for your use case, great, by all means use it! But features in the language don’t have the luxury of cherry-picking use cases that happen to work – they need to support all possible use cases.