Working towards better C interop

I recently wrapped a C library in Julia and am thoroughly impressed with how easy the process is compared to any other language I’ve worked with. That being said, there are a few pain points that, as far as I know, do not have a great workaround. I’m posting this to catalog some of the shortcomings of Julia’s C interop, learn about ways to avoid some of these issues, and hopefully start a conversation on how the already-great C interop can be improved further.

Structs with large inline arrays

Consider the following C struct:

struct _Foo
{
    int bar[10000];
};
typedef struct _Foo Foo;

To get a Julia struct with the same memory layout, the docs suggest:

struct Foo
    bar::NTuple{10000, Cint} # equivalently: SVector{10000, Cint}
end

Instantiating a Foo takes a seemingly infinite amount of time (10+ minutes on my reasonably fast machine for the above example), and calling getindex on the bar field appears to be O(n) if Foo is mutable. This seems to be a well-known issue among the Julia devs (#24596, #31681). My current approach for fields like this is to wrap them using unsafe_wrap + fieldoffset when accessing them.
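For anyone who hasn't seen the unsafe_wrap + fieldoffset trick, here is a minimal sketch of what I mean. The helper name `bar_view` is made up for illustration, and `Libc.calloc` is only standing in for memory that the C library would normally allocate. The Julia struct is used purely as a layout description and is never instantiated.

```julia
# Layout-only mirror of the C struct; it is never instantiated on the Julia side.
const N = 10_000
struct Foo
    bar::NTuple{N, Cint}
end

# Hypothetical helper: view the `bar` field of a C-owned Foo as a Vector{Cint}.
# No copy is made, so the result is only valid while the C memory is alive.
function bar_view(p::Ptr{Foo})
    offset = fieldoffset(Foo, 1)    # byte offset of `bar` within Foo
    unsafe_wrap(Array, convert(Ptr{Cint}, p + offset), N; own = false)
end

# Stand-in for a pointer that a C library would normally hand back.
p = convert(Ptr{Foo}, Libc.calloc(1, sizeof(Foo)))

bar = bar_view(p)
bar[1] = 42                         # writes straight into the C buffer
bar[N] = 7
view_length = length(bar)

# Read the same memory back through raw pointers to confirm the view aliases it.
first_elem = unsafe_load(convert(Ptr{Cint}, p), 1)
last_elem  = unsafe_load(convert(Ptr{Cint}, p), N)

Libc.free(p)                        # `bar` must not be touched after this point
```

Indexing into the wrapped Vector is ordinary O(1) array access, which is the whole point of going through the pointer instead of the NTuple.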

It seems like the proposal in #31681 would entirely fix this issue.

Large immutable structs

Oftentimes I’ll come across C structs with 50-250 fields, many of them large as in the above example or deeply nested structs-of-structs. If the equivalent Julia struct is defined as immutable, then the compile time for instantiating such a struct blows up. Declaring it as mutable drastically cuts compile time. This becomes even more of an issue when one of these humongous structs is a field of another struct. In those cases, the large struct must be declared as immutable in order to get the right memory layout. This has been brought up on Discourse before.
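To illustrate the memory layout point (not the compile-time one, which I can't explain), here is a small sketch with made-up type names. An immutable field is stored inline in its parent, like a nested C struct, while a mutable field is stored as a reference to a separately allocated object.

```julia
struct Inner                # immutable: stored inline in its parent
    a::Cint
    b::Cint
end

mutable struct InnerMut     # mutable: the parent holds a reference instead
    a::Cint
    b::Cint
end

struct Outer
    id::Cint
    inner::Inner
end

struct OuterWithMut
    id::Cint
    inner::InnerMut
end

inline_layout = isbitstype(Outer)         # true: plain C-like data
boxed_layout  = isbitstype(OuterWithMut)  # false: contains a reference
outer_size    = sizeof(Outer)             # id + inner.a + inner.b
inner_offset  = fieldoffset(Outer, 2)     # `inner` starts right after `id`
```

Only the first variant matches what a C compiler would produce for a struct nested inside another struct.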

I’m very curious why there’s such a large disparity in compile time, so please share your knowledge if you know why! :)

Wrapping C-allocated structs

Some of the libraries I’ve worked with require that the library allocate and manage the memory for these large, complicated, and deeply-nested C structs. Given a Ptr{GargantuanStruct} returned by the C library, my only choice is to call unsafe_load on it, which makes an expensive copy. This also complicates memory management, as I have to keep around the pointer returned from the C library to later deallocate the memory (rather than passing a pointer to the value returned by unsafe_load, which would result in a double-free or a plain old segfault).

I would love the ability to extend unsafe_wrap to more than just Array so that I could directly wrap the C-allocated memory.
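In the meantime, something like the following rough sketch gets part of the way there by reading and writing individual fields through the pointer instead of copying the whole struct. `Big`, `BigRef` and `fieldptr` are hypothetical names, and `Libc.malloc` stands in for the library's own allocator. Note that `Base.fieldindex` is not exported, so treat its use here as an assumption that may not hold on every Julia version.

```julia
struct Big                  # small stand-in for GargantuanStruct
    id::Cint
    scale::Cdouble
end

struct BigRef               # hypothetical non-owning handle to C memory
    ptr::Ptr{Big}
end

# Pointer to one field inside the C-owned struct, typed by that field's Julia type.
function fieldptr(r::BigRef, name::Symbol)
    i = Base.fieldindex(Big, name)          # unexported helper
    T = fieldtype(Big, i)
    convert(Ptr{T}, getfield(r, :ptr) + fieldoffset(Big, i))
end

Base.getproperty(r::BigRef, name::Symbol) = unsafe_load(fieldptr(r, name))

function Base.setproperty!(r::BigRef, name::Symbol, value)
    unsafe_store!(fieldptr(r, name), value) # converts `value` to the field type
    value
end

# Stand-in for a struct allocated by the C library.
mem = convert(Ptr{Big}, Libc.malloc(sizeof(Big)))
big = BigRef(mem)
big.id = 3                  # touches only the 4 bytes of `id`
big.scale = 1.5
id_read = big.id
snapshot = unsafe_load(mem) # full copy, done here only to cross-check
Libc.free(mem)
```

The handle never owns the memory, so the original pointer is still what gets passed back to the library for deallocation.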

Mutating immutable structs

This has already been widely discussed here, so I won’t rehash the conversation. I’ve been making use of the discussed solutions like @jw3126’s Setfield.jl and RefField from #21912 which seem to work quite well.

Using Julia’s Enum with C

Unlike C, Julia’s Enum doesn’t allow repeated values. It’d be nice if Enum were C-compatible, but that would change the current semantics of Enum, so I’m guessing that’s not an option. CEnum.jl addresses this already, so this isn’t as much of a problem as the above issues.
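A quick sketch of the restriction, using made-up enum names. It also shows a plain-Base stopgap for the simple case where a C header merely aliases one value under a second name; this is not how CEnum.jl works, just a const binding.

```julia
@enum Status OK=0 FAIL=1

# A C header might also have `SUCCESS = 0`; a const alias covers that case.
const SUCCESS = OK

# Declaring two members with the same value is rejected when the macro expands.
duplicate_rejected = try
    @eval @enum DupStatus A=0 B=0
    false
catch
    true
end

alias_value = Int(SUCCESS)
```

The alias prints as OK rather than SUCCESS, which is one reason a dedicated package is nicer for anything beyond a handful of names.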

C structs with array fields

This isn’t so much a limitation as it is a feature that would be incredibly convenient to have. Consider the following:

struct _Foo
{
    int* bar; // a (n x m) matrix
};
typedef struct _Foo Foo;

and in Julia:

struct CFoo
    bar::Ptr{Cint}
end

No one wants to deal with Ptr, so I’ll typically create a nicer version of Foo:

struct jlFoo
    ref::Base.RefValue{CFoo}
    bar::Matrix{Cint} # the result of unsafe_wrap(Array, jlFoo.ref[].bar, (m, n))
end
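Spelled out, the construction looks roughly like the sketch below. The three-argument constructor and the explicit `m`, `n` arguments are assumptions for illustration, since the dimensions have to come from somewhere outside the struct, and `Libc.malloc` stands in for the C library's allocation. The (m, n) order in unsafe_wrap accounts for C being row-major and Julia being column-major.

```julia
struct CFoo
    bar::Ptr{Cint}
end

struct jlFoo
    ref::Base.RefValue{CFoo}
    bar::Matrix{Cint}
end

# Hypothetical convenience constructor; the C matrix is n x m (row-major),
# so the Julia view has dims (m, n) and shares memory with the C buffer.
function jlFoo(c::CFoo, m::Integer, n::Integer)
    jlFoo(Ref(c), unsafe_wrap(Array, c.bar, (m, n); own = false))
end

n, m = 2, 3
buf = convert(Ptr{Cint}, Libc.malloc(n * m * sizeof(Cint)))
for k in 0:(n * m - 1)
    unsafe_store!(buf, k, k + 1)    # C contents: {0, 1, 2}, {3, 4, 5}
end

foo = jlFoo(CFoo(buf), m, n)
c_row0 = foo.bar[:, 1]              # copy of C row 0
c_1_2  = foo.bar[3, 2]              # C element [1][2]
foo.bar[1, 2] = 30                  # writes C element [1][0]
raw    = unsafe_load(buf, 4)        # 4th Cint in the buffer is C element [1][0]
Libc.free(buf)
```

So each C row shows up as a Julia column, and writes through the Matrix land directly in the C buffer.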

This requires a fair amount of boilerplate code (which, thanks to Julia’s extensive meta-programming abilities, I’ve been able to largely automate). Instead, it would be awesome if it were possible to magically wrap a Ptr{CFoo} with:

struct MagicFoo
    bar::Matrix{Cint}
end

Clearly that doesn’t have the same memory layout as CFoo, and this seems non-trivial to abstract away, but a man can dream, can’t he?

Apologies for the big wall of text! Those are all the pain points I’ve come across. I’m eager to learn more about why some of these limitations exist, better ways to avoid them, and how someone like me could perhaps help advance Julia’s C interop capabilities.

EDIT: For completeness’ sake, I have to mention Blobs.jl, which eases the pain of some of these issues by facilitating unsafe_store!/unsafe_load for Ptrs to various data structures.

Thanks!

Colin

(Just FYI, Setfield.jl is not my package, although I’m a big fan of it :) )

My mistake! Thanks for pointing that out (now fixed). I suppose I just mixed the two of you up after working on the mutable lenses (which has been working great, by the way!).

I have also noted some of the issues with interfacing C from Julia in the thread “Using C libraries from Julia”.

Currently, I am in the process of developing CBinding.jl to allow for a clean port of all C constructs into Julia. It is still a WIP, but it is a very capable package that is getting close to 1.0.

I’d add to this list:

  • Dealing with union types
  • Manually messing with alignment
  • Making @enum just support repeated fields for better IDE support (Julia linters can’t see through @cenum, so you get undefined symbol errors when using it)

CBinding.jl is for sure great for this, but it feels weird to me using brace syntax like that inside Julia.
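For the union case, what I end up writing by hand looks something like this sketch: a struct holding raw bits of the right size, plus reinterpret-based accessors. The names are made up, and it only covers members that all have the same size; a real union with differently sized members needs storage as large as the biggest one.

```julia
# Mirrors `union { int32_t i; float f; }`: 4 bytes of raw storage.
struct IntOrFloat
    raw::UInt32
end

# Hypothetical accessors that reinterpret the same bits as each member type.
as_int(u::IntOrFloat)   = reinterpret(Int32, u.raw)
as_float(u::IntOrFloat) = reinterpret(Float32, u.raw)

# Construct from the float member by storing its bit pattern.
IntOrFloat(f::Float32) = IntOrFloat(reinterpret(UInt32, f))

u = IntOrFloat(1.0f0)
float_member = as_float(u)
int_member   = as_int(u)    # bit pattern 0x3f800000 read as an Int32
```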

edit: Another big pain point for me is that I can’t just create something on the stack and then get a pointer to it, a super common idiom in big APIs like Vulkan. It also usually means I’m heap allocating when trying to do something performance critical like interfacing with graphics hardware. The compiler cleans that up most of the time, but it’s not a great experience.

This is part of a broader issue I have with Julia, though: not being able to do mutation on the stack without doing lots of Setfield.jl-type dances.
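For context, the usual workaround looks like the sketch below: box the struct in a Ref and let ccall pass its address. `Extent` is a made-up struct, and libc's memcpy is only standing in for an API call that fills a struct through a pointer. Whether the Ref ends up heap allocated is up to the compiler.

```julia
struct Extent
    width::Cuint
    height::Cuint
end

src = Ref(Extent(640, 480))
dst = Ref(Extent(0, 0))     # mutable box that the C side can write into

# memcpy stands in for a C function taking `Extent*` in and out parameters.
ccall(:memcpy, Ptr{Cvoid}, (Ref{Extent}, Ref{Extent}, Csize_t),
      dst, src, sizeof(Extent))

result = dst[]              # read the struct back out of the box
```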

X-refs:
