Why so many internals in C?

[sorry if this question does not belong here; feel free to move it]

I was wondering why so many internals are implemented in C and not conveniently made available in Julia. For example, take array.c / array.jl.

It would not be very hard to do something like

type jl_array_t 
    data::Ptr{Void} 
    flags::jl_array_flags_t #uint16
    elsize::UInt16
    offset::UInt32; # for 1-d only. does not need to get big.
    nrows::UInt64 #size_t
    max_size_ncols::UInt64 #no native union support in julia
end

This would make julia arrays much more transparent to users, and a lot of logic could move from “ccall array.c” to julia. Further, it would become unnecessary to store elsize (which should be known to julia from type inference, but is unknown to C), making room for an extra 16 bits of flags (we probably do not want to break the nice alignment to a half cache-line), or an extra user-available 32 bits of flags for multi-dimensional arrays (assuming that the source code comment is not lying, i.e. offset is not secretly a union of uint32 and some other flags).

Maybe LLVM would then be able to do some profitable extra optimizations (e.g. the C-side branches on whether the array contains bitstype or object references could probably be eliminated, since this should be known at compile-time).

So I guess that at some point a decision was made that julia code is not supposed to muck around its own implementation, even if much of the implementation could be done perfectly well in julia. Why was this decision made? Or is this historical, because no one took the time to re-implement/wrap internals in julia?

Especially modifying offset (if the user expects to add elements in front in the future) and reading max_size would be quite useful! Even more useful would of course be stack-allocated variants of array which reuse user-supplied data pointers/buffers (avoid allocations in inner loops/functions that need to work on variable-sized data, user would need to handle visibility to gc by hand).

3 Likes

Overall, JuliaLang/julia is actually surprisingly written in Julia itself as far as internals go.

As for the Array code specifically, it has been discussed before to move to Julia itself, it would just take a decent amount of time & effort. There are just other priorities right now to get to a 1.0 release. I remember there being some trickiness in getting the GC-interaction right, but not impossible.

1 Like

That’s actually bad, since whatever the user want to do with them is almost always invalid. All of the optimizations you’ve mentioned about arrays will not affect the performance of any array operations. They are only done in the slow path anyway.

No, for almost all of the implementation in C, it is currently impossible to implement in julia. A fully static subset with predictable runtime (e.g. gc, allocation, dispatch, codegen) behavior is necessary for that to be possible. It’s not a impossible task, but it’s a hard one and with very low priority.

I’ll repeat my comment from another thread that prepending element does not have performance issue and the user is not supposed to change or even see the offset.

Changing the representation of arrays is possible and likely. It won’t remove the necessity for C internals though.

No.

1 Like

A fully static subset with predictable runtime (e.g. gc, allocation, dispatch, codegen) behavior is necessary for that to be possible.

Ok, that’s a really convincing reason for keeping things separate.

Re your other points: I concede that the arguments for switching internals into julia are rather unconvincing.

user would need to handle visibility to gc by hand
No.

The problem is that it is at the moment pretty hard to write allocation-free code for inner loops, and in many cases it would be very easy to pre-allocate and keep-alive temporary variables (e.g. by writing references in some array), if there only was a way of telling julia to not bother about gc visibility in some cases.

For example, Subarray keeps alive its parent, hence it needs to be kept on the heap (it is not bitstype). However, at the same time it is immutable, and therefore non-trivial to reuse (aquire a pointer to the pre-allocated subarray, write there, hope that you understood the subarray code correctly). For this reason inner loops have problems working with subarrays, and one is tempted to tell julia to just segfault / install malware/ catch fire if a reference to the subarray outlives its parent.

(the pointer-write scheme should crash and burn if we unsafe_store! a reference to a young parent array into an old subarray and there is no one else to keep the parent alive? Even if we never use the subarray after its parent is freed, we may crash the garbage collector 10 minutes later if we don’t obsessively C_NULL the the parent field of the view before the parent gets deallocated? Hence this probably cannot be done safely in julia? And exposing the garbage collector interface directly to julia is low priority at best, and maybe just a bad idea for exactly the reasons you listed).

The difficulty (or undue obsession) of making inner loops allocation free might just be a beginner’s (i.e. my) problem, though.

Thank you for this explanation of the design choices!

This can be improved without exposing internals; see e.g. https://github.com/JuliaLang/julia/pull/12205

I’ll start by repeating that exposing internals or implement them in julia is unrelated to reducing allocation, at all.

Yes

No. We can already optimize some cases, just not all of them. Ref Much more aggressive alloc_elim_pass! by yuyichao · Pull Request #23240 · JuliaLang/julia · GitHub which should do a much better job eliding those allocations where I also mentioned some of the current limitations in the comments and commit messages. In fact, the fact that these are NOT user visible is what allow us to remove the allocation!

I’ll also point out that we now have full support for writing unsafe code in a safe way, using @gc_preserve. You can now explicitly specify what object you are holding a pointer to that’s not visible to julia. Again, in this case, it is important to not expose internal detail to the user so it is legal to do optimizations.