So this is a super low-level question for the people who know about or are working on the Julia kernel.
I need/want to create a “foreign type” MPtr (via the C function jl_new_foreign_type) with no super type, and then, for reasons explained below, later modify it by swizzling its super pointer to point at some abstract type. Before this swizzling happens, the only parts of the Julia kernel that are “made aware” of MPtr and of allocations with it as type are the Julia GC and, of course, the code creating the type; but e.g. no Julia functions or methods involving MPtr in their signature are ever declared before this (so code involving method dispatch should never have seen that type, to my understanding). Of course this is somewhat evil, but in a prototype, it seems to work perfectly well.
1. Is this actually working? Or am I just fooled into thinking so because I have not yet tested the right things? In other words, can any Julia kernel expert think of ways this could, say, corrupt internal data structures? I tried to trace everything jl_new_foreign_type (and the code it calls, including jl_new_datatype) does, and my impression was that no reference to the new type is retained by the Julia kernel. I couldn’t find anything that would seem to care about this isolated type, which is never passed to any Julia code beyond the GC. But I may easily have missed lots of things (sigh). Anyway, if I am right, what I am proposing above should be fine, no?
2. How likely is this to keep working? Sorry, I know that’s rather vague; perhaps it’d be better phrased as: Can you think of any mechanism that might at some point be introduced (e.g. as an optimization) that could break this?
3. Can anybody think of an alternative that is less evil? For this you’ll need to understand the actual problem I have to solve, and so I am afraid you’ll have to read my ramblings below to answer it.
For point 2, I was wondering whether, for example, Julia might introduce (or already has!) a list of “all types without super type” (I don’t have a clue why it might do that; perhaps for some clever optimizations?). If that were the case, then of course my hack would break this invariant. But my hope is that this won’t happen, or that perhaps at least for “foreign types” an exception could be made (so that e.g. they are not added to that list of types without super type). We’d of course be willing to contribute patches to the Julia kernel for any such thing; but this hinges on the questions of (a) whether it would even be possible without hurting something, and (b) whether it would be acceptable in general.
This question is motivated by our work on GAP.jl, which is an interface between Julia and the GAP computer algebra system, as part of the OSCAR project. To enable this interface, we modified GAP to be able to use the Julia garbage collector (GC) instead of its own GC. The super nice Julia team merged some low-level patches from us into Julia 1.1 to make that possible; in particular the code in julia_gcext.h and the notion of a “foreign type” (injected into the Julia runtime via the C function jl_new_foreign_type). That allows us to declare a few low-level types that we need to make things work; the most important one of these, and the only one visible to regular users, is MPtr for short. Objects of this type are exposed from GAP to Julia.
One other important point: this is actually a bidirectional integration. It can be used in two ways (or three, depending on how you count, but I’ll focus on the two relevant ones):
- To access GAP from Julia: using GAP launches the GAP interpreter, which then among other things injects the foreign type MPtr into the Julia runtime; finally, GAP.jl loads the GAP package JuliaInterface, which provides a few further C level functions to complete the interface between the two systems
- To access Julia from GAP: You start GAP (compiled against Julia), which during its startup very early on also initializes Julia (via jl_init); it has to, because it uses the Julia GC. It then also injects the MPtr type early on
  - at this point, the user might stop, and not interact with Julia further
  - or you can load the GAP package JuliaInterface to get full access to all Julia features, packages etc. That package during its startup detects that GAP.jl is not yet loaded in Julia, and loads it.
There are two possible sequences in which things get loaded and initialized:
1. Julia -> GAP.jl -> GAP -> GAP creates MPtr type -> JuliaInterface
2. GAP -> Julia -> GAP creates MPtr type -> JuliaInterface -> GAP.jl
What’s the original problem?
Our Julia code in GAP.jl needs to interact with GAP objects of type MPtr. But in scenario 1, that type is not yet known to Julia at the time it tries to load/(pre)compile GAP.jl: after all, MPtr only gets injected when GAP is initialized, which is done by GAP.jl’s __init__ function, and that has not yet been run while we are trying to (pre)compile GAP.jl. Boom. Hence, no references to MPtr are allowed in the GAP.jl Julia code, other than in purely dynamic constructs (we have one use of __init__ right now, but that’s just for compatibility with Julia < 1.3, so once we switch to requiring Julia 1.3, it can go).
How did we achieve this? Well, we introduced an empty abstract type GapObj and then used that as super type for MPtr. This way, our Julia code can reference GapObj instead of MPtr, and be precompiled. Easy peasy.
Except there is also scenario 2 to consider… Our new sequences of initialization look like this:
1. Julia -> GAP.jl -> GAP.jl creates GapObj type -> GAP -> GAP creates MPtr type with super type GapObj -> JuliaInterface
2. GAP -> Julia -> GAP creates MPtr type -> JuliaInterface -> GAP.jl -> GAP.jl creates GapObj type
In sequence 2, MPtr is created before GapObj. Can’t have a type that does not yet exist as super type, can we? And we can’t fix this by reordering, because we are rather restricted in the order in which things are loaded. We must have:
- GAP before JuliaInterface
- Julia before GAP.jl
- JuliaInterface must either be loaded before GAP.jl, or during initialization of GAP.jl
- but GAP cannot load packages like JuliaInterface before it has fully initialized its memory manager, and that already requires the type MPtr
Our current “solution” (and the new problems it causes)
So to overcome this, we decided to try and break the cycle by introducing a tiny Julia package GAPTypes.jl, which basically just consists of the definition of the type GapObj. With that, we get these initialization sequences:
1. Julia -> GAPTypes.jl (loaded as dep of GAP.jl) -> GAP.jl -> GAP -> GAP creates MPtr type with super type GAPTypes.GapObj -> JuliaInterface
2. GAP -> Julia -> GAPTypes.jl (loaded by the GAP kernel) -> GAP creates MPtr type with super type GAPTypes.GapObj -> JuliaInterface -> GAP.jl
OK, problem solved, right? Well, yes, if all packages in Julia were always loaded into the same global namespace. But as you folks know better than me, that’s not the case; there is in general a difference between the GAPTypes.jl loaded on the global level (e.g. triggered by the GAP kernel) vs. one loaded as a dependency of a package like GAP.jl. In fact, GAPTypes.jl might not even be installed in the global Julia environment, meaning sequence 2 above could fail. To work around this, we did some pretty evil things (please don’t hate on me for this, I never liked this, it was simply a quick & dirty hack to get things working now, not meant as a permanent solution): GAP actually installs GAPTypes.jl into the global environment during its compile time (yes, I know, this is fragile and bad for many reasons, sigh), and GAP.jl also tries to do that in its deps/build.jl (yup, yup, gross, nasty, evil; please don’t tar & feather us, we want to reform our sinning ways, which is why I am writing this post).
What else can we do?
Finally we arrived at the idea described at the top: we drop GAPTypes.jl again, and modify sequence 2 as follows: when GAP starts, it creates the Julia foreign type MPtr with no super type. Then, once GAP.jl has loaded and created the abstract type GapObj, it uses a ccall to notify the GAP kernel about the new type; the GAP kernel then swizzles the super pointer of MPtr to point at GapObj. Prior to this, no MPtr instance created in the GAP kernel was ever visible at the Julia language level; only to the “plumbing” (the GC, and the code creating datatypes).
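To spell out the assumption this relies on, here is a toy C sketch. It does not use Julia's actual jl_datatype_t or any real Julia API; toy_type, toy_issubtype and toy_set_super are made-up names for illustration only. It models a type descriptor whose super pointer may be set exactly once, and only as long as no subtype query (standing in for "dispatch has seen this type") has touched it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a runtime type descriptor; NOT Julia's jl_datatype_t. */
typedef struct toy_type {
    const char *name;
    const struct toy_type *super; /* NULL means "no super type set yet" */
    int seen_by_dispatch;         /* set once a subtype query has touched the type */
} toy_type;

/* Walk the super chain: is `t` equal to, or a descendant of, `ancestor`?
 * Also records that `t` has now been observed by "dispatch". */
int toy_issubtype(toy_type *t, const toy_type *ancestor)
{
    assert(t != NULL);
    t->seen_by_dispatch = 1;
    for (const toy_type *p = t; p != NULL; p = p->super) {
        if (p == ancestor)
            return 1;
    }
    return 0;
}

/* Swizzle the super pointer. Refuses (returns -1) if a super type is
 * already set or if the type was already observed; returns 0 on success. */
int toy_set_super(toy_type *t, const toy_type *new_super)
{
    assert(t != NULL && new_super != NULL);
    if (t->super != NULL || t->seen_by_dispatch)
        return -1;
    t->super = new_super;
    return 0;
}
```

In this model the swizzle succeeds only in the window between creating the type and its first use in a subtype query, which is exactly the window the approach above depends on. Whether the real Julia runtime has any equivalent of seen_by_dispatch (a cache or table that records the type earlier) is precisely what questions 1 and 2 ask.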
Of course this is still evil, but I think it is considerably less evil than messing with the global environment and stuff (but see questions 1 and 2 at the top of this post). That said, if anybody has suggestions for how we could solve our problem in a different way, I am all ears (that’s my question 3 at the top).