Hi,
so this is a super low-level question for the people who know about or are working on the Julia kernel.
tl;dr
I need/want to create a “foreign type” MPtr (via the C function jl_new_foreign_type) with no super type, and then, for reasons explained below, later modify it by swizzling its super pointer to point at some abstract type. Before this swizzling happens, the only parts of the Julia kernel that are “made aware” of MPtr and of allocations with it as type are the Julia GC and, of course, the code creating the type; e.g. no Julia functions or methods involving MPtr in their signature are ever declared before this (so code involving method dispatch should never have seen that type, to my understanding). Of course this is somewhat evil, but in a prototype, it seems to work perfectly well.
Questions:
1. Is this actually working? Or am I just fooled into it by not yet having tested the right things? In other words, can any Julia kernel expert think of ways this could, say, corrupt internal data structures? I tried to trace everything jl_new_foreign_type (and the code it calls, including jl_new_datatype) does, and my impression was that no reference to that new type is actually retained by the Julia kernel. I couldn’t find anything that would seem to care about this isolated type which is never passed to any Julia code beyond the GC. But I may easily have missed lots of things, sigh. Anyway, if I am right, what I am proposing above should be fine, no?
2. How likely is this to keep working? Sorry, I know that’s rather vague; perhaps it’d be better phrased as: Can you think of any mechanism that might at some point be introduced (e.g. as an optimization) that could break this?
3. Can anybody think of an alternative that is less evil? For this you’ll need to understand the actual problem I have to solve, and so I am afraid you’ll have to read my ramblings below to answer it.
For point 2, I was wondering whether for example Julia might introduce (or already has!) a list of “all types without super type” (I don’t have a clue why it might do that; perhaps for some clever optimizations?). If that was the case, then of course my hack would break this invariant. But my hope is that this won’t happen, or that perhaps at least for “foreign types”, an exception could be made (so that e.g. they are not added to that list of types without super types). We’d of course be willing to contribute patches to the Julia kernel for any such thing; but this hinges on the questions of (a) whether it would even be possible w/o hurting something, and (b) whether it would be acceptable in general.
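To make the worry behind point 2 concrete, here is a toy sketch in C of how such a list could go stale. Everything in it is hypothetical: `reg_type`, `root_registry` and the two functions are invented for illustration and do not correspond to any structure in the Julia runtime, and "no super type" is modeled as a NULL `super` pointer.

```c
#include <stddef.h>

/* Hypothetical type descriptor; NOT Julia's jl_datatype_t. */
typedef struct reg_type {
    const char *name;
    const struct reg_type *super;  /* NULL models "no super type" */
} reg_type;

enum { REG_MAX = 8 };

/* Hypothetical registry of types that had no super type when created. */
typedef struct {
    const reg_type *items[REG_MAX];
    int count;
} root_registry;

/* Record `t` if it has no super type at this moment. */
static void registry_note_created(root_registry *r, const reg_type *t)
{
    if (t->super == NULL && r->count < REG_MAX)
        r->items[r->count++] = t;
}

/* Invariant check: every registered type still has no super type. */
static int registry_is_consistent(const root_registry *r)
{
    for (int i = 0; i < r->count; i++)
        if (r->items[i]->super != NULL)
            return 0;
    return 1;
}
```

If MPtr were registered at creation and its `super` field were later pointed at GapObj, `registry_is_consistent` would start returning 0. That is the kind of silent invariant break the question is about; whether Julia has or would ever add anything like this is exactly what I don't know.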
Long version:
Some background
This question is motivated by our work on GAP.jl, which is an interface between Julia and the GAP computer algebra system, as part of the OSCAR project. To enable this interface, we modified GAP to be able to use the Julia garbage collector (GC) instead of its own GC. The super nice Julia team merged some low-level patches from us into Julia 1.1 to make that possible; in particular the code in julia_gcext.h and the notion of a “foreign type” (injected into the Julia runtime via the C function jl_new_foreign_type). That allows us to declare a few low-level types that we need to make things work; the most important one of these, and the only one visible to regular users, is ForeignGAP.MPtr, or MPtr for short. Objects of this type are exposed from GAP to Julia.
One other important point: this is actually a bidirectional integration. It can be used in two ways (or three, depending on how you count, but I’ll focus on the two relevant ones):
- To access GAP from Julia: using GAP launches the GAP interpreter, which then among other things injects the foreign type MPtr into the Julia runtime; finally, GAP.jl loads the GAP package JuliaInterface, which provides a few further C-level functions to complete the interface between the two systems.
- To access Julia from GAP: You start GAP (compiled against Julia), which very early during its startup also initializes Julia (via jl_init); it has to, because it uses the Julia GC. It then also injects the MPtr type early on.
  - At this point, the user might stop, and not interact with Julia further,
  - or you can load the GAP package JuliaInterface to get full access to all Julia features, packages etc. During its startup, that package detects that GAP.jl is not yet loaded in Julia, and loads it.
Summary
There are two possible sequences in which things get loaded and initialized:
1. Julia → GAP.jl → GAP → GAP creates MPtr type → JuliaInterface
2. GAP → Julia → GAP creates MPtr type → JuliaInterface → GAP.jl
What’s the original problem?
Our Julia code in GAP.jl needs to interact with GAP objects of type MPtr. But in scenario 1, that type is not yet known to Julia at the time it tries to load/(pre)compile GAP.jl: after all, MPtr only gets injected when GAP is initialized, which is done by GAP.jl’s __init__ function, and that has not yet been run while we are trying to (pre)compile GAP.jl. Boom. Hence, no references to MPtr are allowed in the GAP.jl Julia code, other than in purely dynamic constructs (we have one use of Base.MainInclude.eval(:(ForeignGAP.MPtr)) in __init__ right now, but that’s just for compatibility with Julia < 1.3, so once we switch to requiring Julia 1.3, it can go).
How did we achieve this? Well, we introduced an empty abstract type GapObj and then used that as super type for MPtr. This way, our Julia code can reference GapObj instead of MPtr, and be precompiled. Easy peasy.
Except there is also scenario 2 to consider… Our new sequences of initialization look like this:
1. Julia → GAP.jl → GAP.jl creates GapObj type → GAP → GAP creates MPtr type with super type GapObj → JuliaInterface
2. GAP → Julia → GAP creates MPtr type → JuliaInterface → GAP.jl → GAP.jl creates GapObj type
Ooops: in sequence 2, MPtr is created before GapObj. Can’t have a type that does not yet exist as super type, can we? And we can’t fix this, because we are rather restricted in the order in which things are loaded. We must have:
- GAP before JuliaInterface
- Julia before GAP.jl
- JuliaInterface must either be loaded before GAP.jl, or during initialization of GAP.jl
- but GAP cannot load packages like JuliaInterface before it has fully initialized its memory manager, and that already requires the type MPtr.
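As a sanity check on these constraints, here is a small C sketch that encodes the second initialization sequence as a list of steps and tests "X happens before Y" against it. The step labels and the helper names `step_index` and `loads_before` are made up for this illustration; nothing here is GAP or Julia API.

```c
#include <string.h>

/* Position of `step` in a load sequence, or -1 if it is absent. */
static int step_index(const char *const *seq, int n, const char *step)
{
    for (int i = 0; i < n; i++)
        if (strcmp(seq[i], step) == 0)
            return i;
    return -1;
}

/* True if both steps occur and `first` comes strictly before `second`. */
static int loads_before(const char *const *seq, int n,
                        const char *first, const char *second)
{
    int a = step_index(seq, n, first);
    int b = step_index(seq, n, second);
    return a >= 0 && b >= 0 && a < b;
}

/* Second sequence: GAP starts first; "MPtr" and "GapObj" stand for the
   moments at which those types are created. */
static const char *const scenario2[] = {
    "GAP", "Julia", "MPtr", "JuliaInterface", "GAP.jl", "GapObj"
};
enum { SCENARIO2_LEN = 6 };
```

With this encoding, `loads_before(scenario2, SCENARIO2_LEN, "GAP", "JuliaInterface")`, `("Julia", "GAP.jl")` and `("MPtr", "JuliaInterface")` all hold, but `("GapObj", "MPtr")` does not. So the ordering constraints themselves are satisfied, and the only thing that fails is the extra requirement that the super type exists before MPtr is created.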
Our current “solution” (and the new problems it causes)
So to overcome this, we decided to try and break the cycle by introducing a tiny Julia package GAPTypes.jl, which basically just consists of the definition of the type GapObj. With that, we get these initialization sequences:
1. Julia → GAPTypes.jl (loaded as dep of GAP.jl) → GAP.jl → GAP → GAP creates MPtr type with super type GAPTypes.GapObj → JuliaInterface
2. GAP → Julia → GAPTypes.jl (loaded by the GAP kernel) → GAP creates MPtr type with super type GAPTypes.GapObj → JuliaInterface → GAP.jl
OK, problem solved, right? Well, yes, if all packages in Julia were always loaded into the same global namespace. But as you folks know better than me, that’s not the case; there is in general a difference between the GAPTypes.jl loaded on the global level (e.g. triggered by the GAP kernel) vs. one loaded as a dependency of a package like GAP.jl. In fact GAPTypes.jl might not even be installed in the global Julia environment, meaning sequence 2 above could fail. To work around this, we did some pretty evil things (please don’t hate on me for this, I never liked this, it was simply a quick & dirty hack to get things working now, not meant as a permanent solution): GAP actually installs GAPTypes.jl into the global environment during its compile time (yes, I know, this is fragile and bad for many reasons, sigh), and GAP.jl also tries to do that in its deps/build.jl (yup, yup, gross, nasty, evil; please don’t tar & feather us, we want to reform our sinning ways, that’s why I am writing this post).
What else can we do?
Finally we arrived at the idea described at the top: we drop GAPTypes.jl again, and modify sequence 2 as follows: when GAP starts, it creates the Julia foreign type MPtr with no super type. Then, once GAP.jl has loaded and created the abstract type GapObj, it uses a ccall to notify the GAP kernel about the new type; the GAP kernel then swizzles the super pointer of MPtr to point at GapObj. Prior to this, no MPtr instances created in the GAP kernel were ever visible to the Julia language level; only to the “plumbing” (the GC, and code creating datatypes).
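For readers who want the idea in code, here is a toy model in C of the swizzle. It is a sketch only: `toy_type` is NOT Julia's jl_datatype_t, `toy_any` merely plays the role of the root of the hierarchy (an assumption about what "no super type" means in practice), and `toy_set_super` and `toy_is_subtype` are hypothetical helpers, not functions from julia.h or julia_gcext.h.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a runtime type descriptor; only the `super` link
   and an abstract/concrete flag are modeled. */
typedef struct toy_type {
    const char *name;
    const struct toy_type *super;  /* parent in the type hierarchy */
    int is_abstract;
} toy_type;

/* Root of the toy hierarchy. */
static const toy_type toy_any = { "Any", NULL, 1 };

/* Walk the super chain: is `t` equal to `ancestor` or a descendant of it? */
static int toy_is_subtype(const toy_type *t, const toy_type *ancestor)
{
    for (; t != NULL; t = t->super)
        if (t == ancestor)
            return 1;
    return 0;
}

/* The "swizzle": retarget the super link of an already created type.
   This is only sound while nothing has cached facts derived from the
   old link, which is the open question of this post. */
static void toy_set_super(toy_type *t, const toy_type *new_super)
{
    assert(t != NULL && new_super != NULL);
    assert(new_super->is_abstract);         /* parents must be abstract */
    assert(!toy_is_subtype(new_super, t));  /* refuse to create a cycle */
    t->super = new_super;
}
```

In this model, MPtr would first be created as `{ "MPtr", &toy_any, 0 }`; once GapObj exists as `{ "GapObj", &toy_any, 1 }`, a call `toy_set_super(&mptr, &gapobj)` makes `toy_is_subtype(&mptr, &gapobj)` true. The real runtime has far more state than a single pointer, so this shows the intent of the hack, not evidence that it is safe.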
Of course this is still evil, but I think it is considerably less evil than messing with the global environment and stuff (but see questions 1 and 2 at the top of this post). That said, if anybody has suggestions for how we could solve our problem in a different way, I am all ears (that’s my question 3 at the top).