Zygote hooks into the compiler through generated functions.
As far as I understand, when you call gradient(foo, 1.0, 1.0), it
calls fval, pullback = forward(foo, 1.0, 1.0)
calls pullback(1.0)
The forward(foo, 1.0, 1.0) call hits a generated function, which obtains the CodeInfo of foo(::Float64, ::Float64) and adapts it, as shown for example in my previous post. Zygote returns the pullback as a callable structure
struct Pullback{S,T}
data::T
end
where S is the signature of the differentiated function (in this example S = Tuple{typeof(foo), Float64, Float64}) and the data field contains the pullbacks of the individual statements within foo.
Calling pullback(1.0) will hit another generated function which recursively calls pullbacks.
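To make the structure described above concrete, here is a minimal hand-written sketch of what the two generated functions conceptually produce. It is not Zygote's actual code: the function foo, the statement-level rules for sin and *, the convenience constructor Pullback{S}, and this toy gradient are all assumptions made up for illustration, and everything is written out by hand instead of being generated from the CodeInfo.

```julia
# Hypothetical function to differentiate (not from the original post).
foo(x, y) = sin(x) * y

# Callable struct in the shape described above: S records the signature,
# `data` holds the pullbacks of the individual statements.
struct Pullback{S,T}
    data::T
end
# Convenience constructor (an assumption of this sketch).
Pullback{S}(data::T) where {S,T} = Pullback{S,T}(data)

# Statement-level rules: each returns (value, pullback of that statement).
function forward(::typeof(sin), x)
    back = dy -> (dy * cos(x),)
    return (sin(x), back)
end
function forward(::typeof(*), a, b)
    back = dy -> (dy * b, dy * a)
    return (a * b, back)
end

const FooSig = Tuple{typeof(foo),Float64,Float64}

# What the forward generated function would emit for foo, written by hand.
function forward(::typeof(foo), x::Float64, y::Float64)
    s, back_sin = forward(sin, x)    # %1 = sin(x)
    z, back_mul = forward(*, s, y)   # %2 = %1 * y
    return (z, Pullback{FooSig}((back_sin, back_mul)))
end

# What the pullback generated function would emit: call the stored
# statement pullbacks in reverse order.
function (p::Pullback{FooSig})(dz)
    back_sin, back_mul = p.data
    ds, dy = back_mul(dz)
    (dx,) = back_sin(ds)
    return (dx, dy)
end

# Toy gradient following the two steps listed at the top.
function gradient(f, args...)
    fval, pullback = forward(f, args...)
    return pullback(one(fval))
end

grads = gradient(foo, 1.0, 2.0)
```

Here grads holds the partial derivatives with respect to x and y, that is, y * cos(x) and sin(x).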
This is straightforward and kind of trivial (once one understands it).
Now I wonder whether Diffractor plans to do this differently, given all the work in Core.Compiler.
I can construct typed IRCode when writing the forward part and then wrap it in an OpaqueClosure to execute it. Is this the approach Diffractor plans to use? Relying on a generated function, where I have to return untyped CodeInfo, seems sad after I went through the pain of typing.
And what is the plan for Pullback? Is the idea to wrap it in some structure as well, similar to Zygote?
What kind of optimizations do OpaqueClosures allow? Would they allow aggressive inlining and elimination of dead code? I am very much guessing here.
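For readers who have not seen this route, a minimal sketch of the typed IR to OpaqueClosure path follows. It assumes a recent Julia (1.10 or later) and relies on Base.code_ircode and the Core.OpaqueClosure(::IRCode) constructor, both of which are compiler internals that may change between versions. The function foo is a made-up example, and a real AD transform would rewrite the IR before wrapping it.

```julia
# Hypothetical function, used only for illustration.
foo(x, y) = sin(x) * y

# Typed (and by default optimized) IR for foo(::Float64, ::Float64).
# code_ircode returns a vector of `IRCode => return type` pairs.
ir, rettype = only(Base.code_ircode(foo, (Float64, Float64)))

# An AD pass would transform `ir` here; this sketch wraps it unchanged.
oc = Core.OpaqueClosure(ir)

result = oc(1.0, 2.0)
```

Since the IR is untouched, result should agree with foo(1.0, 2.0).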
Kind of? The primary reason Diffractor uses opaque closures is to give the optimizer semantic license to change the residual (aka tape aka scratch space) and make storage vs. recompute tradeoffs at optimization time. Since the data layout of opaque closures is opaque to the runtime, the optimizer is allowed to mess with it internally, without having to obey any semantic data layout constraints.
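As a rough illustration of the storage vs. recompute tradeoff, the sketch below writes both variants of a pullback for sin by hand, using ordinary closures. The names forward_store and forward_recompute are made up for this example. With ordinary closures the captured values are visible fields of the closure type, so the choice is fixed by whoever wrote the code; the point above is that an opaque closure hides this layout, which is what would let an optimizer pick either variant.

```julia
# Variant A: store cos(x) in the residual at forward time.
function forward_store(x::Float64)
    c = cos(x)
    back = dy -> dy * c          # captures c
    return (sin(x), back)
end

# Variant B: keep only x and recompute cos(x) in the pullback.
function forward_recompute(x::Float64)
    back = dy -> dy * cos(x)     # captures x
    return (sin(x), back)
end

_, back_a = forward_store(0.5)
_, back_b = forward_recompute(0.5)

# Same derivative, different residual.
same = back_a(1.0) ≈ back_b(1.0)

# For ordinary closures the residual layout is observable:
fields_a = fieldnames(typeof(back_a))
fields_b = fieldnames(typeof(back_b))
```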
Currently, there are essentially three ways that Diffractor is used:
1. Via generated function on untyped IR like Zygote
2. On typed IR via its own interpreter, generating code out via OpaqueClosure
3. As a constituent pass of a larger non-public code base
Only 1 and 3 are really tested. 2 is an incomplete experiment. The longer-term goal is to replace it with some sort of compiler plugin mechanism, though we do not have a design at this time. The general problem with relying on the IR->OpaqueClosure pipeline directly is that it’s a bit too low level. You end up having to re-invent dynamic dispatch on top of it, which is obviously not what you want. It is useful for prototyping though.
Diffractor has this as a low level interface. I don’t have any immediate plans to do something more high level there, but I don’t have any problem with it either - it just depends on what users end up using.
Depends whether the opaque closure was constructed directly from IR or via :new_opaque_closure. If constructed from IR, there is very little additional optimization that is allowed to be done at the callsite, though it will generate an efficient specsig call. The best way to think about it is a function pointer. For :new_opaque_closure, if the compiler can see the allocation site, it will do all the usual inlining, constant propagation, etc. stuff as if it were a regular call.
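For contrast with the from-IR route, here is a small sketch of the other construction path, using the experimental Base.Experimental.@opaque macro, which, as far as I know, lowers to :new_opaque_closure. The helper names make_scaler and apply_scaler are made up. Whether the call in apply_scaler actually gets inlined depends on the Julia version and on what inference can see, so treat this as an illustration of the shape of the code, not as a guarantee about optimization.

```julia
# The opaque closure is allocated inside a function, capturing `a`.
make_scaler(a::Float64) = Base.Experimental.@opaque (x::Float64) -> a * x

# Allocation site and call site sit in the same caller, so the compiler
# can in principle see the allocation and treat the call like a regular one.
apply_scaler(a::Float64, x::Float64) = make_scaler(a)(x)

scaled = apply_scaler(2.0, 3.0)
```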
If Diffractor is used “via generated function on untyped IR like Zygote”, then its advantage (from my point of view) stems from the category theory part, which I do not understand yet. This is fine, but I am not sure whether it will inherit the same issues as Zygote.
“On typed IR via its own interpreter, generating code out via OpaqueClosure” is, I think, what I am doing at the moment. Unfortunately, I do not know much about interpreters yet, and what I currently imagine is something like “a new interpreter of Julia”. Since I am producing typed IR, I am invoking Julia’s type inference, which is likely very slow. What I find interesting is that you can catch some type instabilities during code generation, which might help warn users about stupid mistakes.
I cannot comment on 3.
I am constructing opaque closures by calling Core.OpaqueClosure, so I guess the compiler can see that.
Thanks a lot for the answers. I will keep fiddling to try to make the solution workable. I now understand why you told me a few months ago that Diffractor needs people who understand compilers.
Mostly, it might also have slightly faster compile times and the ChainRules integration is marginally less awkward, but this version of Diffractor was never really intended to be a massive improvement over Zygote, which is why nobody really ever pushed to get it adopted widely.