gdalle
April 5, 2024, 7:09pm
21
wsmoses:
The thunks are stateless and can be called as many times as you like.
However, the extra “tape” (really value cache) for the reverse pass may be different, depending on the function (e.g. if a shadow pointer is captured/overwritten/etc).
Thanks! So, to sum up, I can initialize
forw, rev = autodiff_thunk(ReverseSplitWithPrimal, Const{typeof(f)}, ...)
and use those two thunks together as many times as I like. However, as soon as I do
tape, y, shadow_y = forw(Const(f), ...)
the tape is stateful. This means that if I run
rev(Const(f), ..., tape)
once, I’m good, but if I call rev again, there’s no surefire way to trust the result. Correct?
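For concreteness, the workflow summarized above might look like the sketch below, adapted from the pattern in Enzyme's `autodiff_thunk` docstring. The function `f` and the values of `A` and `v` are illustrative and not taken from this thread, and the exact signatures may differ between Enzyme versions. Here `f` overwrites its input, so the reverse pass has to rely on what the forward pass cached.

```julia
using Enzyme

# Illustrative function (not from the thread): reads A[1], then overwrites it.
function f(A, v)
    res = A[1] * v
    A[1] = 0
    return res
end

A = [2.2]
∂A = zero(A)   # shadow buffer that accumulates the gradient w.r.t. A
v = 3.3

# Build the two thunks once; the thunks themselves can be reused.
forw, rev = autodiff_thunk(
    ReverseSplitWithPrimal,
    Const{typeof(f)},
    Active,                 # activity of the return value
    Duplicated{typeof(A)},  # activity of A
    Active{typeof(v)},      # activity of v
)

# Forward pass: returns a fresh tape, the primal result, and the shadow of the result.
tape, y, shadow_y = forw(Const(f), Duplicated(A, ∂A), Active(v))

# Reverse pass: use the tape once, seeding the output adjoint with 1.0.
_, ∂v = rev(Const(f), Duplicated(A, ∂A), Active(v), 1.0, tape)[1]
```

Analytically, `∂v` should equal the original `A[1]` and `∂A` should hold `v`. To differentiate again, the conservative choice is to call `forw` once more and get a fresh tape instead of reusing the old one.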
Yeah, I’m also thinking of adding that to DifferentiationInterface, but Enzyme is the only backend supporting it, so we’ll see.
Right, essentially the thunks are necessary to implement Split-Mode, or to use Enzyme within ChainRules.
The reverse thunk is the pullback, and the forward thunk is a forward function that has not yet been run.
You may be able to query Julia’s effect analysis to see if something is captured.
Alternatively, if desired, Enzyme can expose an API that says whether it is safe to redo the reverse pass with the same tape.
gdalle
April 7, 2024, 5:37am
24
Yeah, but that’s related to my point: if I understand correctly, the pullback closure we generate that way is one-use only. So I think this is at odds with the ChainRules convention, where a pullback can be reused.
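One possible workaround, sketched here only as an assumption and not as something proposed in this thread, is a pullback closure that reruns the forward thunk on every call, so each call gets its own tape. The function `g` and the helper `make_pullback` are hypothetical names, and the `autodiff_thunk` call follows the same pattern as the Enzyme docstring, which may vary by version.

```julia
using Enzyme

g(x) = x * x  # illustrative scalar function, not from the thread

forw, rev = autodiff_thunk(ReverseSplitWithPrimal, Const{typeof(g)}, Active, Active{Float64})

# Hypothetical helper: returns a pullback that can be called repeatedly,
# because each call reruns the forward thunk to obtain a fresh tape.
function make_pullback(x::Float64)
    return function pullback(dy::Float64)
        tape, _, _ = forw(Const(g), Active(x))
        # rev returns a tuple whose first entry holds the input derivatives.
        return rev(Const(g), Active(x), dy, tape)[1][1]
    end
end

pb = make_pullback(3.0)
dx1 = pb(1.0)
dx2 = pb(1.0)  # second call builds and consumes its own tape
```

The trade-off is that the primal is recomputed on every pullback call, which gives up most of the benefit of split mode, so this only makes sense when reusability matters more than speed.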