If that’s the case, then I really think one should heed the advice of option (a): rewriting the code so that it is not conditionally active. It turns out that Enzyme’s activity analysis proved that a lot of code didn’t need to be differentiated, but it was seemingly hindered by this:
Stacktrace:
[1] _trainmap
@ ~/.julia/packages/Optimisers/a4OnF/src/destructure.jl:114
[2] _Trainable_biwalk
@ ~/.julia/packages/Optimisers/a4OnF/src/destructure.jl:110
[3] ExcludeWalk
@ ~/.julia/packages/Functors/LbNAu/src/walks.jl:126
In particular, Enzyme believes it is storing a non-differentiable matrix into a larger data structure which it thinks is differentiable (perhaps conservatively).
@mcabbott does this code ring any bells or give hints as to where something is going awry?