Reverse-Mode AD VJP choices all failed


I am running a neural network with the Flux, Zygote, and Optimization libraries. The code is rather long and I have not yet created an MWE (which could take quite a while). The code ran for about 800 iterations without trouble, and then suddenly generated the following error:

```
┌ Warning: Reverse-Mode AD VJP choices all failed. Falling back to numerical VJPs
└ @ SciMLSensitivity ~/.julia/packages/SciMLSensitivity/6YVpi/src/concrete_solve.jl:115
┌ Warning: dt(9.536743e-7) <= dtmin(9.536743e-7) at t=0.16458714. Aborting. There is either an error in your model specification or the true solution is unstable.
└ @ SciMLBase ~/.julia/packages/SciMLBase/hLrpl/src/integrator_interface.jl:529
┌ Warning: Endpoints do not match. Return code: DtLessThanMin. Likely your time range is not a multiple of `saveat`. sol.t[end]: 0.16458714, ts[end]: 12.0
└ @ SciMLSensitivity ~/.julia/packages/SciMLSensitivity/6YVpi/src/concrete_solve.jl:1401
ERROR: DimensionMismatch: dimensions must match: a has dims (Base.OneTo(62),), b has dims (Base.OneTo(61),), mismatch at 1
```

What might make this happen? It is hard to believe that if automatic differentiation suddenly stops working, a numerical VJP would succeed either.

Could this issue be the result of a stiff system of equations? If so, should I change the numerical integrator? Has anybody come across this error? Thanks.
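In case stiffness is the culprit, here is a minimal sketch of what swapping to a stiff (or auto-switching) solver might look like. All names here (`prob`, the tolerances) are assumptions, not taken from the actual code:

```julia
using OrdinaryDiffEq

# `prob` stands in for the ODEProblem in the actual code (an assumption).
# TRBDF2 is a stiff solver; an explicit method like Tsit5 will shrink dt
# toward dtmin on a stiff problem and abort with DtLessThanMin.
sol = solve(prob, TRBDF2(); abstol = 1e-8, reltol = 1e-8)

# An auto-switching method is another common first try: it runs Tsit5
# and falls back to Rosenbrock23 when stiffness is detected.
# sol = solve(prob, AutoTsit5(Rosenbrock23()))
```

If the stiff solver integrates to `t = 12.0` where the explicit one aborted at `t ≈ 0.165`, stiffness was likely the problem.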


Looks like your function just fails because there’s a dimension mismatch error?

Because the integration aborted early. If you solve it at those parameters I assume you will see the same thing.
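One way to confirm this is to pull out the parameters from the failing iteration and solve the forward problem directly, outside the training loop. A hedged sketch, where `prob`, `p_fail`, and the solver choice are all placeholders for whatever the real code uses:

```julia
using OrdinaryDiffEq

# Rebuild the problem with the parameters from the failing iteration
# (`p_fail` is an assumed name for wherever those values were saved).
prob_fail = remake(prob; p = p_fail)
sol = solve(prob_fail, Tsit5())

# If the forward solve itself blows up at these parameters, the
# sensitivity/VJP warnings are just downstream symptoms.
@show sol.retcode   # expect ReturnCode.DtLessThanMin, matching the warning
@show sol.t[end]    # should stop well short of the intended tspan end
```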

Thank you, Chris,

[quote=“erlebach, post:1, topic:94929”]
ERROR: DimensionMismatch: dimensions must match: a has dims (Base.OneTo(62),), b has dims (Base.OneTo(61),), mismatch at 1
[/quote]

My time step is not 1e-7, so dt reaching that value implies the formation of a singularity, with a commensurate decrease of the timestep until it eventually fell below the prescribed dtmin threshold.

I will delve further. What is frustrating is that it takes time for this error to occur, which makes debugging rather painful.
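Since the failure only appears after ~800 iterations, one debugging aid is to checkpoint the current parameters from the optimization callback every iteration, so the failing iterate can be replayed immediately after a crash. A sketch under assumptions: the `(state, loss)` callback signature of recent Optimization.jl versions, and JLD2 for serialization (any format would do):

```julia
using Optimization, JLD2

function checkpoint_callback(state, loss)
    # Overwrite the checkpoint each iteration; after a crash,
    # "last_params.jld2" holds the parameters that triggered it.
    jldsave("last_params.jld2"; p = state.u, loss = loss)
    return false   # returning false lets the optimization continue
end

# solve(optprob, Adam(); callback = checkpoint_callback)  # names assumed
```

After the crash, `p_fail = load("last_params.jld2", "p")` gives the offending parameters without re-running 800 iterations.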