Nesting different AD methods: how can I add AD calculations inside my loss function when using neural differential equations?

Forward differentiation here does not error, but it ignores the gradients with respect to the derivative term:

```julia
dUdx = map(x -> ForwardDiff.jacobian(x -> U([x[1], 1.0], θ, st)[1], [x]), steps_reg)
```

gives

```text
Warning: `ForwardDiff.jacobian(f, x)` within Zygote cannot track gradients with respect to `f`,
│ and `f` appears to be a closure, or a struct with fields (according to `issingletontype(typeof(f))`).
```
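
For reference, here is a reduced, self-contained sketch of the pattern I am trying to differentiate. The network architecture, the placeholder fit term, and the `steps_reg` values below are just stand-ins for my actual model; the relevant part is the `ForwardDiff.jacobian` call inside the loss and the outer `Zygote.gradient` call:

```julia
using Lux, Zygote, ForwardDiff, ComponentArrays, Random

# Small stand-in for my network U(x, θ, st): 2 inputs -> 2 outputs
U = Lux.Chain(Lux.Dense(2, 16, tanh), Lux.Dense(16, 2))
θ, st = Lux.setup(Random.default_rng(), U)
θ = ComponentArray(θ)                       # flat, AD-friendly parameter vector

steps_reg = range(0.0, 1.0, length = 10)    # points where I regularize dU/dx

function loss(θ)
    # (in my real code there is also an ODE solve / data-fitting term here)
    pred = U([0.5, 1.0], θ, st)[1]          # placeholder θ-dependent fit term
    fit  = sum(abs2, pred)
    # Jacobian of U w.r.t. its first input at each regularization point
    dUdx = map(x -> ForwardDiff.jacobian(x -> U([x[1], 1.0], θ, st)[1], [x]), steps_reg)
    reg  = sum(j -> sum(abs2, j), dUdx)     # toy regularization on dU/dx
    return fit + reg
end

# Reverse (Zygote) over forward (ForwardDiff): this runs, but it triggers the
# warning above and the θ-dependence inside the ForwardDiff closure is not tracked.
grad = Zygote.gradient(loss, θ)[1]
```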

@ChrisRackauckas I noticed this is pretty much the same problem reported in the post Gradient of Gradient in Zygote, except that here I am interested in reverse-over-forward differentiation. There is also a similar thread, Issue with Zygote over ForwardDiff-derivative. However, it is not clear to me what the recommended solution for these cases is, if there is one yet. Both posts are a bit old, so some of their content may be outdated. Do I need to define a new rrule() for this problem in order to make it work?
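
In the meantime, as a sanity check on the reduced sketch above, I believe a gradient computed entirely with ForwardDiff (forward-over-forward) should track the θ-dependence of the inner Jacobian, since nested duals are handled through tags, so it could serve as a reference value. It is not what I want in the end, though, because I would like reverse mode for the outer differentiation:

```julia
using ForwardDiff

# Forward-over-forward reference gradient for the sketch above: nested ForwardDiff
# tracks the θ-dependence of the inner Jacobian (θ must be a flat vector, e.g. a
# ComponentArray). Intended only as a correctness check, not as the solution,
# since the outer gradient here is forward mode rather than reverse mode.
grad_fd = ForwardDiff.gradient(loss, θ)
```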