Direct collocation, stiff dynamics, and discrete adjoints — where does the instability go?

Hi all,

I’ve been experimenting with ModelingToolkit.jl and InfiniteOpt.jl for direct-collocation optimal control problems. My model makes extensive use of smooth Boolean logic built from hyperbolic functions to capture some nonlinearities, and I have noticed that IPOPT often converges to clearly suboptimal trajectories. Given that, I’m trying to reconcile something about stiff dynamics and discrete adjoints.

For a stiff linear system, forward Euler integration is unstable unless the step size is below the stability limit. Writing the same forward-Euler discretization as a global linear system (direct transcription) does not fix this: the solution still exhibits the same exponential growth.
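A minimal sketch of that observation, assuming the scalar test problem x' = a*x with a = -50 and step h = 0.1 (illustrative values, not taken from my actual model). It marches forward Euler step by step, then stacks the same defect equations into one linear system and solves them all at once with a small hand-written dense solver:

```python
# Assumed test problem: x' = a*x, a = -50, x(0) = 1, h = 0.1.
# The forward-Euler growth factor is 1 + h*a = -4, so |factor| > 1.

def march_forward_euler(a, h, n_steps, x0):
    """Time-march x_{k+1} = (1 + h*a) * x_k and return [x_0, ..., x_N]."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append((1.0 + h * a) * xs[-1])
    return xs

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def transcribe_forward_euler(a, h, n_steps, x0):
    """Stack all forward-Euler defects into one linear system and solve it.

    Unknowns are x_1..x_N; row k encodes x_{k+1} - (1 + h*a) * x_k = 0.
    """
    g = 1.0 + h * a
    A = [[0.0] * n_steps for _ in range(n_steps)]
    b = [0.0] * n_steps
    for k in range(n_steps):
        A[k][k] = 1.0
        if k == 0:
            b[k] = g * x0  # the known initial state moves to the right-hand side
        else:
            A[k][k - 1] = -g
    return [x0] + solve_dense(A, b)

a, h, n_steps, x0 = -50.0, 0.1, 10, 1.0
marched = march_forward_euler(a, h, n_steps, x0)
stacked = transcribe_forward_euler(a, h, n_steps, x0)
print(marched[-1], stacked[-1])
```

Both trajectories agree up to rounding, and both end near (-4)**10. The stacked system has exactly one solution, so solving it globally can only reproduce what marching produces.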

In optimal control, however, we typically use implicit schemes (e.g. backward Euler / Radau), which stabilize the primal dynamics. But it’s well known that the corresponding discrete adjoint then behaves like forward Euler on the dual system (since the adjoint propagates via the transpose of the step map), and therefore has unstable eigenvalues.

My confusion is:

If forward Euler instability is not cured by “solving everything at once” for the primal, does that mean we have the same problem for the adjoint inside a direct collocation NLP? In other words, if the discrete adjoint corresponds to an unstable forward scheme, why doesn’t this cause the same kind of blow-up when solving the KKT system? Is what I observed, the solver converging to clearly non-optimal solutions, related to this numerical stiffness of the dual dynamics?

I understand that in collocation we don’t march the adjoint but instead solve a global boundary-value linear system — but I’m struggling to make this precise.

If anyone has a clean linear example or intuition to clarify this distinction, I’d really appreciate it.

Thanks!

No, because it’s not pushing forward through multiple time steps. Stability is a property of how error accumulates, not a property of error in a single step.
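To make this concrete, here is a scalar sketch under assumed values (a = -50, h = 0.1, and a hypothetical terminal cost J = x_N). For backward Euler the step map is x_{k+1} = g*x_k with g = 1/(1 - h*a). The stationarity conditions of the Lagrangian give lam_k = g*lam_{k+1}, anchored at the terminal condition lam_N = dJ/dx_N. Rearranged as lam_{k+1} = (1 - h*a)*lam_k, this is the same relation as forward Euler on the dual ODE lam' = -a*lam, but the terminal condition means it is traversed from the end backward, where each step multiplies by g:

```python
# Assumed scalar problem: x' = a*x with a = -50, backward Euler with h = 0.1,
# and the hypothetical terminal cost J = x_N.
a, h, n_steps = -50.0, 0.1, 10
g = 1.0 / (1.0 - h * a)  # backward-Euler step map: x_{k+1} = g * x_k, here g = 1/6

def final_state(x0):
    """Apply the backward-Euler step map n_steps times."""
    x = x0
    for _ in range(n_steps):
        x = g * x
    return x

def adjoint_backward_sweep():
    """Discrete adjoint: lam_N = dJ/dx_N = 1, then lam_k = g * lam_{k+1}."""
    lam = [0.0] * (n_steps + 1)
    lam[n_steps] = 1.0
    for k in range(n_steps - 1, -1, -1):
        lam[k] = g * lam[k + 1]
    return lam

lam = adjoint_backward_sweep()

# The map x0 -> x_N is linear, so a unit perturbation gives dJ/dx0 up to rounding.
sensitivity = final_state(2.0) - final_state(1.0)

# Read in the forward-time direction, the same relation grows by (1 - h*a) per step.
forward_ratio = lam[1] / lam[0]

print("lam_0 =", lam[0], " dJ/dx0 =", sensitivity)
print("forward-time ratio lam_{k+1}/lam_k =", forward_ratio)
```

In this scalar case lam_0 matches dJ/dx0 and the multipliers shrink toward k = 0. The factor 1 - h*a = 6 appears only when the sequence is read in the forward-time direction, which is not the direction in which the terminal condition propagates.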

So why does the final state exhibit the same numerical damping that one would expect from time-marching with the equivalent discretization?
For example, if I have an essentially undamped oscillator and simulate it with an implicit Euler ODE solver, the accumulated error introduces numerical damping into the solution. The same damping distortion would be present in my final state if I solved my ODE with direct collocation, even though there is no time marching involved.
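A small sketch of what I mean, assuming an undamped oscillator x' = v, v' = -omega^2 * x with illustrative values for omega, h, and the number of steps. It marches implicit Euler, then checks that the marched trajectory satisfies every implicit-Euler defect constraint that a direct transcription would impose. Those constraints have a unique solution for a fixed initial state, so a collocation solve with the same discretization has to return this same damped trajectory:

```python
import math

# Assumed values: one period per time unit, 100 steps per period, 5 periods.
omega, h, n_steps = 2.0 * math.pi, 0.01, 500

def implicit_euler_step(x, v):
    """Closed-form solve of x1 = x + h*v1, v1 = v - h*omega^2*x1."""
    det = 1.0 + (h * omega) ** 2
    return (x + h * v) / det, (v - h * omega**2 * x) / det

# Time marching from x(0) = 1, v(0) = 0.
traj = [(1.0, 0.0)]
for _ in range(n_steps):
    traj.append(implicit_euler_step(*traj[-1]))

def defects(traj):
    """Residuals of the implicit-Euler constraints a transcription would impose."""
    out = []
    for (x0, v0), (x1, v1) in zip(traj, traj[1:]):
        out.append(x1 - x0 - h * v1)
        out.append(v1 - v0 + h * omega**2 * x1)
    return out

def energy(x, v):
    """Quantity conserved by the exact dynamics: omega^2 * x^2 + v^2."""
    return omega**2 * x**2 + v**2

max_defect = max(abs(d) for d in defects(traj))
energy_ratio = energy(*traj[-1]) / energy(*traj[0])
predicted = (1.0 + (h * omega) ** 2) ** (-n_steps)  # per-step factor 1/(1 + h^2*omega^2)

print("max defect:", max_defect)
print("energy ratio:", energy_ratio, " predicted:", predicted)
```

With these values the energy falls to roughly 0.14 of its initial value, even though the exact system conserves it. The defects are zero to rounding, so the damping belongs to the discretization itself and not to the way the equations are solved.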

I can’t weigh in as well as Chris on the numerical stability question, but as to the poor convergence of the trajectory optimization I would venture to guess that if you have discontinuities or strong nonlinearities you probably need adaptive refinement or very good initialization. It’s hard to overstate how sensitive optimizers can be to initial guesses for trajectory optimization, even with direct collocation.

For instance, maybe you can initialize with a simplified version of your system that’s linear or doesn’t have discontinuities or something like that. Or maybe there’s a way you can formulate the problem or constraints that’s closer to convex.

Scaling so that all the decision variables and constraints are O(1) can also make a big difference.