An Idea for Optimization as if Solving ODE

Yeah, there are a bunch of papers doing things like this. It’s just a bad idea: ODE solvers are built to follow a curve accurately, whereas optimizers don’t care about the curve, only the endpoint. Order, adaptivity, etc. are just not relevant. Furthermore, the ODE you’re using is essentially gradient descent, which is a bad optimization method. There are tricks in optimization inspired by the ODE viewpoint (e.g. inertia), so it’s certainly a good perspective to keep in mind, but it’s generally a bad idea to use ODE solvers to minimize things.
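To make the gradient descent and inertia points concrete, here is a rough sketch in Python. It is a toy illustration, not anything from the original proposal. It assumes the ODE in question is the gradient flow x'(t) = -∇f(x(t)), for which one explicit Euler step of size h is exactly one gradient descent step. The test function (an ill-conditioned quadratic), the function names, and the step size and momentum values are all my own choices for illustration; the parameter values follow the classical tuning for quadratics with curvatures between 1 and 100.

```python
def grad(x):
    # Gradient of the toy objective f(x) = 0.5 * (x[0]**2 + 100 * x[1]**2).
    # The curvatures 1 and 100 are an assumption chosen to make it ill-conditioned.
    return [x[0], 100.0 * x[1]]


def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5


def euler_gradient_flow(x0, h, tol=1e-6, max_iter=100_000):
    """Explicit Euler on x' = -grad f(x), i.e. plain gradient descent."""
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, k
        x = [x[i] - h * g[i] for i in range(2)]
    return x, max_iter


def heavy_ball(x0, h, beta, tol=1e-6, max_iter=100_000):
    """Gradient step plus an inertia term beta * (x_k - x_{k-1})."""
    x = list(x0)
    x_prev = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, k
        x_next = [x[i] - h * g[i] + beta * (x[i] - x_prev[i]) for i in range(2)]
        x_prev, x = x, x_next
    return x, max_iter


x0 = [1.0, 1.0]
# Step sizes tuned for curvatures mu = 1 and L = 100 (illustrative values).
x_gd, k_gd = euler_gradient_flow(x0, h=2.0 / 101.0)
x_hb, k_hb = heavy_ball(x0, h=4.0 / 121.0, beta=(9.0 / 11.0) ** 2)

print("Euler / gradient descent iterations:", k_gd)
print("heavy ball (inertia) iterations:    ", k_hb)
```

On this problem the inertia variant should need far fewer iterations than the Euler/gradient descent loop. Note that the heavy-ball iterates make no attempt to track the gradient flow trajectory; they only aim for the minimizer, which is the whole point.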
