Thanks for the review paper. However, the early replies assumed that I don't know the field or am just getting started. I know roughly how the field evolved, and that I should not use steepest descent everywhere. My example is deliberately naive, chosen for simplicity and fast prototyping. I implemented quasi-Newton methods myself to some extent while taking an optimization course, though of course there is plenty of room for me to improve.

As for abandoning SD, we can still build `du` from the conjugate-gradient method or from a damped/momentum dynamics method. I just wanted to explore the idea of putting an ODE solver in between, and to get feedback on similar research or pointers to papers about it, not plain objections without concrete references, as in the first couple of responses.
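
As a rough illustration of what I mean by building `du` from damped/momentum dynamics, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration only: the toy quadratic objective, the damping value `gamma`, the hand-written fixed-step RK4 (standing in for a proper adaptive ODE solver), and the names `grad_f`, `rk4_step` and `minimize_by_flow`. The dynamics are x'' + gamma*x' + grad f(x) = 0, rewritten as a first-order system in u = [x, v].

```python
def grad_f(x):
    # Gradient of the assumed toy objective
    # f(x) = 0.5*(x0**2 + 10*x1**2) - x0 - x1, whose minimizer is (1, 0.1).
    return [x[0] - 1.0, 10.0 * x[1] - 1.0]


def du(u, gamma=2.0):
    # State u = [x0, x1, v0, v1].
    # Damped (momentum) dynamics: x' = v, v' = -gamma*v - grad f(x).
    x, v = u[:2], u[2:]
    g = grad_f(x)
    return v + [-gamma * vi - gi for vi, gi in zip(v, g)]


def rk4_step(rhs, u, h):
    # One classical fourth-order Runge-Kutta step of size h.
    k1 = rhs(u)
    k2 = rhs([ui + 0.5 * h * ki for ui, ki in zip(u, k1)])
    k3 = rhs([ui + 0.5 * h * ki for ui, ki in zip(u, k2)])
    k4 = rhs([ui + h * ki for ui, ki in zip(u, k3)])
    return [
        ui + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for ui, a, b, c, d in zip(u, k1, k2, k3, k4)
    ]


def minimize_by_flow(x0, h=0.05, steps=2000):
    # Start at rest (zero velocity) and integrate until the damping
    # has removed the kinetic energy; the position then sits near a minimizer.
    u = list(x0) + [0.0, 0.0]
    for _ in range(steps):
        u = rk4_step(du, u, h)
    return u[:2]


x_min = minimize_by_flow([5.0, -3.0])
print(x_min)
```

Swapping `du` for a plain `-grad_f(x)` right-hand side would give the steepest-descent flow instead, so the solver loop stays the same and only the right-hand side changes.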