I'm trying to use Ipopt via NLPModelsIpopt.jl to solve large trajectory optimization problems. It is essentially an optimal control problem: minimize $\int_{0}^{T} g(x(t), u(t)) \, dt$ subject to the dynamics $\dot{x} = f(x(t), u(t), t)$, where $u(t)$ is the control to be optimized. I am using a Hermite-Simpson discretization. Due to the partially separable nature of the constraints, I can compute the constraint Jacobian and Hessian-of-Lagrangian by computing the contribution from each timestep and piecing the contributions together manually. So I generate fast code for the per-timestep derivatives using SymPy's common subexpression elimination, and then assemble these into the full sparse Jacobian/Hessian.
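To illustrate the assembly, here is a minimal sketch (hypothetical names, and purely block-diagonal for simplicity; in the real Hermite-Simpson case adjacent timesteps share variables, so the per-step index blocks overlap):

```julia
# Minimal sketch (hypothetical names): the per-timestep sparsity stencil
# (rows_k, cols_k) is fixed, so the global pattern is built once by shifting
# it for each timestep; during the solve only the values vector is refilled.
function hess_structure_blocks(nsteps::Int, vars_per_step::Int,
                               rows_k::Vector{Int}, cols_k::Vector{Int})
    nnz_k = length(rows_k)
    rows = Vector{Int}(undef, nsteps * nnz_k)
    cols = Vector{Int}(undef, nsteps * nnz_k)
    for k in 1:nsteps
        off  = (k - 1) * nnz_k          # where step k's entries go
        voff = (k - 1) * vars_per_step  # offset of step k's first variable
        for i in 1:nnz_k
            rows[off + i] = rows_k[i] + voff
            cols[off + i] = cols_k[i] + voff
        end
    end
    return rows, cols
end
```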
The final derivative oracle implementation is reasonably fast, able to fill ~1e7 nonzero Hessian entries in ~0.1 s (although not allocation-free). When I then use NLPModelsIpopt to solve the actual problem (Ipopt with the HSL MA97 linear solver), it works well for small problems of around ~1e6 nonzero Hessian entries. However, as I scale up to problems with ~5e6 nonzero Hessian entries, Ipopt frequently hangs for extended periods during the solve, and eventually the kernel kills the process with an out-of-memory error. Given that I ultimately need to scale up to problems with ~1.5e7 nonzero Hessian entries, this is a big issue.
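For reference, the solve call looks roughly like this (a minimal sketch with a toy stand-in model; keyword arguments to `ipopt` are forwarded to Ipopt as options, and `ma97` requires an HSL-enabled Ipopt build):

```julia
using ADNLPModels, NLPModelsIpopt

# Toy stand-in model just to make the call concrete; the real `nlp` is the
# trajectory NLPModel with the hand-assembled sparse derivatives.
nlp = ADNLPModel(x -> (x[1] - 1)^2 + (x[2] - 2)^2, zeros(2))

# Keyword arguments are forwarded to Ipopt as options.
stats = ipopt(nlp; linear_solver = "ma97", print_level = 5)
println(stats.status, ", objective = ", stats.objective)
```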
Does anyone have an idea as to what might be going wrong? I'm happy to share my code as needed. Any help would be greatly appreciated!
Here are example logs from NLPModels and Ipopt. The slight discrepancy in the number of Hessian nonzeros can be chalked up to the fixed variables in the problem: when the fixed variables are substituted out, many of the Hessian entries become structural zeros. Accounting for this, the number of nonzeros is correct.
At the start of the Ipopt solve, RAM usage is at ~20% of maximum (running on a laptop with 16 GB RAM total), but over the course of the solve the RAM usage balloons to ~80% before the kernel kills the process. This makes me suspect that the garbage collector is not keeping up. My derivative oracles allocate many array views throughout the solve, and these are the main source of allocations; the per-timestep derivative computations themselves are non-allocating, but the views seem necessary to fill the problem Hessian/Jacobian/variable arrays.
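To check whether these view allocations are really what's growing, one quick test (a sketch using the standard NLPModels in-place API on a toy stand-in model; the real check would use my trajectory model) is to measure per-call allocations of the oracles in isolation:

```julia
using ADNLPModels, NLPModels

# Toy constrained stand-in; the real measurement would use the trajectory model.
nlp  = ADNLPModel(x -> sum(abs2, x), ones(3), x -> [sum(x)], [1.0], [1.0])
vals = zeros(nlp.meta.nnzh)
y    = ones(nlp.meta.ncon)

hess_coord!(nlp, nlp.meta.x0, y, vals)             # warm-up (compilation)
bytes = @allocated hess_coord!(nlp, nlp.meta.x0, y, vals)
println("hess_coord! allocates $bytes bytes per call")
```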
Apart from that, you could try profiling the memory use of each function and reducing allocations. One thing you can do to avoid creating arrays inside the function calls is to store them inside the VarAssimProblem structure, e.g. as in AugLagModel.jl. They won't get deallocated, but if your issue is allocating too many of them, this might help.
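A minimal sketch of that pattern (hypothetical struct and field names, following the style of AugLagModel.jl): the scratch arrays are allocated once at construction and reused by every oracle call.

```julia
using NLPModels

# Sketch (hypothetical names): scratch buffers live in the model struct, so
# they are allocated once and reused instead of being created per oracle call.
mutable struct TrajOptModel <: AbstractNLPModel{Float64, Vector{Float64}}
    meta::NLPModelMeta{Float64, Vector{Float64}}
    counters::Counters
    step_vals::Vector{Float64}   # one timestep's Hessian entries, reused
end

function TrajOptModel(x0::Vector{Float64}, ncon::Int, nnzh::Int, nnz_per_step::Int)
    # Equality constraints (collocation defects), so lcon = ucon = 0.
    meta = NLPModelMeta(length(x0); x0 = x0, ncon = ncon,
                        lcon = zeros(ncon), ucon = zeros(ncon), nnzh = nnzh)
    return TrajOptModel(meta, Counters(), zeros(nnz_per_step))
end
```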