I’m stumped, and I’m not sure if my problem is a lack of understanding of the L-BFGS implementation or the Optim.jl interface.
I’m trying to figure out how to create a preconditioner that is equivalent to the approximate Hessian that L-BFGS had built up by the time it converged on a previous input value.
The goal is to use the previous approximate Hessian to speed up convergence while bifurcating. I’m following a paper that rolled their own L-BFGS in C++, and their code isn’t available. The paper says that by reusing the previous approximate Hessian they decreased the number of iterations by factors of 10-20 in many cases, so I’m really interested in this optimization. I’m not sure if this is relevant, but they also store a number of Hessian updates equal to 2× the number of columns of the Hessian, which is what I set `m` to in the `LBFGS` options (see the sketch below).
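
For reference, this is roughly how I’m setting it up now; `f`, `g!`, and `n` here are placeholders standing in for my actual objective, the expensive PDE/FFT gradient, and the problem size:

```julia
using Optim

n = 100                     # placeholder for my real problem dimension
f(x)     = sum(abs2, x)     # stand-in for the real objective
g!(G, x) = (G .= 2 .* x)    # stand-in for the expensive PDE + FFT gradient

x0  = randn(n)
res = optimize(f, g!, x0, LBFGS(m = 2n))  # m = 2× the Hessian's column count
```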
I was considering using `precondprep` to update the preconditioner, but since `precondprep` only passes the current value of `x`, this would require me to recompute the gradient, which involves solving a PDE and a handful of FFTs. It doesn’t make sense to increase the number of calls to the most expensive routine when its result has already been computed.
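
To make it concrete, here is a minimal sketch of the interface I mean, reusing `f`, `g!`, `x0`, and `n` from the sketch above (the diagonal preconditioner and its update rule are just placeholders):

```julia
using Optim, LinearAlgebra

# P can be any type supporting ldiv!(out, P, g) and dot(x, P, y);
# a Diagonal satisfies both out of the box.
P0 = Diagonal(ones(n))

method = LBFGS(
    m = 2n,
    P = P0,
    # precondprep only receives (P, x): no gradient, no L-BFGS history,
    # so any gradient-based update here would mean re-solving the PDE.
    precondprep = (P, x) -> (P.diag .= 1.0),  # placeholder update
)
res = optimize(f, g!, x0, method)
```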
I’m hoping I’m just missing something in the docs or source, but I’ve already pored over both, as well as the forums, so hopefully someone knows how to do this.
Let me know if I need to clarify anything. Thanks!