Optim L-BFGS() - Can I save internal state?

#1

L-BFGS often has not converged when it stops after a few hours of crunching away, because I specified too few iterations. I would then like to continue without restarting from scratch, i.e. without L-BFGS having to rebuild its approximation to the Hessian. Is there an option to save the internal state when L-BFGS quits, or is such an option being contemplated?

Thanks!


#2

Yes, that is possible. Let me cook up a simple example.

Edit 2: Let me warn you! This is unexported for a reason. We feel free to change the storage formats, names, etc. of internally used types and objects.

Edit:

This belongs to the more advanced part of Optim. You need to create four things: an objective instance, a method instance, an options instance, and, crucially, the state instance. Pass them all to optimize as shown below. Afterwards the state object holds all the cache variables from the optimization routine, so you can pass everything to optimize once more and it will continue where it left off (of course, you have to adjust your starting point).

julia> using Optim

julia> prob = Optim.UnconstrainedProblems.examples["Himmelblau"]
Optim.UnconstrainedProblems.OptimizationProblem("Himmelblau", Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, Optim.UnconstrainedProblems.himmelblau_hessian!, [2.0, 2.0], [3.0, 2.0], true, true)

julia> optimize(prob.f, prob.g!, prob.initial_x, LBFGS())
Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [2.0,2.0]
 * Minimizer: [3.000000000002841,2.0000000000044156]
 * Minimum: 8.809762e-22
 * Iterations: 7
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 3.93e-07 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 3.67e+09 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.99e-10 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 22
 * Gradient Calls: 22

julia> options = Optim.Options()
Optim.Options{Float64,Void}(1.0e-32, 1.0e-32, 1.0e-8, 0, 0, 0, false, 1000, false, false, false, 1, nothing, NaN)

julia> x0 = prob.initial_x
2-element Array{Float64,1}:
 2.0
 2.0

julia> obj = OnceDifferentiable(prob.f, prob.g!, x0)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, NLSolversBase.fg!, 0.0, [0.0, 0.0], [NaN, NaN], [NaN, NaN], [0], [0])

julia> m = LBFGS()
Optim.LBFGS{Void,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64},Optim.##73#75}(10, LineSearches.InitialStatic{Float64}
  alpha: Float64 1.0
  scaled: Bool false
, LineSearches.HagerZhang{Float64}
  delta: Float64 0.1
  sigma: Float64 0.9
  alphamax: Float64 Inf
  rho: Float64 5.0
  epsilon: Float64 1.0e-6
  gamma: Float64 0.66
  linesearchmax: Int64 50
  psi3: Float64 0.1
  display: Int64 0
, nothing, Optim.#73, Optim.Flat(), true)

julia> obj = OnceDifferentiable(prob.f, prob.g!, x0)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, NLSolversBase.fg!, 0.0, [NaN, 6.92786e-310], [NaN, NaN], [NaN, NaN], [0], [0])

julia> lbfgsstate = Optim.initial_state(m, options, obj, x0)
Optim.LBFGSState{Float64,1,2,Array{Float64,1}}([2.0, 2.0], [6.92786e-310, 6.92786e-310], [0.0, 0.0], [6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310], [6.92786e-310 6.92788e-310 … 6.92786e-310 6.92788e-310; 6.92786e-310 6.92788e-310 … 6.92788e-310 6.92788e-310], [6.92785e-310 6.92785e-310 … 6.92785e-310 6.92785e-310; 6.92785e-310 6.92785e-310 … 6.92785e-310 6.92785e-310], [1.0738e-319, 6.92788e-310], [NaN, 6.92788e-310], [6.92786e-310, 6.92788e-310], NaN, [1.0739e-319, 6.92788e-310], [6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310], 0, [NaN, 6.92788e-310], NaN, [6.92786e-310, 6.92786e-310], 1.0, false, LineSearches.LineSearchResults{Float64}(Float64[], Float64[], Float64[], 0))

julia> optimize(obj, x0, m, options, lbfgsstate)
Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [2.0,2.0]
 * Minimizer: [3.000000000002841,2.0000000000044156]
 * Minimum: 8.809762e-22
 * Iterations: 7
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 3.93e-07 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 3.67e+09 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.99e-10 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 22
 * Gradient Calls: 22

julia> lbfgsstate.dg_history
2×10 Array{Float64,2}:
 21.5201  18.1824    5.41175  -3.3149    0.20521   -0.00456848  6.92785e-310  6.92785e-310  6.92785e-310  6.92785e-310
 19.439    4.52699  -3.58513  -2.25636  -0.121299  -0.00316439  6.92785e-310  6.92785e-310  6.92785e-310  6.92785e-310
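
To make the "continue where it left off" part concrete, here is a minimal sketch of a stop-and-resume run. It uses the same calls as the session above (OnceDifferentiable, LBFGS, Optim.Options, Optim.initial_state, and the five-argument optimize), but a few things are my own assumptions: Himmelblau's function and gradient are written out by hand as himmelblau and himmelblau_gradient! instead of coming from Optim's bundled examples, the iteration budgets (3 and 1000) are arbitrary, and the restart point is taken from Optim.minimizer of the first result. Since Optim.initial_state is unexported, treat this as an illustration that may need adjusting for your Optim version.

```julia
using Optim

# Himmelblau's function and gradient, written by hand for this sketch
# (assumption: not taken from Optim's bundled test problems).
himmelblau(x) = (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

function himmelblau_gradient!(G, x)
    a = x[1]^2 + x[2] - 11
    b = x[1] + x[2]^2 - 7
    G[1] = 4 * x[1] * a + 2 * b
    G[2] = 2 * a + 4 * x[2] * b
    return G
end

x0 = [2.0, 2.0]
obj = OnceDifferentiable(himmelblau, himmelblau_gradient!, x0)
m = LBFGS()

# First run: deliberately cut short by a small iteration budget.
short_options = Optim.Options(iterations = 3)
# Unexported internal; its name and fields may change between releases.
state = Optim.initial_state(m, short_options, obj, x0)
res1 = optimize(obj, x0, m, short_options, state)

# Second run: reuse the objective, method and state, but start from the
# point the first run reached and allow more iterations.
x1 = copy(Optim.minimizer(res1))
long_options = Optim.Options(iterations = 1000)
res2 = optimize(obj, x1, m, long_options, state)

println("first run:  ", Optim.iterations(res1), " iterations, converged = ", Optim.converged(res1))
println("second run: ", Optim.iterations(res2), " iterations, converged = ", Optim.converged(res2))
println("minimizer:  ", Optim.minimizer(res2))
```

The key point is that state is created once and handed to both optimize calls, so the history stored in it (for example state.dg_history, as inspected above) is still there when the second call starts. Whether the resumed run follows exactly the same path as an uninterrupted one depends on Optim internals, so I would not rely on that.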