Optim L-BFGS() - Can I save internal state?

egcjm · January 25, 2018, 1:39am

Oftentimes, L-BFGS has not converged when it stops after a few hours of crunching away, because I specified too few iterations. I would then like to continue without having to restart from scratch, i.e. without L-BFGS rebuilding its approximation to the Hessian. Is there an option to save the internal state when L-BFGS quits, or is such an option being contemplated?

Thanks!

pkofod · January 25, 2018, 11:32am

Yes, that is possible. Let me cook up a simple example.

Edit 2: Let me warn you! This is unexported for a reason. We feel free to change the storage formats, names, etc. of internally used types and objects.

Edit:

So this is in the more advanced section of the Optim store. You need to create an objective instance, a method instance, an options instance, and, crucially, the state instance. Then you simply pass them all to optimize as shown below. Because you hold on to the state object, you have all the cache variables from the optimization routine at hand afterwards. To continue, pass everything once more and it will pick up where it left off (of course, you have to adjust your starting point).

julia> using Optim

julia> prob = Optim.UnconstrainedProblems.examples["Himmelblau"]
Optim.UnconstrainedProblems.OptimizationProblem("Himmelblau", Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, Optim.UnconstrainedProblems.himmelblau_hessian!, [2.0, 2.0], [3.0, 2.0], true, true)

julia> optimize(prob.f, prob.g!, prob.initial_x, LBFGS())
Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [2.0,2.0]
 * Minimizer: [3.000000000002841,2.0000000000044156]
 * Minimum: 8.809762e-22
 * Iterations: 7
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 3.93e-07 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 3.67e+09 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.99e-10 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 22
 * Gradient Calls: 22

julia> options = Optim.Options()
Optim.Options{Float64,Void}(1.0e-32, 1.0e-32, 1.0e-8, 0, 0, 0, false, 1000, false, false, false, 1, nothing, NaN)

julia> x0 = prob.initial_x
2-element Array{Float64,1}:
 2.0
 2.0

julia> obj = OnceDifferentiable(prob.f, prob.g!, x0)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, NLSolversBase.fg!, 0.0, [0.0, 0.0], [NaN, NaN], [NaN, NaN], [0], [0])

julia> m = LBFGS()
Optim.LBFGS{Void,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64},Optim.##73#75}(10, LineSearches.InitialStatic{Float64}
  alpha: Float64 1.0
  scaled: Bool false
, LineSearches.HagerZhang{Float64}
  delta: Float64 0.1
  sigma: Float64 0.9
  alphamax: Float64 Inf
  rho: Float64 5.0
  epsilon: Float64 1.0e-6
  gamma: Float64 0.66
  linesearchmax: Int64 50
  psi3: Float64 0.1
  display: Int64 0
, nothing, Optim.#73, Optim.Flat(), true)

julia> obj = OnceDifferentiable(prob.f, prob.g!, x0)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(Optim.UnconstrainedProblems.himmelblau, Optim.UnconstrainedProblems.himmelblau_gradient!, NLSolversBase.fg!, 0.0, [NaN, 6.92786e-310], [NaN, NaN], [NaN, NaN], [0], [0])

julia> lbfgsstate = Optim.initial_state(m, options, obj, x0)
Optim.LBFGSState{Float64,1,2,Array{Float64,1}}([2.0, 2.0], [6.92786e-310, 6.92786e-310], [0.0, 0.0], [6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310], [6.92786e-310 6.92788e-310 … 6.92786e-310 6.92788e-310; 6.92786e-310 6.92788e-310 … 6.92788e-310 6.92788e-310], [6.92785e-310 6.92785e-310 … 6.92785e-310 6.92785e-310; 6.92785e-310 6.92785e-310 … 6.92785e-310 6.92785e-310], [1.0738e-319, 6.92788e-310], [NaN, 6.92788e-310], [6.92786e-310, 6.92788e-310], NaN, [1.0739e-319, 6.92788e-310], [6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310, 6.92786e-310], 0, [NaN, 6.92788e-310], NaN, [6.92786e-310, 6.92786e-310], 1.0, false, LineSearches.LineSearchResults{Float64}(Float64[], Float64[], Float64[], 0))

julia> optimize(obj, x0, m, options, lbfgsstate)
Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [2.0,2.0]
 * Minimizer: [3.000000000002841,2.0000000000044156]
 * Minimum: 8.809762e-22
 * Iterations: 7
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 3.93e-07 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 3.67e+09 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.99e-10 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 22
 * Gradient Calls: 22

julia> lbfgsstate.dg_history
2×10 Array{Float64,2}:
 21.5201  18.1824    5.41175  -3.3149    0.20521   -0.00456848  6.92785e-310  6.92785e-310  6.92785e-310  6.92785e-310
 19.439    4.52699  -3.58513  -2.25636  -0.121299  -0.00316439  6.92785e-310  6.92785e-310  6.92785e-310  6.92785e-310
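
The transcript above runs to convergence in a single call, so here is a minimal sketch of the actual "continue where it left off" step. It only uses the calls shown in the transcript plus the `iterations` keyword of `Optim.Options` and the `Optim.minimizer`, `Optim.iterations` and `Optim.converged` accessors. The Rosenbrock function, the iteration limits, and the variable names are illustrative assumptions, not part of the original example, and since `Optim.initial_state` is unexported, this may need adjusting for other Optim versions.

```julia
using Optim

# Illustrative test problem (an assumption, not from the thread): Rosenbrock and its in-place gradient.
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function rosenbrock_gradient!(G, x)
    G[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    G[2] = 200.0 * (x[2] - x[1]^2)
    return G
end

x0 = [-1.2, 1.0]
method = LBFGS()
obj = OnceDifferentiable(rosenbrock, rosenbrock_gradient!, x0)

# Deliberately small iteration budget so the first run stops before converging.
short_options = Optim.Options(iterations = 3)

# Unexported internal: creates the cache (including the L-BFGS history) that optimize mutates.
state = Optim.initial_state(method, short_options, obj, x0)
first_run = optimize(obj, x0, method, short_options, state)

# Continue with the same objective, method and state, starting from where the first run stopped.
x1 = copy(Optim.minimizer(first_run))
long_options = Optim.Options(iterations = 1000)
second_run = optimize(obj, x1, method, long_options, state)

println(Optim.iterations(first_run), " iterations, then ", Optim.iterations(second_run), " more")
println("converged after continuing: ", Optim.converged(second_run))
```

The same `state` object is passed to both calls, so the history stored in it (such as `dg_history` above) is not rebuilt for the second run. Only the starting point and the options change.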
