Why is Optim.optimize() throwing an exception if I provide initial_invH?

I’m trying to implement the EM algorithm to estimate parameters in a large model. The EM algorithm requires a lot of back and forth: updating some parameters of interest, then optimizing an objective function based on the current state of the parameter values. The latter step I am carrying out with Optim.optimize().

I am trying to provide a “current” best guess of the (inverse) Hessian matrix when calling Optim.optimize() with BFGS(). The specific problem is that passing this best-guess initial inverse Hessian results in a MethodError being thrown, i.e.:

opt2 = optimize(func, Optim.minimizer(opt1), 
                BFGS(initial_invH =  opt1.trace[end].metadata["~inv(H)"]), 
                Optim.Options(show_trace = true) )

# Note: opt1 comes from the optimization at a previous iteration of the routine.

results in

MethodError: objects of type Matrix{Float64} are not callable
Use square brackets for indexing an Array.
initial_state(method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Matrix{Float64}, Nothing, Flat}, options::Optim.Options{Float64, Nothing}, d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, initial_x::Vector{Float64}) at bfgs.jl:97
optimize(d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, initial_x::Vector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Matrix{Float64}, Nothing, Flat}, options::Optim.Options{Float64, Nothing}) at optimize.jl:35
top-level scope at mwe-provide-invH.jl:28
eval at boot.jl:360 [inlined]

A minimal working example is below for reproducibility. It is not the EM algorithm I am implementing in my actual code, but it raises the same exception. Any assistance is, of course, greatly appreciated, as being able to provide an initial inverse Hessian would greatly speed up the execution of my code.

## Following minimum working example is a modified version of what appears in 
# https://julianlsolvers.github.io/Optim.jl/stable/#examples/generated/maxlikenlm/
using Optim, NLSolversBase, Random
using LinearAlgebra: diag
Random.seed!(0)

n = 500                             # Number of observations
nvar = 2                            # Number of variables
β = ones(nvar) * 3.0                # True coefficients
x = [ones(n) randn(n, nvar - 1)]    # X matrix of explanatory variables plus constant
ε = randn(n) * 0.5                  # Error term (standard deviation 0.5)
y = x * β + ε;                      # Generate Data

function Log_Likelihood(X, Y, β, log_σ)
    σ = exp(log_σ)                  # parameterize σ on the log scale to keep it positive
    llike = -n/2*log(2π) - n/2*log(σ^2) - sum((Y - X * β).^2) / (2σ^2)
    return -llike                   # negative log-likelihood, since optimize() minimizes
end

func = OnceDifferentiable(vars -> Log_Likelihood(x, y, vars[1:nvar], vars[nvar + 1]),
                           ones(nvar+1); autodiff=:forward);
                           
# extended_trace = true is what stores "~inv(H)" in the trace metadata at each iteration
opt1 = optimize(func, ones(nvar+1), 
                BFGS(), 
                Optim.Options(store_trace = true, extended_trace = true, iterations = 3))

opt2 = optimize(func, Optim.minimizer(opt1), 
                BFGS(initial_invH =  opt1.trace[end].metadata["~inv(H)"]), 
                Optim.Options(show_trace = true) )

Thanks for taking the time to construct an MWE. I think that constructor expects initial_invH to be a function (so the initial inverse Hessian can depend on x), so something like

opt2 = optimize(func, Optim.minimizer(opt1),
                BFGS(initial_invH = _ -> opt1.trace[end].metadata["~inv(H)"]),
                Optim.Options(show_trace = true) )

would work (note the _ ->).
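
If you prefer, you can also bind the matrix to a name first and hand back a copy from the closure. The copy is purely defensive (I haven’t checked whether this matrix aliases state that BFGS later updates in place), so drop it if it turns out to be unnecessary:

invH0 = opt1.trace[end].metadata["~inv(H)"]
opt2 = optimize(func, Optim.minimizer(opt1),
                BFGS(initial_invH = _ -> copy(invH0)),
                Optim.Options(show_trace = true))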


Thanks @Tamas_Papp. This worked. Now that I know what to look for, I can see how you arrived at that from the documentation in ?BFGS:

Constructor

BFGS(; alphaguess = LineSearches.InitialStatic(),
       linesearch = LineSearches.HagerZhang(),
       initial_invH = x -> Matrix{eltype(x)}(I, length(x), length(x)),
       manifold = Flat())
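
So the default is itself a function that builds an identity matrix sized from the starting point, and any replacement has to be callable in the same way. Just to check my understanding, a purely illustrative variant (it needs `using LinearAlgebra: I`) starting from a scaled identity would be

BFGS(initial_invH = x -> 0.5 * Matrix{eltype(x)}(I, length(x), length(x)))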

I wonder if there is a way for applied users to contribute concrete examples to the documentation, to elaborate on arguments like this one. I really like Julia and would love to promote it to applied researchers in my field as an alternative to R, but I think they would have a difficult time without more explicit hand-holding in the documentation and help files. I know the developers are tied up with other important things, however.


Sure, just make a pull request on GitHub. If you only want to edit docstrings, you may be able to do that entirely through the web interface.


The contribution would be very welcome :slight_smile:
