Optim.jl - get the hessian matrix from maximum likelihood estimation

There exists an example at Optim.jl’s tutorial, I copy the code here.

using Optim, NLSolversBase, Random
using LinearAlgebra: diag
Random.seed!(0);                            # Fix random seed generator for reproducibility

n = 500                             # Number of observations
nvar = 2                            # Number of variables
β = ones(nvar) * 3.0                # True coefficients
x = [ones(n) randn(n, nvar - 1)]    # X matrix of explanatory variables plus constant
ε = randn(n) * 0.5                  # Error variance
y = x * β + ε;                      # Generate Data

function Log_Likelihood(X, Y, β, log_σ)
    σ = exp(log_σ)
    llike = -n/2*log(2π) - n/2* log(σ^2) - (sum((Y - X * β).^2) / (2σ^2))
    llike = -llike
end

func = TwiceDifferentiable(vars -> Log_Likelihood(x, y, vars[1:nvar], vars[nvar + 1]),
                           ones(nvar+1); autodiff=:forward);

opt = optimize(func, ones(nvar+1))

parameters = Optim.minimizer(opt)

parameters[nvar+1] = exp(parameters[nvar+1])

numerical_hessian = hessian!(func,parameters)

var_cov_matrix = inv(numerical_hessian)

β = parameters[1:nvar]

temp = diag(var_cov_matrix)
temp1 = temp[1:nvar]

t_stats = β./sqrt.(temp1)

# This file was generated using Literate.jl, https://github.com/fredrikekre/Literate.jl

as you see, Log_Likelihood function accepts log\_\sigma as one of parameters, so the estimated parameters should also be log\_\sigma, however, if you change log\_\sigma to \sigma by parameters[nvar+1] = exp(parameters[nvar+1]), then the hessian matrix is not right, as Log_Likelihood 's input is log\_\sigma, am I right?

I can’t figure out why to take exponential to log\_\sigma before calculating hessian matrix.

I think you are right.
Try to make 2 functions:

function Log_Likelihood(X, Y, β, σ)
    llike = -n/2*log(2π) - n/2* log(σ^2) - (sum((Y - X * β).^2) / (2σ^2))
    llike = -llike
end
function Log_Likelihood_opt(X, Y, β, log_σ)
    σ = exp(log_σ)
    Log_Likelihood(X, Y, β, σ)
end

Optimize second, and find Hessian with first.

2 Likes

I think you are correct that the example is in error. It would be good to raise an issue at Optim.jl to call attention to the problem. The solution of using two functions is reasonable, but another way would be to use the delta method, which is often employed when computing transformations of estimated parameters.

Actually, the Hessian matrix for the linear regression model with normality is block diagonal, see https://www.cemfi.es/~arellano/Likelihood.pdf, page 7, for example. So, the confusion with the standard error of sigma will not affect the estimated standard errors of the coefficients, which is what the example focuses on. The t- statistics at the end of the example are correct, I believe. However, if one were to look at the estimated standard error of sigma, it would be incorrect.

Final note: the t-statistics are not correct, as can be verified by computing the OLS estimator. This is because the Hessian is computed using the estimate of sigma, rather than log sigma. I have submitted a PR to fix this.

1 Like