Retrieving the Hessian, and eventually calculating the standard errors using JuMP

After obtaining the optimum, I am interested in finding the Hessian. I then intend to use this to find standard errors. More like what is done here, but that code refers to old JuMP and I am having trouble translating it to new JuMP. If somebody has any illustrations with some other code, or is knowledgeable enough to translate the part on the Hessian–see below–I would really appreciate.

using JuMP
using Ipopt
using LinearAlgebra
using Random

Random.seed!(1234)

function data_generate(N,T)
    # generate data for linear model to test optimization
    N = convert(Int64,N)
    T = convert(Int64,T)
    n = N*T
    #X = cat(ones(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), dims = 2)
    X = cat(ones(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), dims = 2)

    # population parameters
    β = [ 2.15; 0.10;  0.50; 0.10; 0.75; 1.2; 0.10;  0.50; 0.10; 0.75; 1.2; 0.10;
      0.50; 0.10; 0.75; 1.2 ]
    # population standard deviation
    σ = 0.3

    # generating the response variable
    draw = 0 .+ σ*randn(n)
    y = X*β + draw

    return X, y, β, σ, n
end

function estimate_mle_jumpjl(X, y, n)
    mle = Model(Ipopt.Optimizer)
    @variable(mle, β̃[i=1:size(X,2)], start = 1)
    @variable(mle, σ̃>=0, start = 1)
    @NLobjective(mle, Max, (n / 2) * log(1 / (2π * σ̃^2)) - sum((y[i] - sum(X[i,k]*β̃[k] for k in 1:size(X,2)))^2 for i ∈ 1:n ) / (2σ̃^2))
    JuMP.optimize!(mle)
    display(JuMP.objective_value(mle))

    #--------------------------------
    #OLD JUMP WAY OF FINDING HESSIAN
    #--------------------------------
    #this_par = myMLE.colVal
    #m_eval = JuMP.JuMPNLPEvaluator(myMLE);
    #MathProgBase.initialize(m_eval, [:ExprGraph, :Grad, :Hess])
    #hess_struct = MathProgBase.hesslag_structure(m_eval)
    #hess_vec = zeros(length(hess_struct[1]))
    #numconstr = length(m_eval.m.linconstr) + length(m_eval.m.quadconstr) + length(m_eval.m.nlpdata.nlconstr)
    #dimension = length(myMLE.colVal)
    #MathProgBase.eval_hesslag(m_eval, hess_vec, this_par, 1.0, zeros(numconstr))
    #this_hess_ld = sparse(hess_struct[1], hess_struct[2], hess_vec, dimension, dimension)
    #hOpt = this_hess_ld + this_hess_ld' - sparse(diagm(diag(this_hess_ld)));
    #hOpt = -full(hOpt); #since we are maximizing
    #seOpt = sqrt(diag(full(hOpt)\eye(size(hOpt,1))));

    #---------------------------------
    #HERE I TRANSLATE THE OLD CODE: STUCK HERE!!!
    #---------------------------------
    this_par = JuMP.all_variables(mle)
    m_eval = JuMP.NLPEvaluator(mle)

    return JuMP.value.(β̃), JuMP.value.(σ̃), 1
end


function model_setup()
    X, y, β, σ, n = data_generate(1e4,5)
    β̃, σ̃, se_β̃ = estimate_mle_jumpjl(X, y, n)  # solve using JuMP
end

@time model_setup()

JuMP’s nonlinear interface is very restrictive. You may be better off with a different tool:


However, here’s the code you’re looking for:

    m_eval = NLPEvaluator(mle)
    MOI.initialize(m_eval, [:Hess])
    hess_struct = MOI.hessian_lagrangian_structure(m_eval)
    hess_values = zeros(length(hess_struct))
    x = value.(all_variables(mle))
    σ = 1.0
    μ = ones(length(m_eval.constraints))
    MOI.eval_hessian_lagrangian(m_eval, hess_values, x, σ, μ)
    H = sparse(
        map(i -> i[1], hess_struct), 
        map(i -> i[2], hess_struct),
        hess_values,
        length(x),
        length(x),
    )
    # H is probably, lower-triangular, convert as needed.

Documentation:


2 Likes

You can use NLPModelsJuMP, It works with JuMP / MOI and you can easily compute the gradient or the hessian of your model.

using JuMP
using Ipopt
using LinearAlgebra
using NLPModels
using NLPModelsJuMP
using Random

Random.seed!(1234)

function data_generate(N,T)
    # generate data for linear model to test optimization
    N = convert(Int64,N)
    T = convert(Int64,T)
    n = N*T
    #X = cat(ones(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), dims = 2)
    X = cat(ones(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), 5 .+ 3 * randn(n), rand(n), 2.5 .+ 2 * randn(n), 15 .+ 3 * randn(n), 0.7 .- 0.1 * randn(n), dims = 2)

    # population parameters
    β = [ 2.15; 0.10;  0.50; 0.10; 0.75; 1.2; 0.10;  0.50; 0.10; 0.75; 1.2; 0.10;
      0.50; 0.10; 0.75; 1.2 ]
    # population standard deviation
    σ = 0.3

    # generating the response variable
    draw = 0 .+ σ*randn(n)
    y = X*β + draw

    return X, y, β, σ, n
end

function mle_jumpjl(X, y, n)
    mle = Model(Ipopt.Optimizer)
    @variable(mle, β̃[i=1:size(X,2)], start = 1)
    @variable(mle, σ̃>=0, start = 1)
    @NLobjective(mle, Max, (n / 2) * log(1 / (2π * σ̃^2)) - sum((y[i] - sum(X[i,k]*β̃[k] for k in 1:size(X,2)))^2 for i ∈ 1:n ) / (2σ̃^2))
    return mle
end

function model_setup()
    X, y, β, σ, n = data_generate(1e4,5)
    mle = mle_jumpjl(X, y, n)
    return mle
end

mle = model_setup()
nlp = MathOptNLPModel(mle)
JuMP.optimize!(mle)
display(JuMP.objective_value(mle))
variables = JuMP.all_variables(mle)
sol = JuMP.value.(variables)
Hx = hess(nlp, sol)   # Symmetric(hess(nlp, sol), :L)
3 Likes

@odow, thanks again, and for always being very helpful. @amontoison thanks a bunch, did not know about NLPModels and NLPModelsJuMP. Both solutions are elegant. Unfortunately, I am unable to select both as solutions. So future readers, both work!

2 Likes