Add nested optimization problem in JuMP

I want to include an inner loop optimization (a fixed point iteration algorithm) inside my current optimization problem. I cannot figure out how to make it work… Any help/guidance/advice is truly appreciated!

The current optimization code is ipopt_jump (thanks to the help of @odow in this post). It tries to find theta[1:K, 1:L] and delta[1:H] to maximize the objective function. The dimension of theta is relatively small (K = 11, L = 5), but the dimension of delta, H, is very large, with H > 2000. Optimizing over more than 2000 parameters takes forever to run.

There is a trick: the value of delta can be calculated for any theta using a fixed point algorithm, so I would only need to maximize over K*L = 55 parameters instead of more than 2055.

delta can be expressed as a function of theta through the mapping \delta_h' = \delta_h + \ln s_h - \ln \hat{s}_h(\delta, \theta), which is iterated until \delta' = \delta. Here \ln s_h is observed from the data, and \hat{s}_h(\delta, \theta) = \frac{1}{N} \sum_{i=1}^N \frac{\exp(\delta_h + X_h \theta Z_i')}{\sum_{h'} \exp(\delta_{h'} + X_{h'} \theta Z_i')}.
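
In code, one update of this mapping might look like the following sketch (update_delta and log_s_h are names introduced here only for illustration; X is H×K, Z is N×L, and log_s_h holds the observed \ln s_h):

using Statistics  # mean

# One update delta' = delta + ln s_h - ln ŝ_h(delta, theta); purely illustrative.
function update_delta(delta, theta, X, Z, log_s_h)
    ev = exp.(delta .+ X * theta * Z')               # exp(delta_h + X_h*theta*Z_i') for all h, i
    s_ih = ev ./ sum(ev; dims = 1)                   # individual choice probabilities
    log_pred_s_h = log.(vec(mean(s_ih; dims = 2)))   # ln ŝ_h(delta, theta)
    return delta .+ log_s_h .- log_pred_s_h          # delta'
end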

I want to add an inner loop that solves for \delta(\theta) given \theta, so that I only need to optimize over \theta; the outer loop maximizes the objective function from ipopt_jump. I tried to incorporate the fixed point algorithm in ipopt_fp_jump below, but it doesn’t work because of type incompatibility…

A data sample can be found here.

using JuMP, Ipopt

function ipopt_jump(data, iter=10000, tol=1e-3)
    X, Z, choice = data
    N, L = size(Z)
    H, K = size(X)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    set_attributes(model, "tol" => tol, "max_iter" => iter)
    @variable(model, theta[1:K, 1:L])
    @variable(model, delta[1:H])
    @expression(model, v, delta .+ (X * theta * Z'))
    @NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
    choice_h = sum(choice; dims = 1)
    @NLobjective(
        model,
        Max,
        sum(choice[h, i] * v[h, i] for h = 1:H, i = 1:N) - 
        sum(choice_h[i] * log(evsum[i]) for i = 1:N)
    )
    optimize!(model)
    return objective_value(model), value.(theta), value.(delta)
end

using LinearAlgebra  # norm, used in the fixed point loop

function ipopt_fp_jump(data, iter=10000, tol=1e-3)
    X, Z, choice, log_obs_s_h = data
    N, L = size(Z)
    H, K = size(X)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    set_attributes(model, "tol" => tol, "max_iter" => iter)

    @variable(model, theta[1:K, 1:L])


    # Attempt to build ln ŝ_h(δ, θ) out of JuMP expressions: does not work..
    function calc_pred_s_h(θ, δ)
        @expression(model, v, δ .+ (X * θ * Z'))
        @NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
        @NLexpression(model, s_ih[h = 1:H, i = 1:N], exp(v[h, i]) / evsum[i])
        @NLexpression(model, s_h[h = 1:H], sum(s_ih[h, i] for i = 1:N)/N)  
        @NLexpression(model, ln_s_h[h = 1:H], log(s_h[h]))
        return ln_s_h
    end

    # same function as above: does not work..
    # function calc_s_h(θ, δ)
    #     ev = exp.(δ .+ (X * θ * Z'))
    #     evsum = sum(ev, dims = 1)
    #     s_ih = ev ./ evsum
    #     s_h = mean(s_ih, dims = 2)
    #     return log.(s_h)
    # end
    
    function fixed_point_algorithm(θ)
        old_delta = rand(H)
        new_delta = old_delta  # initialize outside the while loop so it is in scope at the return
        normdiff = Inf

        while normdiff > tol
            new_delta = old_delta .+ log_obs_s_h .- calc_pred_s_h(θ, old_delta)
            normdiff = norm(new_delta - old_delta)
            old_delta = new_delta
        end
        
        return new_delta
    end

    delta = fixed_point_algorithm(theta)  # theta is a matrix of JuMP variables here, not numbers

    @expression(model, v, delta .+ (X * theta * Z'))
    @NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
    choice_h = sum(choice; dims = 1)

    @NLobjective(
        model,
        Max,
        sum(choice[h, i] * v[h, i] for h = 1:H, i = 1:N) - 
        sum(choice_h[i] * log(evsum[i]) for i = 1:N)
    )
    optimize!(model)
    return objective_value(model), value.(theta), value.(delta)
end

Why does it currently fail?

You cannot do what you’re trying to do in ipopt_fp_jump because you’re passing in theta::Matrix{VariableRef}, so the fixed point is computed on the JuMP variable objects themselves, not on the values those variables take in a solution.
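
A minimal illustration of the issue (the model and variable names here are only for demonstration):

using JuMP

model = Model()
@variable(model, x[1:2])

d = x .+ 1.0   # a Vector{AffExpr}: symbolic expressions attached to `model`, not numbers
# A numeric stopping test such as `normdiff > tol` cannot be evaluated on objects like
# these, because the variables have no values until the model has been solved.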

The usual work-around would be to use a user-defined function (see the Nonlinear Modeling section of the JuMP documentation), but these require scalar outputs.

Nothing immediately comes to mind for a good way to solve this problem with JuMP. Others might have a suggestion.

I’ll note that this problem is unconstrained, so you might have better luck trying something like Optim.jl: if you write the whole thing as a pure Julia function of theta (including the fixed point calculation), it might just work.
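
To make that concrete, here is a minimal, untested sketch of the idea. The names optim_fp, solve_delta, and negloglik are made up, LBFGS with forward-mode autodiff is just one possible choice of method, and the .+ 1 in the shares mirrors the outside-option term in ipopt_jump’s evsum:

using LinearAlgebra, Optim, Statistics

function optim_fp(data; iter = 10_000, tol = 1e-3)
    X, Z, choice, log_obs_s_h = data
    N, L = size(Z)
    H, K = size(X)
    choice_h = vec(sum(choice; dims = 1))

    # Inner loop: solve delta(theta) by iterating delta' = delta + ln s_h - ln ŝ_h(delta, theta).
    function solve_delta(theta)
        delta = zeros(eltype(theta), H)
        for _ in 1:iter
            ev = exp.(delta .+ X * theta * Z')
            s_h = vec(mean(ev ./ (sum(ev; dims = 1) .+ 1); dims = 2))
            new_delta = delta .+ log_obs_s_h .- log.(s_h)
            norm(new_delta - delta) < tol && return new_delta
            delta = new_delta
        end
        return delta
    end

    # Outer objective: the same log-likelihood as in ipopt_jump, but as a function of theta only.
    function negloglik(theta_vec)
        theta = reshape(theta_vec, K, L)
        delta = solve_delta(theta)
        v = delta .+ X * theta * Z'
        evsum = vec(sum(exp.(v); dims = 1)) .+ 1
        ll = sum(choice .* v) - sum(choice_h .* log.(evsum))
        return -ll  # Optim minimizes
    end

    res = optimize(negloglik, zeros(K * L), LBFGS(); autodiff = :forward)
    return -Optim.minimum(res), reshape(Optim.minimizer(res), K, L)
end

(Note that ForwardDiff here differentiates straight through the fixed-point iterations, which may be slow when H > 2000.)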

Shameless plug: my package ImplicitDifferentiation.jl is very well-suited to computing gradients of fixed points. Combining it with a first-order algorithm from Optim.jl or Optimization.jl for the outer loop might work.

Hi gdalle. It fails because ln_s_h is a JuMP expression object, which cannot be used directly inside the fixed point function.