# Add nested optimization problem in JuMP

I want to include an inner loop optimization (a fixed point iteration algorithm) inside my current optimization problem. I cannot figure out how to make it work… Any help/guidance/advice is truly appreciated!

The current optimization code is ipopt_jump (thanks to the help of @odow in this post). It tries to find theta[1:K, 1:L] and delta[1:H] to maximize the objective function. The dimension of theta is relatively small, K=11, L=5, but the dimension of delta H is very big, H > 2000. Optimizing over >2000 parameters takes forever to run.

There is a trick that the value of delta can be calculated for any theta using a fixed point algorithm. Then I only need to maximize over K*L = 55 parameters instead of >2055 parameters.

delta can be expressed as a function of theta: \delta' = \delta + \ln s_h - \ln \hat{s}_h(\delta,\theta) . This will keep iterating until \delta' = \delta. \ln s_h is observed from the data, and \ln \hat{s}_h is \hat s_h(\delta,\theta) = \frac{1}{N} \sum_{i=1}^N \frac{\exp(\delta_h + X\theta Z)}{\sum_h \exp(\delta_h + X\theta Z)} .

I want to add an inner loop to solve for \delta(\theta) given \theta, then I only need to optimize over \theta. The outer loop is to maximize the objective function in ipopt_jump. I tried to incorporate the fixed point algorithm in ipopt_fp_jump, but it doesn’t work because of type incompatibility…

A data sample can be found here.

function ipopt_jump(data, iter=10000, tol=1e-3)
X, Z, choice = data
N, L = size(Z)
H, K = size(X)
model = Model(Ipopt.Optimizer)
set_silent(model)
set_attributes(model, "tol" => tol, "max_iter" => iter)
@variable(model, theta[1:K, 1:L])
@variable(model, delta[1:H])
@expression(model, v, delta .+ (X * theta * Z'))
@NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
choice_h = sum(choice; dims = 1)
@NLobjective(
model,
Max,
sum(choice[h, i] * v[h, i] for h = 1:H, i = 1:N) -
sum(choice_h[i] * log(evsum[i]) for i = 1:N)
)
optimize!(model)
return objective_value(model), value.(theta), value.(delta)
end

function ipopt_fp_jump(data, iter=10000, tol=1e-3)
X, Z, choice, log_obs_s_h = data
N, L = size(Z)
H, K = size(X)
model = Model(Ipopt.Optimizer)
set_silent(model)
set_attributes(model, "tol" => tol, "max_iter" => iter)

@variable(model, theta[1:K, 1:L])

# does not work..
function calc_pred_s_h(θ, δ)
@expression(model, v, δ .+ (X * θ * Z'))
@NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
@NLexpression(model, s_ih[h = 1:H, i = 1:N], exp(v[h, i]) / evsum[i])
@NLexpression(model, s_h[h = 1:H], sum(s_ih[h, i] for i = 1:N)/N)
@NLexpression(model, ln_s_h[h = 1:H], log(s_h[h]))
return ln_s_h
end

# same function as above: does not work..
# function calc_s_h(θ, δ)
#     ev = exp.(δ .+ (X * θ * Z'))
#     evsum = sum(ev, dims = 1)
#     s_ih = ev ./ evsum
#     s_h = mean(s_ih, dims = 2)
#     return log.(s_h)
# end

function fixed_point_algorithm(θ)
old_delta = rand(H)
normdiff = Inf

while normdiff > tol
new_delta = old_delta .+ log_obs_s_h .- calc_pred_s_h(θ, old_delta)
normdiff = norm(new_delta - old_delta)
old_delta = new_delta
end

return new_delta
end

delta = fixed_point_algorithm(theta)

@expression(model, v, delta .+ (X * theta * Z'))
@NLexpression(model, evsum[i = 1:N], sum(exp(v[h, i]) for h = 1:H) + 1)
choice_h = sum(choice; dims = 1)

@NLobjective(
model,
Max,
sum(choice[h, i] * v[h, i] for h = 1:H, i = 1:N) -
sum(choice_h[i] * log(evsum[i]) for i = 1:N)
)
optimize!(model)
return objective_value(model), value.(theta), value.(delta)
end


Why does it currently fail?

You cannot do what you’re trying to do in ipopt_fp_jump because you’re passing in theta::Vector{VariableRef}, which tries to compute the fixed point based on the JuMP variable object, not on the value of the variable in a solution.

The usual work-around would be to use a user-defined function, Nonlinear Modeling · JuMP, but these require scalar outputs.

Nothing immediately comes to mind for a good way to solve this problem with JuMP. Others might have a suggestion.

I’ll note that this problem is unconstrained, so you might have better luck trying something like Optim.jl. if you write things as a pure Julia function given theta (including the fixed point calculation), it might just work.

1 Like

Shameless plug: my package ImplicitDifferentiation.jl is very well-suited to computing gradients of fixed points. Combining it with a first order algorithm from Optim.jl or Optimization.jl for the outer loop might work

1 Like

Hi gdalle. It fails because ln_s_h is a general expression that cannot be directly used inside the fixed point function.