JuMP - Nested Optimization with user defined functions

Hi!

I have a nested optimization problem that I am trying to solve using JuMP. The inner problem is a linear problem with power constraints, which I solve with JuMP + Mosek. The outer problem is a concave problem defined over a set of linear constraints. The inner problem gives me both the value of the function conditional on the variable I am optimizing in the outer problem, and its gradient.

Ipopt seems like the natural optimizer for this problem. However, I have been using NLOpt because I cannot get the user-defined function to work.

This is not a MWE, but I am trying to do something like this:

using Ipopt, JuMP

@everywhere function f1_val(data::Data, x::Vector{Float64}, n::Int)
    Eπ, Eν = 0.0, zeros(data.I)
    for s in 1:data.Sexp
        temp = f2(data, x, s, n)
        Eπ += temp[1]
        Eν .+= temp[2]
    end
    return Eπ / data.Sexp - sum(data.α[:, n] .* x)
end

@everywhere function f1_grad(data::Data, x::Vector{Float64}, n::Int)
    Eπ, Eν = 0.0, zeros(data.I)
    for s in 1:data.Sexp
        temp = f2(data, x, s, n)
        Eπ += temp[1]
        Eν .+= temp[2]
    end
    return Eν ./ data.Sexp .- data.α[:, n]
end

I would like to find the vector x that maximizes f1_val.

I don’t fully understand your example because I don’t know what f2 or the data are, but here’s an example of nested optimization problems via user-defined functions:

using JuMP
import Ipopt

function bilevel_problem()
    last_x, last_f, last_y = nothing, 0.0, [NaN, NaN]
    function _update(x...)
        if last_x == x
            return last_y
        end
        model = Model(Ipopt.Optimizer)
        set_silent(model)
        @variable(model, y[1:2])
        @NLobjective(
            model,
            Max,
            y[1] * x[1]^2 + y[2] * x[2]^2 - x[1] * y[1]^4 - 2 * x[2] * y[2]^4,
        )
        @constraint(model, (y[1] - 10)^2 + (y[2] - 10)^2 <= 25)
        optimize!(model)
        @assert termination_status(model) == LOCALLY_SOLVED
        last_x = x
        last_f = objective_value(model)
        last_y[1] = value(y[1])
        last_y[2] = value(y[2])
        return last_y
    end
    function f(x...)
        _update(x...)
        return last_f
    end
    function ∇f(g, x...)
        _update(x...)
        g[1] = 2 * x[1] * last_y[1] - last_y[1]^4
        g[2] = 2 * x[2] * last_y[2] - 2 * last_y[2]^4
        return
    end
    return _update, f, ∇f
end
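A note on the pattern in `_update` above: Ipopt queries the objective and its gradient separately at the same point, so caching the last `x` avoids re-solving the inner problem twice per outer iteration. Here is a stripped-down, solver-free sketch of that memoization idea (the `expensive` function and all names are illustrative stand-ins for the inner `optimize!` call):

```julia
function make_memoized()
    last_x = nothing               # last point we evaluated at
    last_val, last_grad = NaN, NaN # cached value and gradient
    n_solves = Ref(0)              # counts "inner solves"
    # Stand-in for the expensive inner optimization:
    # returns the value and gradient of a toy function x^2.
    function expensive(x)
        n_solves[] += 1
        return x^2, 2x
    end
    function _update(x)
        if x != last_x             # only recompute at a new point
            last_val, last_grad = expensive(x)
            last_x = x
        end
        return
    end
    f(x) = (_update(x); last_val)
    ∇f(x) = (_update(x); last_grad)
    return f, ∇f, n_solves
end

f, ∇f, n_solves = make_memoized()
f(3.0)        # first call at 3.0: one inner solve
∇f(3.0)       # same point: reuses the cache, no new solve
f(4.0)        # new point: second inner solve
n_solves[]    # == 2
```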

function bench_bilevel_nlp()
    model = Model(Ipopt.Optimizer)
    set_optimizer_attribute(model, "tol", 1e-5)
    @variable(model, x[1:2] >= 0)
    subproblem, f, ∇f = bilevel_problem()
    register(model, :bilevel_f, 2, f, ∇f)
    @NLobjective(model, Min, bilevel_f(x[1], x[2]) + x[1]^2 + x[2]^2)
    optimize!(model)
    @assert termination_status(model) == LOCALLY_SOLVED
    println(objective_value(model))
    println("x = ", value.(x))
    println("y = ", subproblem(value.(x)...))
    return
end

Hi Odow,

I wonder how to register an inner JuMP optimization as the nested function, and then use that user-defined function in a constraint or objective of the outer JuMP optimization. The motivation for doing this comes from a common feature of general economic models: A hacky guide to using automatic differentiation in nested optimization problems - #6 by jeffreyesun.

I am working with a maximum likelihood case where the likelihood sums across M markets

\max_{\theta, \delta}\sum_{m=1}^M l(\theta,\delta_m)

\theta is a low-dimensional vector with 5 to 10 parameters. The \delta_m are market-specific; each \delta_m contains about 15 parameters corresponding to the products available in market m.

Writing the whole optimization in one JuMP model with sparse indexing looks like:

function mle(data)
    model = Model(Ipopt.Optimizer)
    @variable(model, θ[1:5])
    @variable(model, δ[m in market_idset, j in product_market[m]])
    @variable(model, logli[m in market_idset])
    ...
    @NLobjective(model, Min, -sum(logli[m] for m in market_idset))
end

It works well when M is small (less than 4 minutes with M=5 markets, 300,000 total variables, on a 16 GB MacBook Pro). However, my full sample has 300 to 400 markets, and JuMP never even starts the iterations: the total number of variables reaches a scale of ten million in the full-sample case. I am considering whether a nested optimization could improve the performance.

For each market, the log likelihood l(\delta_m,\theta) is convex in \delta_m. Fixing \theta, solving \max_{\delta_m}l(\delta_m,\theta) takes about 10 seconds to find a locally optimal \delta_m. I would like to rephrase the JuMP optimization in light of this nested structure. An incomplete/incorrect attempt I am building from your example follows:

function outer_f(data)
    outer_model = Model(Ipopt.Optimizer)
    @variable(outer_model, θ[1:5])
    @variable(outer_model, logli[m in market_idset])
    register(outer_model, :inner_f, 5, inner_JuMP_delta)
    for m in market_idset
        @constraint(outer_model, logli[m] == inner_f(θ)[m][1])
    end
    @NLobjective(outer_model, Min, -sum(logli[m] for m in market_idset))
    optimize!(outer_model)
end


function inner_JuMP_delta(para...)
    result_obj = zeros(num_markets)
    result_delta = zeros(num_markets, num_products)
    for m in market_idset
        inner_model = Model(Ipopt.Optimizer)
        @variable(inner_model, δ[product_market[m]])
        ...
        @NLobjective(inner_model, Max, log(...))
        optimize!(inner_model)
        result_obj[m] = objective_value(inner_model)
        result_delta[m, :] = value.(δ)
    end
    return result_obj, result_delta
end

The main issue is how to register a rather complicated user-defined function involving nonlinear expressions and data structures (like the Dictionaries I am using heavily), and returning a large array. If JuMP can't do this, I am considering using Optim.jl for the outer optimization. I tried that with M=5 and a derivative-free approach, but it was significantly slower than the single JuMP optimization mle().

Thanks for any thought!

A couple of points:

  • User-defined functions must return scalar outputs.
  • You probably need to define a separate function for each market.
  • You can’t automatically differentiate a JuMP model, so you’ll need to provide the explicit gradient of the loss function.

I don’t have any data so I haven’t tested this, but hopefully it points you in the right direction. If you still have questions, make a new post with reproducible data.

There are various pointers in the documentation about this; take a read of the section on user-defined functions. Here is a rough sketch:

function loss(m, θ...)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, δ[product_market[m]])
    @NLobjective(model, Max, sum(log(δ[j]) for j in product_market[m]))
    optimize!(model)
    return objective_value(model)
end

function ∇loss(m, g, θ...)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, δ[product_market[m]])
    @NLobjective(model, Max, sum(log(δ[j]) for j in product_market[m]))
    optimize!(model)
    g .= 0.0 # TODO: fill in ∂loss/∂θ at the optimal δ
    return
end
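Regarding the `g .= 0.0 # TODO`: by the envelope theorem, the gradient of the inner value function V(θ) = max_δ l(θ, δ) with respect to θ is just the partial derivative ∂l/∂θ evaluated at the optimal δ*(θ); you do not need to differentiate the maximizer δ*(θ) itself. A minimal pure-Julia sanity check on a toy problem with a closed-form inner maximum (all names here are illustrative, not from the model above):

```julia
# Envelope theorem: for V(θ) = max_δ l(θ, δ), we have V'(θ) = ∂l/∂θ(θ, δ*(θ)).
# Toy inner problem: l(θ, δ) = θ*δ - δ^2/2, maximized at δ*(θ) = θ.
l(θ, δ) = θ * δ - δ^2 / 2
δstar(θ) = θ                     # closed-form inner maximizer
V(θ) = l(θ, δstar(θ))            # value function; here V(θ) = θ^2/2
∂l∂θ(θ, δ) = δ                   # partial derivative of l in θ, holding δ fixed

θ0 = 1.7
envelope_grad = ∂l∂θ(θ0, δstar(θ0))        # gradient via the envelope theorem
h = 1e-6
fd_grad = (V(θ0 + h) - V(θ0 - h)) / (2h)   # finite-difference check on V
abs(envelope_grad - fd_grad) < 1e-6        # the two agree
```

In the MLE setting this means `g` should hold ∂l/∂θ of the market-m log likelihood, evaluated at the δ_m returned by the inner solve.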

model = Model(Ipopt.Optimizer)
@variable(model, θ[1:5])
markets = [(θ...) -> loss(m, θ...) for m in market_idset]
∇markets = [(g, θ...) -> ∇loss(m, g, θ...) for m in market_idset]
expr = Expr(:call, :+)
for (i, (f, ∇f)) in enumerate(zip(markets, ∇markets))
    f_sym = Symbol("f_$(i)")
    register(model, f_sym, 5, f, ∇f)
    push!(expr.args, :($(f_sym)($(θ...))))
end
set_nonlinear_objective(model, MOI.MIN_SENSE, Expr(:call, :-, expr))
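The `Expr`-building loop is just programmatic construction of the expression a hand-written `@NLobjective` would contain. A small solver-free demo of the same pattern, defining stand-in functions `f_1`, `f_2`, `f_3` (illustrative names) and summing calls to them:

```julia
# Programmatically define f_1, f_2, f_3, then build the call expression
# :(f_1(2.0) + f_2(2.0) + f_3(2.0)) the same way the per-market terms
# are pushed into the objective expression above.
for i in 1:3
    @eval $(Symbol("f_$(i)"))(x) = $i * x   # stand-in for a registered market function
end
expr = Expr(:call, :+)
for i in 1:3
    push!(expr.args, :($(Symbol("f_$(i)"))(2.0)))
end
eval(expr)   # f_1(2.0) + f_2(2.0) + f_3(2.0) == 12.0
```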