Solve bilevel optimization the stupid way

I would like to solve a bilevel optimization problem in a dumb way.
Consider the following (as an example)

\min_{x,y} \quad x^4 - \sin(y) \\
\text{s.t.} \quad x \geq 2, \quad x^3 + y^3 \leq 1000 \\
\qquad\;\; y(x) \in \arg\min_{y} \{\, y^2 + x \;:\; 3 \leq y \leq 5 \,\}

which is similar to the example on the doc page for BilevelJuMP about Non Linear Models. (I changed it slightly to match my use case, in which the lower-level objective also depends on x.)

This can be solved using BilevelJuMP.jl as explained on that page, with the following code:

using BilevelJuMP, Ipopt

model = BilevelModel(Ipopt.Optimizer, mode = BilevelJuMP.ProductMode(1e-5))

@variable(Upper(model), x >= 2)
@variable(Lower(model), 3 <= y <= 5)

# upper
@NLobjective(Upper(model), Min, x^4 - sin(y))
@NLconstraint(Upper(model), x^3 + y^3 <= 1000)

# lower
@objective(Lower(model), Min, y^2 + x)

optimize!(model)
value.([x,y])

(BilevelJuMP only accepts nonlinear objectives and constraints in the upper-level problem; in the lower-level problem it only accepts linear or quadratic objectives and constraints.)

I would like to try to solve this problem using only JuMP in the dumbest possible way, that is: by writing a function that, for every value of x, solves the lower-level problem and returns the corresponding y, and then plugging this function into the upper-level optimization.

Something along the lines of this block of code:

# by hand
using JuMP, Ipopt

function lower(x)
    lowerModel = Model(Ipopt.Optimizer)
    set_silent(lowerModel)
    @variable(lowerModel, y)
    @constraint(lowerModel, y <= 5)
    @constraint(lowerModel, 3 <= y)
    @objective(lowerModel, Min, y^2 + x)
    optimize!(lowerModel)
    yVal = value(y)
    return yVal
end

upperModel = Model(Ipopt.Optimizer)
register(upperModel, :lower, 1, lower; autodiff = true)

@variable(upperModel, x)
@constraint(upperModel, x >= 2)
@NLconstraint(upperModel, x^3 + lower(x)^3 <= 1000)
@NLobjective(upperModel, Min, x^4 - sin(lower(x)))

optimize!(upperModel)
value(x)

However, this doesn’t work: I have not been able to register the function lower in the upper model, so I cannot proceed. It returns the error:

julia> register(upperModel, :lower, 1, lower; autodiff = true)
ERROR: Unable to register the function :lower.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] _validate_register_assumptions(f::typeof(lower), name::Symbol, dimension::Int64)
   @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:327
 [3] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function)
   @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:340
 [4] register_operator(registry::MathOptInterface.Nonlinear.OperatorRegistry, op::Symbol, nargs::Int64, f::Function)
   @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:399
 [5] register_operator
   @ ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/model.jl:217 [inlined]
 [6] register(model::Model, op::Symbol, dimension::Int64, f::Function; autodiff::Bool)
   @ JuMP ~/.julia/packages/JuMP/jZvaU/src/nlp.jl:716
 [7] top-level scope
   @ REPL[135]:1

caused by: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:772
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ Base ./number.jl:7
  [2] setproperty!(x::AffExpr, f::Symbol, v::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ Base ./Base.jl:39
  [3] add_to_expression!(aff::AffExpr, other::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ JuMP ~/.julia/packages/JuMP/jZvaU/src/aff_expr.jl:424
  [4] add_to_expression!(quad::QuadExpr, other::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ JuMP ~/.julia/packages/JuMP/jZvaU/src/quad_expr.jl:308
  [5] +(lhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1}, rhs::QuadExpr)
    @ JuMP ~/.julia/packages/JuMP/jZvaU/src/operators.jl:64
  [6] +(lhs::QuadExpr, rhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ JuMP ~/.julia/packages/JuMP/jZvaU/src/operators.jl:366
  [7] add_mul(a::QuadExpr, b::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/h0wjj/src/MutableArithmetics.jl:26
  [8] operate(::typeof(MutableArithmetics.add_mul), ::QuadExpr, ::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/h0wjj/src/interface.jl:206
  [9] operate_fallback!!(::MutableArithmetics.IsNotMutable, ::typeof(MutableArithmetics.add_mul), ::QuadExpr, ::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/h0wjj/src/interface.jl:586
 [10] operate!!(op::typeof(MutableArithmetics.add_mul), x::QuadExpr, args::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/h0wjj/src/rewrite.jl:89
 [11] macro expansion
    @ ~/.julia/packages/MutableArithmetics/h0wjj/src/rewrite.jl:322 [inlined]
 [12] macro expansion
    @ ~/.julia/packages/JuMP/jZvaU/src/macros.jl:1586 [inlined]
 [13] lower(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(lower), Float64}, Float64, 1})
    @ Main ./REPL[133]:7
 [14] derivative
    @ ~/.julia/packages/ForwardDiff/PcZ48/src/derivative.jl:14 [inlined]
 [15] _validate_register_assumptions(f::typeof(lower), name::Symbol, dimension::Int64)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:321
 [16] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:340
 [17] register_operator(registry::MathOptInterface.Nonlinear.OperatorRegistry, op::Symbol, nargs::Int64, f::Function)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/operators.jl:399
 [18] register_operator
    @ ~/.julia/packages/MathOptInterface/tWT4o/src/Nonlinear/model.jl:217 [inlined]
 [19] register(model::Model, op::Symbol, dimension::Int64, f::Function; autodiff::Bool)
    @ JuMP ~/.julia/packages/JuMP/jZvaU/src/nlp.jl:716
 [20] top-level scope
    @ REPL[135]:1

Is there a workaround to make it work?

I know this is naive. But in a case where the lower-level problem is nonlinear yet in principle very easy, it may still be possible to solve it even in this dumb way.

Thanks for the help.

See this tutorial :smile:


Thank you for the reply. I already saw that example.
However, in that specific case only the optimal objective value of the inner problem is used (as opposed to the actual minimizer y(x)), so it is quite easy to derive the gradient and Hessian of the inner function explicitly (also because, on that page, the inner constraints do not depend on x_1, x_2). This makes it possible to register and use the inner function without issues.
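To make the distinction concrete, here is a minimal sketch (my own illustration, not the tutorial's code) of that pattern applied to the toy inner problem above: only the inner optimal value V(x) enters the outer model, and its derivatives are supplied by hand (here V(x) = 9 + x, so V'(x) = 1 and V''(x) = 0).

using JuMP, Ipopt

# inner optimal value: V(x) = min_y y^2 + x  s.t. 3 <= y <= 5  (= 9 + x)
function V(x)
    inner = Model(Ipopt.Optimizer)
    set_silent(inner)
    @variable(inner, 3 <= y <= 5)
    @objective(inner, Min, y^2 + x)
    optimize!(inner)
    return objective_value(inner)
end
∇V(z) = 1.0    # hand-derived first derivative of V
HV(z) = 0.0    # hand-derived second derivative of V

outer = Model(Ipopt.Optimizer)
register(outer, :V, 1, V, ∇V, HV)
@variable(outer, x >= 2)
@NLobjective(outer, Min, V(x))    # only the inner optimal value is used
optimize!(outer)
value(x)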

In my example, I would instead need to know, say, the first derivative of y(x) with respect to x.
I checked the page Nonlinear Modeling · JuMP on querying derivatives from a JuMP model, but I only managed to understand how to query derivatives of the optimal value, not of the optimal point.
Is there a way to do it? I mean by using Julia, not by deriving it explicitly by hand, since in the real problem that cannot be done.

Yes, you would need an analytic derivative of the optimal solution y(x) with respect to x. This is problem-dependent, so it’s hard to give general advice. I don’t think there’s an automated way you could get this from JuMP.

Also, can’t your example just be written as a single minimization problem? Why solve it as a bilevel problem? In your current example, the gradient is 0 (y won’t change w.r.t. x).
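For concreteness, here is a minimal sketch of that single-level version (an illustration, not code from the thread), using the fact that the lower problem’s minimizer is y = 3 for every x:

using JuMP, Ipopt

# min_y y^2 + x  s.t. 3 <= y <= 5  is minimized at y = 3 for any x,
# so the bilevel example collapses to a single-level NLP in x only
singleModel = Model(Ipopt.Optimizer)
@variable(singleModel, x >= 2)
@NLobjective(singleModel, Min, x^4 - sin(3))
@NLconstraint(singleModel, x^3 + 27 <= 1000)
optimize!(singleModel)
value(x)    # x = 2, since x^4 is increasing for x >= 2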


Of course :grinning:
However, in general this is not the case.

I will have to think of some other strategy then to solve such a problem.
Thanks for all the help.

I suggest a possible, but not recommended, workaround: using FiniteDifferences.jl to compute approximate derivatives of y(x):

using JuMP, Ipopt
using FiniteDifferences

function lower(x)
    lowerModel = Model(Ipopt.Optimizer)
    set_silent(lowerModel)
    @variable(lowerModel, y)
    @constraint(lowerModel, y <= 5)
    @constraint(lowerModel, 3 <= y)
    @objective(lowerModel, Min, y^2 + x)
    optimize!(lowerModel)
    yVal = value(y)
    return yVal
end
# a large `factor` accounts for the limited accuracy of lower(x), whose value
# comes from a solver rather than from machine-precision arithmetic
∇lower(z) = central_fdm(5, 1; factor = 1e10)(lower, z)
Hlower(z) = central_fdm(5, 2; factor = 1e10)(lower, z)

upperModel = Model(Ipopt.Optimizer)
register(upperModel, :lower, 1, lower, ∇lower, Hlower)

@variable(upperModel, x)
@constraint(upperModel, x >= 2)
@NLconstraint(upperModel, x^3 + lower(x)^3 <= 1000)
@NLobjective(upperModel, Min, x^4 - sin(lower(x)))

optimize!(upperModel)
value(x)
lower(value(x))

In this trivial case, this works. In other cases it may fail.


Let me add the last piece of the story.
In the end, the “stupid” method didn’t work for my real case (although it did in the example above).

I solved it by doing by hand what BilevelJuMP does automatically. Since I was doing it by hand, it was possible to have nonlinearities in the lower-level problem.
The reference paper for BilevelJuMP.jl is quite informative and well written, and after reading it I was able to apply the same methodology to my use case.
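For later readers, here is a minimal sketch of what that by-hand reformulation looks like for the toy example of this thread (an illustration following the KKT/product-mode idea of the paper; the multiplier names and the relaxation ε are my own choices, not BilevelJuMP’s API):

using JuMP, Ipopt

ε = 1e-5                                  # complementarity relaxation, cf. ProductMode(1e-5)
kkt = Model(Ipopt.Optimizer)

@variable(kkt, x >= 2)
@variable(kkt, 3 <= y <= 5)
@variable(kkt, λlo >= 0)                  # multiplier of y >= 3
@variable(kkt, λhi >= 0)                  # multiplier of y <= 5

# upper level
@NLobjective(kkt, Min, x^4 - sin(y))
@NLconstraint(kkt, x^3 + y^3 <= 1000)

# lower level  min_y y^2 + x  s.t. 3 <= y <= 5  replaced by its KKT conditions
@constraint(kkt, 2y - λlo + λhi == 0)     # stationarity
@constraint(kkt, λlo * (y - 3) <= ε)      # relaxed complementarity
@constraint(kkt, λhi * (5 - y) <= ε)

optimize!(kkt)
value(x), value(y)

Because the hand-written KKT system is just an ordinary JuMP model, nothing prevents the lower level from containing nonlinear terms: the corresponding stationarity and complementarity conditions simply go into @NLconstraint instead.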

That’s the way I would suggest anyone proceed.
