Nested optimization - matrix multiplication and internal functions

I have a nested optimization problem in JuMP that fails to run because ForwardDiff `Dual` numbers end up being passed through to the inner problem. In my original problem the lower-level problem is a convex QP, so it is possible to compute correct gradients, but first the code needs to run. For simplicity I have set a scalar outer variable, but then the problem fails in an even stranger way:
[screenshot of the error message]

The problem for the MWE:
$\min\limits_x \; 2x^2+0.5\,V(x)$ where $x\geq 0$ and $V(x)=\max\limits_y \; y^\top \begin{bmatrix} x&0\\0&x \end{bmatrix} y$ s.t. $(y_1-1)^2+(y_2-1)^2\leq 2$, $-100 \leq y_i-x\leq 100$, $i=1,2$

Case 1: the outer variable x is a vector. Then the code runs, but building the matrix expression through helper functions fails, and the vectorized matrix-vector constraint fails:

Code case 1
using JuMP, Ipopt

function some_function(a,x...)
    return a*x
end

function solve_lower_level(x...)
    sm_fcn = x -> 1.0*x

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    # @expression(model,testexp,[sm_fcn(x...)[1] 0.0; 0.0 x[2]]') ####Doesn't run
    # @expression(model,testexp,[some_function(1.0,x...) 0.0; 0.0 x[2]]') ####Doesn't run
    @expression(model,testexp,[1.0*x[1] 0.0; 0.0 x[2]]') ####Runs

    @objective(
        model,
        Max,
        y'*[1.0*x[1] 0.0; 0.0 x[2]]'*y
    )

    Cm = [1 0;0 -1]
    Dm = [100;100]

    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    @constraint(model, Cm[1,1]*(y[1]-x[1])<=Dm[1]) ###### If I use this, the code runs
    @constraint(model, Cm[2,2]*(y[2]-x[2])<=Dm[2])

    # @constraint(model, Cm*(y-x)<=Dm) #####Supposed to be -100<=y[i]-x[i]<=100, i=1,2, but this doesn't work

    optimize!(model)
    @assert is_solved_and_feasible(model)


    # @show value.(y)
    return objective_value(model), value.(y)
end

function V(x...)
    f, _ = solve_lower_level(x...)
    return f
end

function ∇V(g::AbstractVector, x...)
    _, y = solve_lower_level(x...)
    g[1] = -y[1]^2
    g[2] = -y[1]^2
    return
end

function ∇²V(H::AbstractMatrix, x...)
    _, y = solve_lower_level(x...)
    H[1, 1] = 0.0
    H[2, 2] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2] >= 0)
@operator(model, op_V, 2, V, ∇V, ∇²V)
@objective(model, Min, x[1]^2 + x[2]^2 + 0.5*op_V(x[1], x[2]))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Case 2: the outer variable x is a scalar. None of the commented-out lines that use x run, so I must be making a JuMP error:

Code case 2
using JuMP, Ipopt

function some_function(a,x)
    return a*x
end

function solve_lower_level(x)
    sm_fcn = x -> 1.0*x

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    # @expression(model,testexp,[sm_fcn(x) 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[some_function(1.0,x) 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[1.0*x 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[x 0.0; 0.0 x]') ####Doesn't run

    @objective(
        model,
        Max,
        y[1]*y[1]+ y[2]*y[2]
    )

    Cm = [1 0;0 -1]
    Dm = [100;100]

    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    # @constraint(model, Cm[1,1]*(y[1].-x)<=Dm[1]) ######this doesn't work
    # @constraint(model, Cm[2,2]*(y[2].-x)<=Dm[2])

    # @constraint(model, Cm*(y-x)<=Dm) #####Supposed to be -100<=y[i]-x[i]<=100, i=1,2, but this doesn't work

    optimize!(model)
    @assert is_solved_and_feasible(model)

    # @show value.(y)
    return objective_value(model), value.(y)
end

function V(x...)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(g::AbstractVector, x)
    _, y = solve_lower_level(x)
    g[1] = -y[1]^2
    return
end

function ∇²V(H::AbstractMatrix, x...)
    _, y = solve_lower_level(x...)
    H[1, 1] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V, ∇²V)
@objective(model, Min, x^2 + x^2 + 0.5*op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Thanks for any inputs!

As usual, please give the complete error message in a Markdown code block (with ```julia-repl) instead of a screenshot. See the PSA for more such advice.

The syntax for univariate functions is different from that for multivariate functions.

See Nonlinear Modeling · JuMP
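
For example (a minimal sketch with made-up function names, not your code): the univariate form takes and returns scalars, while the multivariate form is splatted and fills a pre-allocated gradient vector in place.

```julia
using JuMP, Ipopt

# Univariate operator (dimension 1): the derivative callbacks take a scalar
# and return a scalar.
f(x) = x^2
∇f(x) = 2x
∇²f(x) = 2.0

# Multivariate operator (dimension >= 2): the arguments are splatted, and the
# gradient callback fills a pre-allocated vector in place and returns nothing.
g(x...) = x[1]^2 + x[2]^2
function ∇g(grad::AbstractVector, x...)
    grad[1] = 2x[1]
    grad[2] = 2x[2]
    return
end

model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, x[1:2] >= 0)
@operator(model, op_f, 1, f, ∇f, ∇²f)  # univariate signatures
@operator(model, op_g, 2, g, ∇g)       # multivariate signatures; no Hessian given
@objective(model, Min, op_f(x[1]) + op_g(x[1], x[2]))
optimize!(model)
```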

Thanks! I still cannot figure it out even for a single variable. The code I have runs if I don’t use the upper-level x in the lower-level problem. If I use x in the lower-level problem, for instance in a multiplication, then I get an error.

using JuMP, Ipopt

function solve_lower_level(x)

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])

    # @objective(
    #     model,
    #     Min,
    #     y[1]*y[1]+ y[2]*y[2]
    # ) ####This works

    @expression(model,testExp, y[1]*x*y[1]+ y[2]*x*y[2]) ### This doesn't work

    @objective(
        model,
        Min,
        testExp
    )
    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)

    optimize!(model)
    @assert is_solved_and_feasible(model)

    return objective_value(model), value.(y)
end

function V(x)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(x)
    _, y = solve_lower_level(x)
    return -y[1]^2
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V)
@objective(model, Min, x^2 + x^2 + 0.5*op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Error message
ERROR: Unable to register the function :op_V.

Common reasons for this include:

 * The function takes `f(x::Vector)` as input, instead of the splatted
   `f(x...)`.
 * The function assumes `Float64` will be passed as input, it must work for any
   generic `Real` type.
 * The function allocates temporary storage using `zeros(3)` or similar. This
   defaults to `Float64`, so use `zeros(T, 3)` instead.

As examples, instead of:
julia
my_function(x::Vector) = sum(x.^2)

use:
julia
my_function(x::T...) where {T<:Real} = sum(x[i]^2 for i in 1:length(x))


Instead of:
julia
function my_function(x::Float64...)
    y = zeros(length(x))
    for i in 1:length(x)
        y[i] = x[i]^2
    end
    return sum(y)
end

use:
julia
function my_function(x::T...) where {T<:Real}
    y = zeros(T, length(x))
    for i in 1:length(x)
        y[i] = x[i]^2
    end
    return sum(y)
end


Review the stacktrace below for more information, but it can often be hard to
understand why and where your function is failing. You can also debug this
outside of JuMP as follows:
julia
import ForwardDiff

# If the input dimension is 1
x = 1.0
my_function(a) = a^2
ForwardDiff.derivative(my_function, x)

# If the input dimension is more than 1
x = [1.0, 2.0]
my_function(a, b) = a^2 + b^2
ForwardDiff.gradient(x -> my_function(x...), x)


Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] _validate_register_assumptions(f::typeof(∇V), name::Symbol, dimension::Int64)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:327
  [3] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function, f′::Function)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:347
  [4] register_operator(::MathOptInterface.Nonlinear.OperatorRegistry, ::Symbol, ::Int64, ::Function, ::Vararg{Function})
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:399
  [5] register_operator
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\model.jl:237 [inlined]
  [6] set(model::Ipopt.Optimizer, attr::MathOptInterface.UserDefinedFunction, args::Tuple{typeof(V), typeof(∇V)})
    @ Ipopt C:\.julia\packages\Ipopt\bqp63\src\MOI_wrapper.jl:490
  [7] set(b::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, attr::MathOptInterface.UserDefinedFunction, value::Tuple{typeof(V), typeof(∇V)})    
    @ MathOptInterface.Bridges C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:942
  [8] _pass_attribute(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap, attr::MathOptInterface.UserDefinedFunction)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:51
  [9] pass_attributes(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:38
 [10] default_copy_to(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:391
 [11] copy_to
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:442 [inlined]
 [12] optimize!
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\MathOptInterface.jl:121 [inlined]
 [13] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\cachingoptimizer.jl:321
 [14] optimize!(model::Model; ignore_optimize_hook::Bool, _differentiation_backend::MathOptInterface.Nonlinear.SparseReverseMode, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:595
 [15] optimize!(model::Model)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:546
 [16] top-level scope
    @ c:\Tests_diff3.jl:44

caused by: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})

Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat
   @ Base rounding.jl:207
  (::Type{T})(::T) where T<:Number
   @ Core boot.jl:792
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number}
   @ Base char.jl:50
  ...

Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Base .\number.jl:7
  [2] _complex_convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:22
  [3] *(lhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1}, rhs::VariableRef)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:48
  [4] *(lhs::VariableRef, rhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:105
  [5] operate(::typeof(*), ::VariableRef, ::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ MutableArithmetics C:\.julia\packages\MutableArithmetics\SXYDN\src\interface.jl:210
  [6] macro expansion
    @ C:\.julia\packages\MutableArithmetics\SXYDN\src\rewrite.jl:340 [inlined]
  [7] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros.jl:257 [inlined]
  [8] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros\@expression.jl:86 [inlined]
  [9] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros.jl:393 [inlined]
 [10] solve_lower_level(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Main c:\Tests_diff3.jl:14
 [11] ∇V(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Main c:\Tests_diff3.jl:36
 [12] derivative
    @ C:\.julia\packages\ForwardDiff\PcZ48\src\derivative.jl:14 [inlined]
 [13] _validate_register_assumptions(f::typeof(∇V), name::Symbol, dimension::Int64)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:321
 [14] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function, f′::Function)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:347
 [15] register_operator(::MathOptInterface.Nonlinear.OperatorRegistry, ::Symbol, ::Int64, ::Function, ::Vararg{Function})
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:399
 [16] register_operator
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\model.jl:237 [inlined]
 [17] set(model::Ipopt.Optimizer, attr::MathOptInterface.UserDefinedFunction, args::Tuple{typeof(V), typeof(∇V)})
    @ Ipopt C:\.julia\packages\Ipopt\bqp63\src\MOI_wrapper.jl:490
 [18] set(b::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, attr::MathOptInterface.UserDefinedFunction, value::Tuple{typeof(V), typeof(∇V)})    
    @ MathOptInterface.Bridges C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:942
 [19] _pass_attribute(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap, attr::MathOptInterface.UserDefinedFunction)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:51
 [20] pass_attributes(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:38
 [21] default_copy_to(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:391
 [22] copy_to
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:442 [inlined]
 [23] optimize!
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\MathOptInterface.jl:121 [inlined]
 [24] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\cachingoptimizer.jl:321
 [25] optimize!(model::Model; ignore_optimize_hook::Bool, _differentiation_backend::MathOptInterface.Nonlinear.SparseReverseMode, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:595
 [26] optimize!(model::Model)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:546
 [27] top-level scope
    @ c:\Tests_diff3.jl:44

Hmm. I think this is because you haven’t provided the second derivative, so we try to automatically complete it using ForwardDiff.jl. I should clarify the documentation.
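
If you want to reproduce this outside of JuMP (a sketch, following the debugging advice printed in the error message itself):

```julia
import ForwardDiff

# With only V and ∇V supplied for a univariate operator, registration tries to
# obtain the second derivative by ForwardDiff-ing ∇V. The Dual number then
# reaches `y[1] * x * y[1]` inside solve_lower_level, where `Dual * VariableRef`
# has no method, which is the MethodError in the stacktrace above.
ForwardDiff.derivative(∇V, 1.0)  # reproduces the failure without calling optimize!
```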

I think you need something like:

using JuMP, Ipopt

function solve_lower_level(x)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    @objective(model, Min, y[1] * x * y[1]+ y[2] * x * y[2])
    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return objective_value(model), value.(y)
end

function V(x)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(x)
    _, y = solve_lower_level(x)
    return -(y[1]^2 + y[2]^2)
end

function ∇²V(x)
    _, y = solve_lower_level(x)
    return -2 * y[1] - 2 * y[2]
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V, ∇²V)
@objective(model, Min, x^2 + x^2 + 0.5 * op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Yes, that runs, thanks. The thing is that I don’t have second derivatives for my original problem, so I thought I could get away without them :frowning:

For future reference, I found the part of the documentation that says ForwardDiff cannot differentiate through a nested JuMP problem on its own: Nonlinear Modeling · JuMP


Perhaps you could try:

function V(x, tmp)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(g, x, tmp)
    _, y = solve_lower_level(x)
    g[1] = -(y[1]^2 + y[2]^2)
    g[2] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 2, V, ∇V)
@objective(model, Min, x^2 + x^2 + 0.5 * op_V(x, 0))

That works, thank you! Why it works is a different question :sweat_smile: From all I understand, we shouldn’t even be able to pass two arguments to `op_V` without `x...`.

Thanks again!

Why it works

We have a different implementation for univariate and multivariate functions. This is confusing. We should probably be able to opt in to the multivariate implementation even for N = 1.

Edit: I’ve opened an issue Univariate user-defined functions with the multivariate signature · Issue #2534 · jump-dev/MathOptInterface.jl · GitHub

we shouldn’t even be able to pass two arguments

This is just a regular Julia thing. These two are the same:

foo(x, y) = x + y
foo(x...) = x[1] + x[2]

Although the `x...` version doesn’t give as nice an error if you call it with the wrong number of arguments.
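
For example (a hypothetical sketch, names made up):

```julia
bar(x, y) = x + y
bar_splat(x...) = x[1] + x[2]

bar(1.0)                  # MethodError: no method matching bar(::Float64)
bar_splat(1.0)            # BoundsError: the call is accepted and only fails when x[2] is indexed
bar_splat(1.0, 2.0, 3.0)  # no error at all: the extra argument is silently ignored
```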
