Nested optimization - matrix multiplication and internal functions

I have a nested optimization problem in JuMP that fails to run because ForwardDiff `Dual` numbers end up being passed through to the inner problem. In my original problem the lower-level problem is a convex QP, so it is possible to compute correct gradients, but first the code needs to run. For simplicity I have set a scalar outer variable, but then the problem fails in an even stranger way:
[screenshot of the error message]

The problem for the MWE:
$\min\limits_x \; 2x^2+0.5\,V(x)$ where $x\geq 0$ and $V(x)=\max\limits_y \; y^\top \begin{bmatrix} x&0\\0&x \end{bmatrix} y$ s.t. $(y_1-1)^2+(y_2-1)^2\leq 2$, $-100 \leq y_i-x\leq 100$, $i=1,2$

Case 1: the outer variable x is a vector. Then the code runs, but building the matrix expression through helper functions fails, and the vectorized matrix-vector constraint fails:

Code case 1
using JuMP, Ipopt

function some_function(a,x...)
    return a*x
end

function solve_lower_level(x...)
    sm_fcn = x -> 1.0*x

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    # @expression(model,testexp,[sm_fcn(x...)[1] 0.0; 0.0 x[2]]') ####Doesn't run
    # @expression(model,testexp,[some_function(1.0,x...) 0.0; 0.0 x[2]]') ####Doesn't run
    @expression(model,testexp,[1.0*x[1] 0.0; 0.0 x[2]]') ####Runs

    @objective(
        model,
        Max,
        y'*[1.0*x[1] 0.0; 0.0 x[2]]'*y
    )

    Cm = [1 0;0 -1]
    Dm = [100;100]

    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    @constraint(model, Cm[1,1]*(y[1]-x[1])<=Dm[1]) ###### If I use this, the code runs
    @constraint(model, Cm[2,2]*(y[2]-x[2])<=Dm[2])

    # @constraint(model, Cm*(y-x)<=Dm) #####Supposed to be -100<=y[i]-x[i]<=100, i=1,2, but this doesn't work

    optimize!(model)
    @assert is_solved_and_feasible(model)


    # @show value.(y)
    return objective_value(model), value.(y)
end

function V(x...)
    f, _ = solve_lower_level(x...)
    return f
end

function ∇V(g::AbstractVector, x...)
    _, y = solve_lower_level(x...)
    g[1] = -y[1]^2
    g[2] = -y[1]^2
    return
end

function ∇²V(H::AbstractMatrix, x...)
    _, y = solve_lower_level(x...)
    H[1, 1] = 0.0
    H[2, 2] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2] >= 0)
@operator(model, op_V, 2, V, ∇V, ∇²V)
@objective(model, Min, x[1]^2 + x[2]^2 + 0.5*op_V(x[1], x[2]))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Case 2: the outer variable x is a scalar. None of the commented-out lines that use x run, so I must be making a JuMP error:

Code case 2
using JuMP, Ipopt

function some_function(a,x)
    return a*x
end

function solve_lower_level(x)
    sm_fcn = x -> 1.0*x

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    # @expression(model,testexp,[sm_fcn(x) 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[some_function(1.0,x) 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[1.0*x 0.0; 0.0 x]') ####Doesn't run
    # @expression(model,testexp,[x 0.0; 0.0 x]') ####Doesn't run

    @objective(
        model,
        Max,
        y[1]*y[1]+ y[2]*y[2]
    )

    Cm = [1 0;0 -1]
    Dm = [100;100]

    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    # @constraint(model, Cm[1,1]*(y[1].-x)<=Dm[1]) ######this doesn't work
    # @constraint(model, Cm[2,2]*(y[2].-x)<=Dm[2])

    # @constraint(model, Cm*(y-x)<=Dm) #####Supposed to be -100<=y[i]-x[i]<=100, i=1,2, but this doesn't work

    optimize!(model)
    @assert is_solved_and_feasible(model)

    # @show value.(y)
    return objective_value(model), value.(y)
end

function V(x...)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(g::AbstractVector, x)
    _, y = solve_lower_level(x)
    g[1] = -y[1]^2
    return
end

function ∇²V(H::AbstractMatrix, x...)
    _, y = solve_lower_level(x...)
    H[1, 1] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V, ∇²V)
@objective(model, Min, x^2 + x^2 + 0.5*op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Thanks for any inputs!

As usual, please give the complete error message in a Markdown code block (with ```julia-repl) instead of a screenshot. See the PSA for more such advice.

The syntax for univariate functions is different from that for multivariate functions.

See Nonlinear Modeling · JuMP
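
For example (a minimal sketch with made-up function names, not your code): the univariate form takes and returns scalars, while the multivariate form is splatted and fills a pre-allocated gradient vector in place.

```julia
using JuMP, Ipopt

# Univariate operator (dimension 1): the derivative callbacks take a scalar
# and return a scalar.
f(x) = x^2
∇f(x) = 2x
∇²f(x) = 2.0

# Multivariate operator (dimension >= 2): the arguments are splatted, and the
# gradient callback fills a pre-allocated vector in place and returns nothing.
g(x...) = x[1]^2 + x[2]^2
function ∇g(grad::AbstractVector, x...)
    grad[1] = 2x[1]
    grad[2] = 2x[2]
    return
end

model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, x[1:2] >= 0)
@operator(model, op_f, 1, f, ∇f, ∇²f)  # univariate signatures
@operator(model, op_g, 2, g, ∇g)       # multivariate signatures; no Hessian given
@objective(model, Min, op_f(x[1]) + op_g(x[1], x[2]))
optimize!(model)
```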

Thanks! I still cannot figure it out even for a single variable. The code I have runs if I don’t use the upper-level x in the lower-level problem. If I use x in the lower-level problem, for instance in a multiplication, then I get an error.

using JuMP, Ipopt

function solve_lower_level(x)

    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])

    # @objective(
    #     model,
    #     Min,
    #     y[1]*y[1]+ y[2]*y[2]
    # ) ####This works

    @expression(model,testExp, y[1]*x*y[1]+ y[2]*x*y[2]) ### This doesn't work

    @objective(
        model,
        Min,
        testExp
    )
    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)

    optimize!(model)
    @assert is_solved_and_feasible(model)

    return objective_value(model), value.(y)
end

function V(x)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(x)
    _, y = solve_lower_level(x)
    return -y[1]^2
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V)
@objective(model, Min, x^2 + x^2 + 0.5*op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Error message
ERROR: Unable to register the function :op_V.

Common reasons for this include:

 * The function takes `f(x::Vector)` as input, instead of the splatted
   `f(x...)`.
 * The function assumes `Float64` will be passed as input, it must work for any
   generic `Real` type.
 * The function allocates temporary storage using `zeros(3)` or similar. This
   defaults to `Float64`, so use `zeros(T, 3)` instead.

As examples, instead of:
julia
my_function(x::Vector) = sum(x.^2)

use:
julia
my_function(x::T...) where {T<:Real} = sum(x[i]^2 for i in 1:length(x))


Instead of:
julia
function my_function(x::Float64...)
    y = zeros(length(x))
    for i in 1:length(x)
        y[i] = x[i]^2
    end
    return sum(y)
end

use:
julia
function my_function(x::T...) where {T<:Real}
    y = zeros(T, length(x))
    for i in 1:length(x)
        y[i] = x[i]^2
    end
    return sum(y)
end


Review the stacktrace below for more information, but it can often be hard to
understand why and where your function is failing. You can also debug this
outside of JuMP as follows:
julia
import ForwardDiff

# If the input dimension is 1
x = 1.0
my_function(a) = a^2
ForwardDiff.derivative(my_function, x)

# If the input dimension is more than 1
x = [1.0, 2.0]
my_function(a, b) = a^2 + b^2
ForwardDiff.gradient(x -> my_function(x...), x)


Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] _validate_register_assumptions(f::typeof(∇V), name::Symbol, dimension::Int64)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:327
  [3] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function, f′::Function)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:347
  [4] register_operator(::MathOptInterface.Nonlinear.OperatorRegistry, ::Symbol, ::Int64, ::Function, ::Vararg{Function})
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:399
  [5] register_operator
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\model.jl:237 [inlined]
  [6] set(model::Ipopt.Optimizer, attr::MathOptInterface.UserDefinedFunction, args::Tuple{typeof(V), typeof(∇V)})
    @ Ipopt C:\.julia\packages\Ipopt\bqp63\src\MOI_wrapper.jl:490
  [7] set(b::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, attr::MathOptInterface.UserDefinedFunction, value::Tuple{typeof(V), typeof(∇V)})    
    @ MathOptInterface.Bridges C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:942
  [8] _pass_attribute(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap, attr::MathOptInterface.UserDefinedFunction)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:51
  [9] pass_attributes(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:38
 [10] default_copy_to(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:391
 [11] copy_to
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:442 [inlined]
 [12] optimize!
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\MathOptInterface.jl:121 [inlined]
 [13] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\cachingoptimizer.jl:321
 [14] optimize!(model::Model; ignore_optimize_hook::Bool, _differentiation_backend::MathOptInterface.Nonlinear.SparseReverseMode, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:595
 [15] optimize!(model::Model)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:546
 [16] top-level scope
    @ c:\Tests_diff3.jl:44

caused by: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})

Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat
   @ Base rounding.jl:207
  (::Type{T})(::T) where T<:Number
   @ Core boot.jl:792
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number}
   @ Base char.jl:50
  ...

Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Base .\number.jl:7
  [2] _complex_convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:22
  [3] *(lhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1}, rhs::VariableRef)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:48
  [4] *(lhs::VariableRef, rhs::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\operators.jl:105
  [5] operate(::typeof(*), ::VariableRef, ::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ MutableArithmetics C:\.julia\packages\MutableArithmetics\SXYDN\src\interface.jl:210
  [6] macro expansion
    @ C:\.julia\packages\MutableArithmetics\SXYDN\src\rewrite.jl:340 [inlined]
  [7] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros.jl:257 [inlined]
  [8] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros\@expression.jl:86 [inlined]
  [9] macro expansion
    @ C:\.julia\packages\JuMP\7rBNn\src\macros.jl:393 [inlined]
 [10] solve_lower_level(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Main c:\Tests_diff3.jl:14
 [11] ∇V(x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(∇V), Float64}, Float64, 1})
    @ Main c:\Tests_diff3.jl:36
 [12] derivative
    @ C:\.julia\packages\ForwardDiff\PcZ48\src\derivative.jl:14 [inlined]
 [13] _validate_register_assumptions(f::typeof(∇V), name::Symbol, dimension::Int64)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:321
 [14] MathOptInterface.Nonlinear._UnivariateOperator(op::Symbol, f::Function, f′::Function)
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:347
 [15] register_operator(::MathOptInterface.Nonlinear.OperatorRegistry, ::Symbol, ::Int64, ::Function, ::Vararg{Function})
    @ MathOptInterface.Nonlinear C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\operators.jl:399
 [16] register_operator
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Nonlinear\model.jl:237 [inlined]
 [17] set(model::Ipopt.Optimizer, attr::MathOptInterface.UserDefinedFunction, args::Tuple{typeof(V), typeof(∇V)})
    @ Ipopt C:\.julia\packages\Ipopt\bqp63\src\MOI_wrapper.jl:490
 [18] set(b::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, attr::MathOptInterface.UserDefinedFunction, value::Tuple{typeof(V), typeof(∇V)})    
    @ MathOptInterface.Bridges C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:942
 [19] _pass_attribute(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap, attr::MathOptInterface.UserDefinedFunction)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:51
 [20] pass_attributes(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}, index_map::MathOptInterface.Utilities.IndexMap)
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:38
 [21] default_copy_to(dest::MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, src::MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\copy.jl:391
 [22] copy_to
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\Bridges\bridge_optimizer.jl:442 [inlined]
 [23] optimize!
    @ C:\.julia\packages\MathOptInterface\aJZbq\src\MathOptInterface.jl:121 [inlined]
 [24] optimize!(m::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{Ipopt.Optimizer}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}})
    @ MathOptInterface.Utilities C:\.julia\packages\MathOptInterface\aJZbq\src\Utilities\cachingoptimizer.jl:321
 [25] optimize!(model::Model; ignore_optimize_hook::Bool, _differentiation_backend::MathOptInterface.Nonlinear.SparseReverseMode, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:595
 [26] optimize!(model::Model)
    @ JuMP C:\.julia\packages\JuMP\7rBNn\src\optimizer_interface.jl:546
 [27] top-level scope
    @ c:\Tests_diff3.jl:44

Hmm. I think this is because you haven’t provided the second derivative, so we try to automatically complete it using ForwardDiff.jl. I should clarify the documentation.
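
If you want to reproduce this outside of JuMP (a sketch, following the debugging advice printed in the error message itself):

```julia
import ForwardDiff

# With only V and ∇V supplied for a univariate operator, registration tries to
# obtain the second derivative by ForwardDiff-ing ∇V. The Dual number then
# reaches `y[1] * x * y[1]` inside solve_lower_level, where `Dual * VariableRef`
# has no method, which is the MethodError in the stacktrace above.
ForwardDiff.derivative(∇V, 1.0)  # reproduces the failure without calling optimize!
```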

I think you need something like:

using JuMP, Ipopt

function solve_lower_level(x)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, y[1:2])
    @objective(model, Min, y[1] * x * y[1]+ y[2] * x * y[2])
    @constraint(model, (y[1] - 1)^2 + (y[2] - 1)^2 <= 2)
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return objective_value(model), value.(y)
end

function V(x)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(x)
    _, y = solve_lower_level(x)
    return -(y[1]^2 + y[2]^2)
end

function ∇²V(x)
    _, y = solve_lower_level(x)
    return -2 * y[1] - 2 * y[2]
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 1, V, ∇V, ∇²V)
@objective(model, Min, x^2 + x^2 + 0.5 * op_V(x))
optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)

Yes, that runs, thanks. The thing is that I don’t have second derivatives for my original problem, so I thought I could get away without them :frowning:

For future reference, I found the part of the documentation that says ForwardDiff cannot differentiate through a nested JuMP problem on its own: Nonlinear Modeling · JuMP


Perhaps you could try:

function V(x, tmp)
    f, _ = solve_lower_level(x)
    return f
end

function ∇V(g, x, tmp)
    _, y = solve_lower_level(x)
    g[1] = -(y[1]^2 + y[2]^2)
    g[2] = 0.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x >= 0)
@operator(model, op_V, 2, V, ∇V)
@objective(model, Min, x^2 + x^2 + 0.5 * op_V(x, 0))

That works, thank you! Why it works is a different question :sweat_smile: From all I understand, we shouldn’t even be able to pass two arguments to `op_V` without `x...`.

Thanks again!

Why it works

We have a different implementation for univariate and multivariate functions. This is confusing. We should probably be able to opt in to the multivariate implementation even for N = 1.

Edit: I’ve opened an issue Univariate user-defined functions with the multivariate signature · Issue #2534 · jump-dev/MathOptInterface.jl · GitHub

we shouldn’t even be able to pass two arguments

This is just a regular Julia thing. These two are the same:

foo(x, y) = x + y
foo(x...) = x[1] + x[2]

Although the `x...` version doesn’t give as nice an error if you call it with the wrong number of arguments.
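
For example (a hypothetical sketch, names made up):

```julia
bar(x, y) = x + y
bar_splat(x...) = x[1] + x[2]

bar(1.0)                  # MethodError: no method matching bar(::Float64)
bar_splat(1.0)            # BoundsError: the call is accepted and only fails when x[2] is indexed
bar_splat(1.0, 2.0, 3.0)  # no error at all: the extra argument is silently ignored
```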
