How to implement a `logsumexp` function in JuMP?

Hi,
as I described my situation here, I’m trying to use DiffOpt.jl for my custom neural network training.

I realised that if I can express the objective function with a logsumexp function, I can make this work.

However, I can't find any logsumexp for JuMP.
For example, if I use Flux.logsumexp, I get an error like this:

ERROR: MethodError: no method matching isless(::AffExpr, ::AffExpr)
Closest candidates are:
  isless(::Any, ::Missing) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/missing.jl:88
  isless(::Missing, ::Any) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/missing.jl:87
Stacktrace:
  [1] max(x::AffExpr, y::AffExpr)
    @ Base operators.jl:492
  [2] mapreduce_impl(f::typeof(identity), op::typeof(max), A::Matrix{AffExpr}, first::Int64, last::Int64)
    @ Base reduce.jl:635
  [3] _mapreduce(f::typeof(identity), op::typeof(max), #unused#::IndexLinear, A::Matrix{AffExpr})
    @ Base reduce.jl:417
  [4] _mapreduce_dim(f::typeof(identity), op::typeof(max), #unused#::Base._InitialValue, A::Matrix{AffExpr}, #unused#::Colon)
    @ Base reducedim.jl:330
  [5] mapreduce(f::typeof(identity), op::typeof(max), A::Matrix{AffExpr}; dims::Colon, init::Base._InitialValue)
    @ Base reducedim.jl:322
  [6] mapreduce(f::typeof(identity), op::typeof(max), A::Matrix{AffExpr})
    @ Base reducedim.jl:322
  [7] _maximum(f::typeof(identity), a::Matrix{AffExpr}, ::Colon; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base reducedim.jl:894
  [8] _maximum(f::typeof(identity), a::Matrix{AffExpr}, ::Colon)
    @ Base reducedim.jl:894
  [9] _maximum(a::Matrix{AffExpr}, ::Colon; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base reducedim.jl:893
 [10] _maximum(a::Matrix{AffExpr}, ::Colon)
    @ Base reducedim.jl:893
 [11] maximum(a::Matrix{AffExpr}; dims::Colon, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base reducedim.jl:889
 [12] (::Base.var"#maximum##kw")(::NamedTuple{(:dims,), Tuple{Colon}}, ::typeof(maximum), a::Matrix{AffExpr})
    @ Base reducedim.jl:889
 [13] logsumexp(x::Matrix{AffExpr}; dims::Colon)
    @ NNlib ~/.julia/packages/NNlib/tvMmZ/src/softmax.jl:142
 [14] logsumexp(x::Matrix{AffExpr})
    @ NNlib ~/.julia/packages/NNlib/tvMmZ/src/softmax.jl:142
 [15] infer_test(nn::PLSE)
    @ Main ~/.julia/dev/ParametrisedConvexApproximators/test/tmp.jl:36

Is there any workaround for this?

You typically can’t use arbitrary nonlinear functions in JuMP without registering them as user-defined functions: Nonlinear Modeling · JuMP.
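
For example, here is a minimal sketch of the registration mechanism (the my_logsumexp name, the variable bounds, and the choice of Ipopt are illustrative assumptions, not part of your setup):

using JuMP
import Ipopt

N = 10
# Naive logsumexp over splatted scalar arguments (not numerically stabilised).
my_logsumexp(x...) = log(sum(exp.(x)))

model = Model(Ipopt.Optimizer)
@variable(model, x[1:N] >= 1)  # bounds only to keep the toy problem bounded below
register(model, :my_logsumexp, N, my_logsumexp; autodiff = true)
@NLobjective(model, Min, my_logsumexp(x...))
optimize!(model)
objective_value(model)  # ≈ 1 + log(N)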

Do you have a reproducible example of what you’re trying to achieve?

The Mosek Modeling Cookbook (Exponential cone optimization — MOSEK Modeling Cookbook 3.3.0) has a conic formulation of logsumexp:

N = 10
model = Model()
@variable(model, x[1:N])
@NLobjective(model, Min, log(sum(exp(x[i]) for i in 1:N)))
# becomes
model = Model()
@variable(model, x[1:N])
@variable(model, u[1:N])
@variable(model, t)
@constraint(model, sum(u) <= 1)
@constraint(model, [i=1:N], [x[i] - t, 1, u[i]] in MOI.ExponentialCone())
@objective(model, Min, t)
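
For reference, the reformulation is the standard epigraph trick: with t an upper bound on the objective,

t \ge \log \sum_{i=1}^N e^{x_i}
\iff \sum_{i=1}^N e^{x_i - t} \le 1
\iff \exists\, u : \ \sum_{i=1}^N u_i \le 1, \quad u_i \ge e^{x_i - t},

and each u_i \ge e^{x_i - t} is exactly the constraint [x[i] - t, 1, u[i]] in MOI.ExponentialCone(), since that cone is the set \{(a, b, c) : b\, e^{a/b} \le c,\ b > 0\}. Minimising t then recovers the original optimal value.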

An alternative would be to replace JuMP.jl with Convex.jl, which supports logsumexp (provided you use the right solvers)
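
For completeness, a minimal sketch with Convex.jl's logsumexp atom (the bound x >= 1 is only there to keep the toy problem bounded; SCS supports the exponential cone):

using Convex
import SCS

N = 10
x = Variable(N)
problem = minimize(logsumexp(x), [x >= 1])
solve!(problem, SCS.Optimizer)
evaluate(logsumexp(x))  # ≈ 1 + log(N)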

I thought DiffOpt.jl does not support Convex.jl.
Is it possible? Do you have an MWE?

Could you use this example then?

using ParametrisedConvexApproximators
using JuMP
import DiffOpt
import SCS
import ChainRulesCore
import Flux
using Convex
using UnPack


function main()
    model = Model(() -> DiffOpt.diff_optimizer(SCS.Optimizer))
    empty!(model)
    set_silent(model)
    n, m, d = 3, 2, 1
    i_max = 20
    T = 1e-0
    h_array = [64]
    act = Flux.relu
    plse = PLSE(n, m, i_max, T, h_array, act)
    x = rand(n, d)
    @show plse(x, rand(m, d))
    u_star = infer_test(plse)
    return u_star
end

function infer_test(nn::PLSE)
    n, m, d = 3, 2, 1
    model = Model(() -> DiffOpt.diff_optimizer(SCS.Optimizer))
    @variable(model, u[1:m, 1:d])

    @unpack T = nn
    x = rand(n, d)
    tmp = ParametrisedConvexApproximators.affine_map(nn, x, u)
    # T * Flux.logsumexp((1/T)*tmp)
    @NLobjective(model, Min, T * log(sum(exp((1/T)*tmp[i]) for i in 1:nn.i_max)))
    optimize!(model)
    return value.(u)
end

And following your advice, I tried to convert it manually into an equivalent exponential cone program, but it doesn't seem to work and I'm stuck with it:

using ParametrisedConvexApproximators
using JuMP
import DiffOpt
import SCS
import ChainRulesCore
import Flux
using Convex
using UnPack


function main()
    model = Model(() -> DiffOpt.diff_optimizer(SCS.Optimizer))
    empty!(model)
    set_silent(model)
    n, m, d = 3, 2, 1
    i_max = 20
    T = 1e-0
    h_array = [64]
    act = Flux.relu
    plse = PLSE(n, m, i_max, T, h_array, act)
    x = rand(n, d)
    @show plse(x, rand(m, d))
    u_star = infer_test(plse)
    return u_star
end

function infer_test(nn::PLSE)
    n, m, d = 3, 2, 1
    model = Model(() -> DiffOpt.diff_optimizer(SCS.Optimizer))
    @variable(model, u[1:m])

    @unpack T = nn
    x = rand(n, d)
    tmp = ParametrisedConvexApproximators.affine_map(nn, x, u)
    # @NLobjective(model, Min, T * log(sum(exp((1/T)*tmp[i]) for i in 1:nn.i_max)))
    # equivalent form of logsumexp
    _tmp = (1/T) * tmp
    @variable(model, v[1:nn.i_max])
    @variable(model, t)
    @constraint(model, sum(v) <= 1)
    @constraint(model, [i=1:nn.i_max], [v[i], 1, _tmp[i, 1] - t] in MOI.ExponentialCone())
    @objective(model, Min, T*t)
    optimize!(model)
    return value.(u)
end

gives

julia> main()
plse(x, rand(m, d)) = [3.00289608300575;;]
ERROR: Constraints of type MathOptInterface.VectorAffineFunction{Float64}-in-MathOptInterface.ExponentialCone are not supported by the solver.

If you expected the solver to support your problem, you may have an error in your formulation. Otherwise, consider using a different solver.

The list of available solvers, along with the problem types they support, is available at https://jump.dev/JuMP.jl/stable/installation/#Supported-solvers.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] _moi_add_constraint(model::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{DiffOpt.Optimizer{MathOptInterface.Bridges.LazyBridgeOptimizer{MathOptInterface.Utilities.CachingOptimizer{SCS.Optimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}}}}}, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}}, f::MathOptInterface.VectorAffineFunction{Float64}, s::MathOptInterface.ExponentialCone)
    @ JuMP ~/.julia/packages/JuMP/2IF9U/src/constraints.jl:534
  [3] add_constraint(model::Model, con::VectorConstraint{AffExpr, MathOptInterface.ExponentialCone, VectorShape}, name::String)
    @ JuMP ~/.julia/packages/JuMP/2IF9U/src/constraints.jl:560
  [4] macro expansion
    @ ~/.julia/packages/JuMP/2IF9U/src/macros.jl:814 [inlined]
  [5] (::var"#228#232"{VariableRef, Vector{VariableRef}, Matrix{AffExpr}, Model})(i::Int64)
    @ Main ~/.julia/dev/ParametrisedConvexApproximators/test/tmp.jl:27
  [6] #37
    @ ~/.julia/packages/JuMP/2IF9U/src/Containers/container.jl:72 [inlined]
  [7] iterate
    @ ./generator.jl:47 [inlined]
  [8] collect(itr::Base.Generator{JuMP.Containers.VectorizedProductIterator{Tuple{Base.OneTo{Int64}}}, JuMP.Containers.var"#37#38"{var"#228#232"{VariableRef, Vector{VariableRef}, Matrix{AffExpr}, Model}}})
    @ Base ./array.jl:724
  [9] map(f::Function, A::JuMP.Containers.VectorizedProductIterator{Tuple{Base.OneTo{Int64}}})
    @ Base ./abstractarray.jl:2896
 [10] container
    @ ~/.julia/packages/JuMP/2IF9U/src/Containers/container.jl:72 [inlined]
 [11] container
    @ ~/.julia/packages/JuMP/2IF9U/src/Containers/container.jl:66 [inlined]
 [12] infer_test(nn::PLSE)
    @ Main ~/.julia/dev/ParametrisedConvexApproximators/test/tmp.jl:41
 [13] main()
    @ Main ~/.julia/dev/ParametrisedConvexApproximators/test/tmp.jl:23
 [14] top-level scope
    @ REPL[10]:1
 [15] top-level scope
    @ ~/.julia/packages/CUDA/sCev8/src/initialization.jl:52

Ah. DiffOpt doesn't support the exponential cone yet: Support exponential cone · Issue #50 · jump-dev/DiffOpt.jl · GitHub

Oh no… that's bad news…
It seems it's not the right time to move to Julia for my research yet :frowning:
Thanks a lot, @odow!