Implementing the log-sum-exp function

Hi everyone!

I am working on stochastic model predictive control in Julia/JuMP. I am essentially using a log-sum-exp function in the objective and encountered some issues with overflow. Now I am trying to implement a reformulation of the log-sum-exp function from [1]:

y = x_k + \log\left(1 + \sum_{i \neq k} e^{x_i - x_k}\right), \qquad k = \operatorname{argmax}_i x_i
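
To make the question below concrete, in plain Julia (outside of JuMP) I would write this roughly as follows; `logsumexp_shifted` is just my own placeholder name:

function logsumexp_shifted(x::AbstractVector{<:Real})
    k = argmax(x)                    # index of the largest entry
    xmax = x[k]
    # sum over every index except k; init covers the length-1 case
    s = sum((exp(x[i] - xmax) for i in eachindex(x) if i != k); init = 0.0)
    return xmax + log1p(s)           # log1p(s) = log(1 + s), accurate when s is small
end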

I know JuMP is able to handle the max operation on x. But how is the sum without index k = argmax(x) implemented?

Any general pointers are greatly appreciated as well :slight_smile:

[1] Pierre Blanchard, Desmond J. Higham, and Nicholas J. Higham. Accurately computing the log-sum-exp and softmax functions. IMA Journal of Numerical Analysis, 41(4):2311–2330, 2020.


It’s already implemented here: https://github.com/JuliaStats/LogExpFunctions.jl (a Julia package for various special functions based on `log` and `exp`).

I assume you know this, but your expression is analytically equivalent to \log\left(\sum_i e^{x_i}\right), i.e., just factoring out the largest term. You could also write it as y = x_\mathrm{max} + \log\left(\sum_i e^{x_i-x_\mathrm{max}}\right) and not worry about any index shenanigans.
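
For ordinary numeric vectors you can check the two forms against each other; this is just an illustration, assuming LogExpFunctions is installed:

using LogExpFunctions

x = [1000.0, 1001.0, 999.0]
logsumexp(x)                                    # ≈ 1001.41, no overflow
maximum(x) + log(sum(exp.(x .- maximum(x))))    # same value, shift done by hand
log(sum(exp.(x)))                               # Inf, because exp(1000.0) overflows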


You might be interested in Convex.jl, whose disciplined convex programming reformulation can handle logsumexp provided you use an adequate solver.

Thank you for the answer! I tried that package, but it does not accept JuMP variable types as arguments, and I work within the JuMP framework. Therefore I am going about this manually…

The analytically equivalent form gives overflow issues, and my solver (Ipopt) also struggles when I disregard the max-index handling in the sum.

I was actually eyeing Convex.jl for its explicit handling of the logsumexp.

However, I work with an SMPC that has nonlinear model constraints, which makes the problem nonconvex. I still use a local interior-point solver (Ipopt), as we assume our initial state x_0 at the time of re-solving is close enough to the optimum. I guess that is just common practice for real-time optimization.
Would Convex.jl be applicable in this case?

I don’t understand why the equivalent form x_{\text{max}} + \log(\sum_i \exp(x_i - x_{\text{max}})) suggested by Tim is not suitable.

For example, the following fails:

using JuMP
using Ipopt

function logsumexp(x)
    # naive form: exp overflows once any entry exceeds roughly 709 in Float64
    return log(sum(exp.(x)))
end

function main()
    n = 10
    model = Model(Ipopt.Optimizer)
    @variable(model, x[1:n])
    @constraint(model, x .≥ 0)
    # x[2] ≥ x[1] + 1000 ≥ 1000, so exp(x[2]) overflows when the naive objective is evaluated
    @constraint(model, x[1] ≤ x[2] - 1000)
    @objective(model, Min, logsumexp(x))

    optimize!(model)
    xval = value.(x)
    @info xval
end

But it works when I change logsumexp as follows:

function logsumexp(x)
    xmax = maximum(x)
    # shifting by xmax keeps every exponent ≤ 0, so exp can no longer overflow
    return xmax + log(sum(exp.(x .- xmax)))
end

I don’t think so. Convex.jl must certify that the problem is convex, which it can fail to do even for some convex problems.

Maybe because logsumexp is not one of the supported nonlinear operators? You may want to look into user-defined operators.
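
Roughly, and untested (assuming a recent JuMP where the @operator macro is available; the vector argument has to be splatted into scalar arguments), that could look like:

using JuMP, Ipopt

# Shifted evaluation; this is a sketch of the idea, not a drop-in solution.
function shifted_lse(y...)
    ymax = maximum(y)
    return ymax + log(sum(exp(yi - ymax) for yi in y))
end

n = 10
model = Model(Ipopt.Optimizer)
@variable(model, x[1:n] >= 0)
@operator(model, op_logsumexp, n, shifted_lse)   # operator with n scalar arguments
@objective(model, Min, op_logsumexp(x...))

As far as I understand, a user-defined operator is evaluated as a regular Julia function during the solve (with derivatives from automatic differentiation), so the max shift is actually executed, unlike a purely algebraic rewrite.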


What Convex.jl does is reformulate logsumexp as a conic programming problem, where you add a constraint that a particular expression belongs to an exponential cone. This is in some ways quite different to a rewriting of the formula, because with this approach, the solver does not execute the function you wrote, but rather receives an algebraic formulation of the problem (not e.g. a function pointer). To do a conic reformulation of logsumexp in JuMP, see: How to implment `logsumexp` function in JuMP? - #2 by odow

Your solver needs to be able to handle exponential cones for this approach to work (both for Convex.jl and for JuMP.jl). I’m not sure how many nonlinear solvers can handle conic constraints, and I don’t think Ipopt can. So unfortunately this approach is probably not suitable here.
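
For reference, my reading of that conic pattern written directly in JuMP looks roughly like this (a sketch only; SCS is just one example of a solver that supports exponential cones):

using JuMP
import SCS

n = 10
model = Model(SCS.Optimizer)
@variable(model, x[1:n] >= 0)
@variable(model, t)          # at the optimum, t equals logsumexp(x)
@variable(model, u[1:n])
@constraint(model, sum(u) <= 1)
# MOI.ExponentialCone() is the set {(a, b, c) : b * exp(a / b) <= c, b > 0},
# so each constraint below enforces u[i] >= exp(x[i] - t), i.e. logsumexp(x) <= t.
@constraint(model, [i = 1:n], [x[i] - t, 1, u[i]] in MOI.ExponentialCone())
@objective(model, Min, t)
optimize!(model)

The epigraph variable t only bounds logsumexp from above, which is exactly what you need when it is being minimized.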


I don’t know JuMP, but I would have presumed that you would want to avoid intermediate allocations for functions used repeatedly in an optimization setting.

So, for example

function logsumexp(x)
    xmax = maximum(x) 
    exp_ = v -> exp(v - xmax) 
    return xmax + log(sum(exp_, x)) 
end

or

function logsumexp(x)
    xmax = maximum(x) 
    return xmax + log(sum(exp(v - xmax) for v in x)) 
end

AFAICT this does not matter to JuMP, because it does not actually call the function during optimization. It constructs a kind of symbolic representation internally, which is also why only a subset of operators are supported.
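
A quick way to see that, assuming a recent JuMP with the new nonlinear interface:

using JuMP

model = Model()
@variable(model, x[1:3])
expr = log(sum(exp.(x)))
typeof(expr)   # a NonlinearExpr: JuMP recorded the expression tree, not the function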


Your example works, thank you for clarifying! :slight_smile: I think I’m getting an evaluation error from somewhere else, since the solver still spits out warnings in my code.