I am working on stochastic model predictive control in Julia/JuMP. I am using essentially a log-sum-exp function in the objective and encountered some issues with overflow. Now I am trying to implement a reformulation of the log-sum-exp function [1]:

y = x_\mathrm{max} + \log\left(1 + \sum_{i \ne k} e^{x_i - x_\mathrm{max}}\right), \quad k = \arg\max_i x_i

I know JuMP is able to handle the max operation on x. But how is the sum without the index k = argmax(x) implemented?
Any general pointers are greatly appreciated as well
[1] Pierre Blanchard, Desmond J Higham, and Nicholas J Higham. Accurately computing the log-sum-exp and softmax functions. IMA Journal of Numerical Analysis, 41(4):2311–2330, 08 2020
I assume you know this, but your expression is analytically equivalent to \log\left(\sum_i e^{x_i}\right), i.e., just factoring out the largest term. You could also write it as y = x_\mathrm{max} + \log\left(\sum_i e^{x_i-x_\mathrm{max}}\right) and not worry about any index shenanigans.
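To make the benefit of the shift concrete, here is a quick check in plain Julia (outside JuMP), with values large enough to overflow Float64:

```julia
x = [1000.0, 1000.0]

# Naive form: exp(1000) overflows Float64, so the result is Inf.
naive = log(sum(exp.(x)))

# Shifted form: the exponents are at most 0, so nothing overflows.
xmax = maximum(x)
shifted = xmax + log(sum(exp.(x .- xmax)))  # = 1000 + log(2)
```

The identity holds exactly for any shift, but subtracting the maximum guarantees every exponent is ≤ 0.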
Thank you for the answer! I tried that package, but since I work in the JuMP framework, it does not accept JuMP variable types as arguments. Therefore I am going about this manually…
The analytically equivalent form gives overflow issues, and my solver (Ipopt) also struggles when I disregard the max-index handling in the sum.
I was actually eyeing Convex.jl for its explicit handling of the logsumexp.
However, I work with an SMPC with nonlinear model constraints, which makes the problem nonconvex. I still use a local interior-point solver (Ipopt), as we assume our initial state x_0 at the time of re-solving is close enough to the optimum. I guess that is just common practice for real-time optimization.
Would Convex.jl be applicable in this case?
I don’t understand why the equivalent form x_{\text{max}} + \log(\sum_i \exp(x_i - x_{\text{max}})) suggested by Tim is not suitable.
For example, the following fails:
```julia
using JuMP
using Ipopt

function logsumexp(x)
    return log(sum(exp.(x)))
end

function main()
    n = 10
    model = Model(Ipopt.Optimizer)
    @variable(model, x[1:n])
    @constraint(model, x .≥ 0)
    @constraint(model, x[1] ≤ x[2] - 1000)
    @objective(model, Min, logsumexp(x))
    optimize!(model)
    xval = value.(x)
    @info xval
end
```
But it works when I change logsumexp as follows:
```julia
function logsumexp(x)
    xmax = maximum(x)
    return xmax + log(sum(exp.(x .- xmax)))
end
```
What Convex.jl does is reformulate logsumexp as a conic programming problem, where you add a constraint that a particular expression belongs to an exponential cone. This is in some ways quite different from rewriting the formula, because with this approach the solver does not execute the function you wrote; it receives an algebraic formulation of the problem (not, e.g., a function pointer). To do a conic reformulation of logsumexp in JuMP, see: How to implment `logsumexp` function in JuMP? - #2 by odow
Your solver needs to be able to handle exponential cones for this approach to work (both for Convex.jl and for JuMP.jl). I'm not sure many nonlinear solvers can handle conic constraints, and I don't think Ipopt can. So, unfortunately, this approach is probably not suitable here.
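For completeness, a sketch of the conic reformulation along the lines of the linked thread, assuming a solver that supports exponential cones (e.g. SCS or Mosek); the variable names here are my own:

```julia
using JuMP  # JuMP re-exports MOI (MathOptInterface)

# t ≥ logsumexp(x) is equivalent to:
#   sum(u) ≤ 1  and  exp(x[i] - t) ≤ u[i]  for each i,
# where each elementwise condition is the exponential-cone membership
#   (x[i] - t, 1, u[i]) ∈ {(a, b, c) : b * exp(a / b) ≤ c, b > 0}.
n = 10
model = Model()  # attach an exponential-cone solver here, e.g. SCS.Optimizer
@variable(model, x[1:n])
@variable(model, t)
@variable(model, u[1:n])
@constraint(model, sum(u) <= 1)
@constraint(model, [i = 1:n], [x[i] - t, 1, u[i]] in MOI.ExponentialCone())
@objective(model, Min, t)
```

Minimizing t then drives t down to logsumexp(x), so the optimum matches the original objective.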
I don’t know JuMP, but I would have presumed that you would want to avoid intermediate allocations for functions used repeatedly in an optimization loop.
So, for example:
```julia
function logsumexp(x)
    xmax = maximum(x)
    exp_ = v -> exp(v - xmax)
    return xmax + log(sum(exp_, x))
end
```
or
```julia
function logsumexp(x)
    xmax = maximum(x)
    return xmax + log(sum(exp(v - xmax) for v in x))
end
```
AFAICT this does not matter to JuMP, because it does not actually call the function during optimization. It constructs a kind of symbolic representation internally, which is also why only a subset of operators is supported.
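You can see this directly: calling such a function on JuMP variables returns an expression object rather than a number (the exact type name may differ across JuMP versions):

```julia
using JuMP

model = Model()
@variable(model, x[1:3])

# No arithmetic is performed here; JuMP records an expression tree.
ex = log(sum(exp.(x)))

# In recent JuMP versions `ex` is a nonlinear expression type
# (GenericNonlinearExpr), not a Float64.
println(typeof(ex))
```

Ipopt then evaluates this expression tree (and its derivatives) itself during the solve, which is why the overflow happens inside the solver's function evaluations rather than in your Julia code.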
Your example works, thank you for clarifying! I think I’m getting an evaluation error from somewhere else, since the solver still spits out warnings in my code.