Minimizing a logsumexp function with many terms (Convex.jl)

I am trying to minimize a function that depends on only a few parameters but contains a logsumexp with many terms in the sum, i.e. a function of the form

f(x) = \log \left(\sum_{i=1}^n e^{a_i \cdot x}\right)

where a_i, x \in \mathbb{R}^m. In my case, the number of terms n is very large, but the number of parameters m is small. Typically, I might have n \sim 10^5 and m \sim 10.
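For concreteness, this is the function evaluated directly (a minimal sketch; the use of LogExpFunctions.jl and the sizes here are just for illustration, with row i of A playing the role of a_i):

using LogExpFunctions  # provides a numerically stable logsumexp

n, m = 10^5, 10
A = randn(n, m)          # row i is a_i
f(x) = logsumexp(A * x)  # f(x) = log(sum_i exp(a_i . x))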

Naively putting this function into Convex.jl will result in O(n) variables and constraints being passed to the solver, which I am hoping to avoid.

Is there any way of representing this function efficiently?

Have you tried JuMP?

Could you post a minimal example? (See also Please read: make it easier to help you). It looks like one might be able to avoid ~n variables/constraints but it would be easier to figure it out with some code to work with.


Thanks for your replies. Here is a minimal example for a moderate number of terms:

using Convex, SCS
A = randn(1000, 10)
x = Variable(10)
problem = minimize(Convex.logsumexp(A*x))
solve!(problem, SCS.Optimizer)

This example has 10 variables and 1000 terms, and looking at the output of SCS shows that 1012 variables and 3002 constraints are passed to the solver:

problem:  variables n: 1012, constraints m: 3002
cones:    z: primal zero / dual free vars: 1
          l: linear vars: 1
          e: exp vars: 3000, dual exp vars: 0
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, rho_x: 1.00e-06
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-direct-amd-qdldl
          nnz(A): 13002, nnz(P): 0

I found that problems with more than 10,000 terms easily consume >20 GB of memory.

It is my understanding that JuMP would require me to formulate the problem in a solver-compatible form myself, which I am not sure how to do with fewer than O(n) variables.

I found a relevant section of the docs, though I haven't tried it myself:

https://jump.dev/JuMP.jl/stable/tutorials/conic/tips_and_tricks/#Log-sum-exp
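Roughly, the trick described there is to minimize an epigraph variable t subject to

\log \left(\sum_{i=1}^n e^{y_i}\right) \le t \iff \sum_{i=1}^n e^{y_i - t} \le 1,

with y = A x, and to enforce the right-hand side with exponential-cone constraints: introduce u_i such that (y_i - t, 1, u_i) lies in the exponential cone (i.e. u_i \ge e^{y_i - t}) and require \sum_{i=1}^n u_i \le 1.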

The following seems to work, but it still has many variables:

using JuMP, SCS

N, M = 10, 1000
A = randn(M, N)

model = Model(SCS.Optimizer)

@variable(model, x[j = 1:N])
@variable(model, y[i = 1:M])  # y = A * x
@variable(model, u[i = 1:M])  # u[i] >= exp(y[i] - t)
@variable(model, t)           # epigraph variable: t >= logsumexp(y)

@objective(model, Min, t)

@constraint(model, A * x .== y)
@constraint(model, sum(u) <= 1)
@constraint(model, [i = 1:M], [y[i] - t, 1, u[i]] in MOI.ExponentialCone())

optimize!(model)
value(t)

Thanks a lot! I just tried your JuMP code with a larger number of terms and it seems to work. While this actually creates more variables and constraints than the Convex.jl version, the memory requirement is much, much lower (50,000 terms seem to use ~1 GB). So maybe a clever reformulation of the problem is not needed after all.

Maybe the excess memory requirement is an issue with Convex.jl (possibly related to this)?


More generally, Convex.jl is not actively maintained, whereas the JuMP developers are extremely responsive, so betting on the latter is a good call for current projects.


You could try Convex#master. Convex has not been maintained much over the years but we merged a big refactor and Oscar (who does an amazing job maintaining tons of JuMP packages) also spent some time cleaning up Convex, fixing bugs and writing tests. Those changes haven’t made it to a release yet though.


The smaller JuMP formulation is:

using JuMP, SCS
N, M = 10, 1_000
A = randn(M, N)
model = Model(SCS.Optimizer)
@variable(model, x[1:N])
@variable(model, u[1:M])  # u[i] >= exp(A[i, :]' * x - t)
@variable(model, t)       # epigraph variable
@objective(model, Min, t)
@constraint(model, sum(u) <= 1)
@constraint(model, [i in 1:M], [A[i, :]' * x - t, 1, u[i]] in MOI.ExponentialCone())
optimize!(model)
value(t)

Do you think it changes anything in terms of performance?

Not really. It just avoids adding the y variables and the equality constraints A * x .== y.