Solving a piecewise linear system

The help I need here is mostly conceptual, but I suspect that either JuMP.jl or Convex.jl could do this. Any suggestions or pointers to texts/tutorials would be appreciated, I just don’t know how to reformulate this.

I need to solve a system in n+1 unknowns s_0, s_1, \dots, s_n (the first one is handled specially), described as follows.

Let x^+ = \max(0, x) as usual, and apply elementwise for vectors. Let \mathbf{s} denote the vector [s_1, \dots, s_n] (note: s_0 is not in there).

Let a > 0, b > 0, c_0 > 0, \kappa > 0 be constants, and \mathbf{c} and \mathbf{f} n-element vectors. Then the system is

a s_0 = c_0 + \kappa \langle \mathbf{s}^+ , f\rangle

and for all i = 1, \dots, n,

a s_0 + b s_i = c_i + \langle (\mathbf{s} - s_i)^+, f \rangle

where \langle \cdot, \cdot \rangle is the dot product.

I read a suggestion that such problems can be cast as LP problems, but I have never encountered this before so I don’t know how to start.

How big is n?

1 write as min 0 under some constraints

2 rewrite the nonlinear operations as linear constraints, eg max(0,x) becomes y, with constraint y >= x, y >= 0

1 Like

At first glance this looks like an MILP. You could use https://github.com/rdeits/ConditionalJuMP.jl to set it up. There might be a way to formulate it as an LP, but I’m not seeing it straight away. What @antoine-levitt proposed constitutes a relaxation of the problem, I’d think?

1 Like

If the I stands for Integer in MILP, I don’t think it is one, all variables are real. What @antoine-levitt suggested may be the best solution; I was thinking of a Gauss-Seidel scheme but this should be better. n is around 10^3 to 10^5, more is always better but I can live with less if I need to.

Ah sorry of course what I said doesn’t work because y isn’t constrained to be as small as possible (I was thinking of a case where you minimize a max).

Edit : maybe minimize y then? But there must be a cleaner way

IIUC, @antoine-levitt 's solution would work if you use the Simplex method. Proof: We have n+1 original variables, and the same number of equality constraints, and we will introduce n^2 new variables and 2n^2 new inequality constraints. Since a basic feasible solution will have at least as many constraints active as the number of variables, all equality constraints will be active naturally and at least n^2 of the inequality constraints will be active. The latter means that the max relationship will hold for each variable.

The main problem here is that if n is large, you will have an LP with many constraints and variables, so it may be pretty hard to solve. 10^10 variables and constraints is a pretty large LP but maybe CPLEX can do it.

I tried setting up the relaxation proposed by @antoine-levitt and @mohamed82008 in JuMP and solving it using a simplex method, see below. But the relaxation really doesn’t appear to be tight.

using JuMP
using Random
using LinearAlgebra

# problem data
Random.seed!(1)
n = 10
a = rand()
b = rand()
κ = rand()
c₀ = rand()
c = randn(n)
f = randn(n)

using Gurobi
model = Model(solver=GurobiSolver(Method=0 #=Primal simplex=#))
# using Clp
# model = Model(solver=ClpSolver(SolveType=1 #=Primal simplex=#))
@variable model s₀
@variable model s[1 : n]
@variable model s⁺[1 : n] >= 0
@constraint model s⁺ .>= s
@variable model δ[1 : n, 1 : n] >= 0 # (s - s[i])⁺
for i = 1 : n
    @constraint model δ[:, i] .>= s - s[i]
end
@constraint model a * s₀ == c₀ + κ * s⁺ ⋅ f
for i = 1 : n
    @constraint model a * s₀ + b * s[i] == c[i] + δ[:, i] ⋅ f
end

solve(model)

using Test
@test getvalue.(s⁺) ≈ max.(getvalue.(s), 0) atol=1e-6 # passes
for i = 1 : n
    @test getvalue.(δ[:, i]) ≈ max.(getvalue.(s) .- getvalue(s[i]), 0) atol=1e-6 # fails
end

Am I doing something wrong?

Yes, integer. All variables in the original nonlinear program are real, but I was proposing to reformulate it as an MILP, with integer variables used to indicate which piece of each piecewise function you’re on.

Yes, my “solution” was very wrong. I think if you optimize over the sum of the extra variables it will work, but that looks like a hack, there should be a better way.

From what I understand (but I could be wrong) LP solvers are usually very well optimized, and it takes some effort to beat them, even if the LP reformulation is slightly suboptimal.

Adding

@objective model Min sum(s⁺) + sum(δ)

does not make the tests pass though.

Actually, https://github.com/joehuchette/PiecewiseLinearOpt.jl is probably a better fit.

Just to clarify, you are subtracting element from the vector in (\mathbf{s} − s_i). Did you mean elementwise difference? I.e. s .- s_i?

Ah yes, there’s no particular reason it should. That’ll teach me to actually think before typing. I’ll shut up now :slight_smile:

The following works for small n at least, if you’re OK with adding artificial bounds on s (needed for these reformulations to work):

using JuMP
using Random
using LinearAlgebra
using PiecewiseLinearOpt

# problem data
Random.seed!(1)
n = 5
a = rand()
b = rand()
κ = rand()
c₀ = rand()
c = randn(n)
f = randn(n)

# using Gurobi
# model = Model(solver=GurobiSolver())
using Cbc
model = Model(solver=CbcSolver())
@variable model s₀
@variable model s[1 : n]
s⁺ = piecewiselinear.(Ref(model), s, Ref([-10, 0, 10]), x -> max(0, x)) # 10 is arbitrary
δ = Matrix{Variable}(undef, n, n)
for i = 1 : n
    δ[:, i] = piecewiselinear.(Ref(model), s - s[i], Ref([-10, 0, 10]), x -> max(0, x)) # 10 is arbitrary
end
@constraint model a * s₀ == c₀ + κ * s⁺ ⋅ f
for i = 1 : n
    @constraint model a * s₀ + b * s[i] == c[i] + δ[:, i] ⋅ f
end

solve(model)

Edit: this requires the master branch of PiecewiseLinearOpt on Julia 1.0.

Here is my reformulation of your problem as a linear program:

1 Like

What values should be chosen for \alpha and \beta?

Actually, after thinking a bit more about this, I don’t think my recasting is completely equivalent.

Yes. Thanks for the clarification.

This is weird. Either my understanding is off, not sure how, or the solver is doing something funny behind the scene.

I have figured it out: it is trivial to show (eg proof by contradiction) that c_i < c_j \Rightarrow s_i < s_j, so we have the ordering and can just write the linear system because we know which s_k - s_j is positive, and solve the whole system given s_0, so it becomes a rootfinding problem in a single variable.

Which is great, because I need to solve it millions of times :wink:

Thanks for all the suggestions, I learned a lot from them.

3 Likes