Solving a piecewise linear system

#1

The help I need here is mostly conceptual, but I suspect that either JuMP.jl or Convex.jl could do this. Any suggestions or pointers to texts/tutorials would be appreciated, I just don’t know how to reformulate this.

I need to solve a system in n+1 unknowns s_0, s_1, \dots, s_n (the first one is handled specially), described as follows.

Let x^+ = \max(0, x) as usual, applied elementwise to vectors. Let \mathbf{s} denote the vector [s_1, \dots, s_n] (note: s_0 is not in there).

Let a > 0, b > 0, c_0 > 0, \kappa > 0 be constants, and \mathbf{c} and \mathbf{f} n-element vectors. Then the system is

a s_0 = c_0 + \kappa \langle \mathbf{s}^+ , \mathbf{f} \rangle

and for all i = 1, \dots, n,

a s_0 + b s_i = c_i + \langle (\mathbf{s} - s_i)^+, \mathbf{f} \rangle

where \langle \cdot, \cdot \rangle is the dot product.
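Just to pin down the notation, here is a sketch of how I would evaluate the residuals of this system in Julia (a solution is any (s_0, \mathbf{s}) that makes all residuals vanish):

# residuals of the system above; pos applies x⁺ elementwise
function residuals(s0, s, a, b, κ, c₀, c, f)
    pos(x) = max.(x, 0)
    r0 = a * s0 - c₀ - κ * sum(pos(s) .* f)
    r = [a * s0 + b * s[i] - c[i] - sum(pos(s .- s[i]) .* f) for i in eachindex(s)]
    return r0, r
end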

I read a suggestion that such problems can be cast as LP problems, but I have never encountered this before so I don’t know how to start.


#2

How big is n?


#3

1. write it as "min 0" under some constraints

2. rewrite the nonlinear operations as linear constraints, e.g. max(0, x) becomes y, with constraints y >= x, y >= 0 (see the example below)
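For example, introducing y_i \ge s_i, y_i \ge 0, the first equation turns into the linear constraint a s_0 = c_0 + \kappa \langle \mathbf{y}, \mathbf{f} \rangle, and similarly for the others with one extra variable per \max term.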


#4

At first glance this looks like an MILP. You could use https://github.com/rdeits/ConditionalJuMP.jl to set it up. There might be a way to formulate it as an LP, but I’m not seeing it straight away. What @antoine-levitt proposed constitutes a relaxation of the problem, I’d think?


#5

If the I stands for Integer in MILP, I don't think it is one: all variables are real. What @antoine-levitt suggested may be the best solution; I was thinking of a Gauss-Seidel scheme, but this should be better. n is around 10^3 to 10^5; more is always better, but I can live with less if I need to.


#6

Ah sorry, of course what I said doesn't work, because y isn't constrained to be as small as possible (I was thinking of a case where you minimize a max).

Edit: maybe minimize y then? But there must be a cleaner way.


#7

IIUC, @antoine-levitt's solution would work if you use the simplex method. Proof sketch: we have n+1 original variables and the same number of equality constraints, and the reformulation introduces roughly n^2 new variables and 2n^2 new inequality constraints. Since a basic feasible solution has at least as many active constraints as variables, all equality constraints will be active naturally, and at least n^2 of the inequality constraints will be active. The latter means the max relationship will hold for each variable.
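Counting explicitly (matching the setup in #3): one new variable for each \max(0, s_i) (n of them) and one for each (s_j - s_i)^+ (n^2 of them), each with two inequalities, so about n^2 + n new variables and 2n^2 + 2n new inequalities on top of the n+1 equalities.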

The main problem here is that if n is large, you will have an LP with many constraints and variables, so it may be pretty hard to solve. With n = 10^5, that is on the order of 10^10 variables and constraints, which is a very large LP, but maybe CPLEX can do it.


#8

I tried setting up the relaxation proposed by @antoine-levitt and @mohamed82008 in JuMP and solving it using a simplex method, see below. But the relaxation really doesn’t appear to be tight.

using JuMP
using Random
using LinearAlgebra

# problem data
Random.seed!(1)
n = 10
a = rand()
b = rand()
κ = rand()
c₀ = rand()
c = randn(n)
f = randn(n)

using Gurobi
model = Model(solver=GurobiSolver(Method=0 #=Primal simplex=#))
# using Clp
# model = Model(solver=ClpSolver(SolveType=1 #=Primal simplex=#))
@variable model s₀
@variable model s[1 : n]
@variable model s⁺[1 : n] >= 0
@constraint model s⁺ .>= s
@variable model δ[1 : n, 1 : n] >= 0 # (s - s[i])⁺
for i = 1 : n
    @constraint model δ[:, i] .>= s - s[i]
end
# the s₀ equation
@constraint model a * s₀ == c₀ + κ * s⁺ ⋅ f
# the equations for s₁, …, sₙ
for i = 1 : n
    @constraint model a * s₀ + b * s[i] == c[i] + δ[:, i] ⋅ f
end

solve(model)

using Test
@test getvalue.(s⁺) ≈ max.(getvalue.(s), 0) atol=1e-6 # passes
for i = 1 : n
    @test getvalue.(δ[:, i]) ≈ max.(getvalue.(s) .- getvalue(s[i]), 0) atol=1e-6 # fails
end

Am I doing something wrong?

And to answer the earlier question: yes, the I stands for integer. All variables in the original nonlinear program are real, but I was proposing to reformulate it as an MILP, with integer variables used to indicate which piece of each piecewise function you're on.
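For a single term y = \max(0, x), one standard big-M encoding (assuming a known bound M \ge |x|) introduces a binary variable z and the constraints

y \ge x, \quad y \ge 0, \quad y \le x + M(1 - z), \quad y \le M z

so that z = 1 forces y = x (feasible only when x \ge 0) and z = 0 forces y = 0 (feasible only when x \le 0).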


#9

Yes, my “solution” was very wrong. I think it would work if you minimize the sum of the extra variables, but that looks like a hack; there should be a better way.

From what I understand (but I could be wrong), LP solvers are usually very well optimized, and it takes some effort to beat them, even if the LP reformulation is slightly suboptimal.


#10

Adding

@objective model Min sum(s⁺) + sum(δ)

does not make the tests pass though.


#11

Actually, https://github.com/joehuchette/PiecewiseLinearOpt.jl is probably a better fit.


#12

Just to clarify: in (\mathbf{s} − s_i) you are subtracting an element from the vector. Did you mean the elementwise difference, i.e. s .- s_i?


#13

Ah yes, there’s no particular reason it should. That’ll teach me to actually think before typing. I’ll shut up now :slight_smile:


#14

The following works for small n at least, if you’re OK with adding artificial bounds on s (needed for these reformulations to work):

using JuMP
using Random
using LinearAlgebra
using PiecewiseLinearOpt

# problem data
Random.seed!(1)
n = 5
a = rand()
b = rand()
κ = rand()
c₀ = rand()
c = randn(n)
f = randn(n)

# using Gurobi
# model = Model(solver=GurobiSolver())
using Cbc
model = Model(solver=CbcSolver())
@variable model s₀
@variable model s[1 : n]
s⁺ = piecewiselinear.(Ref(model), s, Ref([-10, 0, 10]), x -> max(0, x)) # 10 is arbitrary
δ = Matrix{Variable}(undef, n, n)
for i = 1 : n
    δ[:, i] = piecewiselinear.(Ref(model), s - s[i], Ref([-10, 0, 10]), x -> max(0, x)) # 10 is arbitrary
end
@constraint model a * s₀ == c₀ + κ * s⁺ ⋅ f
for i = 1 : n
    @constraint model a * s₀ + b * s[i] == c[i] + δ[:, i] ⋅ f
end

solve(model)

Edit: this requires the master branch of PiecewiseLinearOpt on Julia 1.0.


#15

Here is my reformulation of your problem as a linear program:


#16

What values should be chosen for \alpha and \beta?


#17

Actually, after thinking a bit more about this, I don’t think my recasting is completely equivalent.


#18

Yes. Thanks for the clarification.


#20

This is weird. Either my understanding is off (not sure how), or the solver is doing something funny behind the scenes.


#21

I have figured it out: it is trivial to show (e.g. by contradiction) that c_i < c_j \Rightarrow s_i < s_j. So the ordering of the s_i is known in advance, which tells us exactly which differences s_k - s_j are positive; given s_0, the remaining system is therefore linear and can be solved directly, and the whole thing reduces to a rootfinding problem in the single variable s_0.

Which is great, because I need to solve it millions of times :wink:
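For reference, here is a minimal sketch of that scheme. It assumes the c_i are distinct (so the ordering is strict) and the entries of \mathbf{f} are nonnegative (which the ordering argument relies on, and which also keeps the denominators below positive); Roots.jl handles the univariate solve.

# problem data (same shape as in the earlier posts; f nonnegative per the assumption above)
using Random
using Roots  # find_zero
Random.seed!(1)
n = 10
a, b, κ, c₀ = rand(), rand(), rand(), rand()
c = randn(n)
f = rand(n)

# Given a trial s₀, the known ordering makes the remaining system triangular:
# solve for the s_i from the largest down.
function solve_s(s0, a, b, c, f)
    p = sortperm(c)            # c[p[1]] ≤ … ≤ c[p[end]] implies the same ordering for s
    s = similar(c)
    tail_f = 0.0               # Σ f_j over the already-solved (larger) indices
    tail_sf = 0.0              # Σ s_j f_j over the same indices
    for k = length(c):-1:1
        i = p[k]
        s[i] = (c[i] - a * s0 + tail_sf) / (b + tail_f)
        tail_f += f[i]
        tail_sf += s[i] * f[i]
    end
    return s
end

# residual of the s₀ equation: a s₀ - c₀ - κ ⟨s⁺, f⟩
function residual(s0, a, b, κ, c₀, c, f)
    s = solve_s(s0, a, b, c, f)
    return a * s0 - c₀ - κ * sum(max(sᵢ, 0) * fᵢ for (sᵢ, fᵢ) in zip(s, f))
end

s₀ = find_zero(s0 -> residual(s0, a, b, κ, c₀, c, f), 0.0)
s = solve_s(s₀, a, b, c, f)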

Thanks for all the suggestions, I learned a lot from them.