Transitioning an LP to a stochastic version with InfiniteOpt.jl (+ feeding in samples?)

Hello! I have been looking into transitioning some of my deterministic ILP models to stochastic versions.
I started investigating InfiniteOpt.jl and made a simple stochastic flavor of the JuMP LP example.

I simply modeled one parameter of the constraint as a Normal distribution around the initial value, i.e. ξ ~ Normal(6,1) instead of the scalar ξ = 6. I was expecting the result to be the same, since the expectation is E(ξ) = 6 and, for this problem, I think the Value of the Stochastic Solution (VSS) should be zero. However, InfiniteOpt.jl returns somewhat random results.
Do I need to somehow increase the granularity of the transcription from the infinite problem description to the finite version? If yes, how can I do it? I tried playing around with the num_supports argument but didn't have any luck (a sketch of what I tried is below).
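
For reference, this is roughly what I tried with num_supports (same model as the modelinf function further down, just explicitly asking for more support points; the exact number varied):

using InfiniteOpt, Distributions, HiGHS

model = InfiniteModel(HiGHS.Optimizer)
@variable(model, x >= 0)
@variable(model, 0 <= y <= 3)
# request more supports (sample points) for the transcription of ξ
@infinite_parameter(model, ξ ~ Normal(6, 1), num_supports = 1000)
@constraint(model, c1, ξ * x + 8y >= 100)
@constraint(model, c2, 7x + 12y >= 120)
@objective(model, Min, 12x + 20y)
optimize!(model)

but the results were just as variable.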

MWE:
The deterministic initial LP program:

using JuMP, Graphs, InfiniteOpt, Distributions, HiGHS

# Deterministic problem according to https://jump.dev/JuMP.jl/stable/tutorials/getting_started/getting_started_with_JuMP/#An-example
function modeldet()
	model = Model(HiGHS.Optimizer)

	@variable(model, x >= 0)
	@variable(model, 0 <= y <= 3)
	
	ξ = 6
	@constraint(model, c1, ξ * x + 8y >= 100)
	@constraint(model, c2, 7x + 12y >= 120)

	@objective(model, Min, 12x + 20y)

	optimize!(model)

	return model
end

and the InfiniteOpt.jl version:

# Introduce some uncertainty in the parameters using probability density functions
function modelinf()
	model = InfiniteModel(HiGHS.Optimizer)

	@variable(model, x >= 0)
	@variable(model, 0 <= y <= 3)

	@infinite_parameter(model, ξ ~ Normal(6,1))

	@constraint(model, c1, ξ * x + 8y >= 100)
	@constraint(model, c2, 7x + 12y >= 120)

	@objective(model, Min, 12x + 20y)

	optimize!(model)

	return model
end

Plotting the two problem constraints gives a clear intuition of where the optimal decision variables should be. The green constraint is stochastic and parametrized by ξ.

Makie code for plotting
using GLMakie 
function plotprob()
	f = Figure()
	a = Axis(f[1,1])

	for ξ in rand(Normal(6,1), 50)
		plot!(a, -100:100, x -> (100 - ξ * x)/8, color = :green, linewidth=1)
	end

	plot!(a, -100:100, x -> (120 - 7x)/12, color = :blue, linewidth=5)

	# objective value
	# for ob in 0:50:500
	# 	plot!(a, 0:30, x -> (ob - 12x)/20, color = :black)
	# end

	return f
end

The deterministic JuMP program finds the optimum solution:

julia> md = modeldet();
Running HiGHS 1.5.3 [date: 1970-01-01, git hash: 45a127b78]
Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
2 rows, 2 cols, 4 nonzeros
2 rows, 2 cols, 4 nonzeros
Presolve : Reductions: rows 2(-0); columns 2(-0); elements 4(-0) - Not reduced
Problem not reduced by presolve: solving the LP
Using EKK dual simplex solver - serial
  Iteration        Objective     Infeasibilities num(sum)
          0     0.0000000000e+00 Pr: 2(220) 0s
          2     2.0500000000e+02 Pr: 0(0) 0s
Model   status      : Optimal
Simplex   iterations: 2
Objective value     :  2.0500000000e+02
HiGHS run time      :          0.00

julia> value(md[:x])
15.000000000000005

But the InfiniteOpt.jl version struggles to find the same:

julia> mdi = modelinf();
Running HiGHS 1.5.3 [date: 1970-01-01, git hash: 45a127b78]
Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
2 rows, 2 cols, 4 nonzeros
2 rows, 2 cols, 4 nonzeros
Presolve : Reductions: rows 2(-9); columns 2(-0); elements 4(-18)
Solving the presolved LP
Using EKK dual simplex solver - serial
  Iteration        Objective     Infeasibilities num(sum)
          0     0.0000000000e+00 Pr: 2(200) 0s
          2     2.7496874512e+02 Pr: 0(0) 0s
Solving the original LP from the solution after postsolve
Model   status      : Optimal
Simplex   iterations: 2
Objective value     :  2.7496874512e+02
HiGHS run time      :          0.00

julia> value(mdi[:x])
17.91406209370758

I also get high variability, and the results appear to be consistently above the optimum:

julia> scatter([value(modelinf()[:x]) for _ in 1:40])

So my question is: how can I write the stochastic program in order to get a reliable solution?

Feeding in samples of a random variable instead of the distribution?

A bit off-topic, but I was also wondering whether I can feed samples into the model instead of a well-defined distribution. The idea is that I am estimating the parameters using Bayesian MCMC modeling, so I end up with a chain of values that describes the parameter.
Of course, I could try to condense the chain into some kind of discrete distribution, but going from finite samples to an infinite PDF, just to go back to a finite problem (through the InfiniteOpt.jl transcription, i.e. finite samples again), feels like it might be a waste of accuracy (?). A sketch of what I have in mind is below.
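
For instance, I was wondering if I could pass the chain values directly as the support points, something like the sketch below (completely untested, and I'm not sure this is the intended use of the supports keyword):

using InfiniteOpt, Distributions, HiGHS

chain = rand(Normal(6, 1), 500)  # stand-in for my MCMC samples

model = InfiniteModel(HiGHS.Optimizer)
@variable(model, x >= 0)
@variable(model, 0 <= y <= 3)
# use the sampled values themselves as supports instead of letting
# InfiniteOpt draw new samples from the distribution
@infinite_parameter(model, ξ ~ Normal(6, 1), supports = chain)
@constraint(model, c1, ξ * x + 8y >= 100)
@constraint(model, c2, 7x + 12y >= 120)
@objective(model, Min, 12x + 20y)
optimize!(model)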

So you need to step back a bit and consider how you want to handle the stochasticity.

There are two main options:

  • Do you want the optimal (x, y) that holds for all ξ in Normal(6,1)?
  • Do you want an optimal fixed decision for y, and a set of optimal solutions x(ξ) for all realizations of ξ?

I’m guessing you probably want something like the second, which is a two-stage stochastic program:
https://infiniteopt.github.io/InfiniteOpt.jl/stable/examples/Stochastic%20Optimization/farmer/
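
In InfiniteOpt.jl that would look roughly like the sketch below for your toy problem (untested: y stays a single here-and-now decision, x becomes a recourse decision x(ξ) via Infinite(ξ), and the objective takes the expectation over ξ):

using InfiniteOpt, Distributions, HiGHS

model = InfiniteModel(HiGHS.Optimizer)
@infinite_parameter(model, ξ ~ Normal(6, 1), num_supports = 100)
@variable(model, 0 <= y <= 3)            # first-stage decision, fixed across scenarios
@variable(model, x >= 0, Infinite(ξ))    # second-stage (recourse) decision x(ξ)
@constraint(model, c1, ξ * x + 8y >= 100)
@constraint(model, c2, 7x + 12y >= 120)
@objective(model, Min, expect(12x + 20y, ξ))
optimize!(model)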

If you want to feed in samples, it'd be easier to just write your own:

using JuMP, HiGHS

function main(Ω::Vector)
	N = length(Ω)
	model = Model(HiGHS.Optimizer)
	@variable(model, x[1:N] >= 0)
	@variable(model, 0 <= y <= 3)
	@constraint(model, [i=1:N], Ω[i] * x[i] + 8 y >= 100)
	@constraint(model, [i=1:N], 7x[i] + 12y >= 120)
	@objective(model, Min, 1 / N * sum(12x[i] + 20y for i in 1:N))
	optimize!(model)
	return value.(x), value(y)
end

x, y = main([-0.5, 0.0, 0.2, 1.2])

Thanks for your comment!

Do you want the optimal (x, y) that holds for all ξ in Normal(6,1)?

This would just collapse to the single strictest constraint, so it is not really what I want.
If the stochasticity were in the objective function, this would be easier.

Do you want an optimal fixed decision for y, and a set of optimal solutions x(ξ) for all realizations of ξ?

If we go this way, I would prefer all the decision variables to be independent per sample, i.e. having both x(ξ) and y(ξ).

With the two-stage stochastic program you mention, I now have a distribution of solutions (x, y), which leaves the decision making to me. I was hoping that a stochastic program could relieve me of this burden by directly providing a statistically optimal decision.

This seems to be the case with InfiniteOpt.jl, as in the example:

https://infiniteopt.github.io/InfiniteOpt.jl/stable/examples/Stochastic%20Optimization/farmer/

But going over the code, I cannot find what could be the problem with my formulation in modelinf. The only difference I find is that I don’t have the expectation on the objective function, but that’s because the parameter ξ doesn’t influence it in my case.

by directly providing a statistically optimal decision

I think this is the source of the confusion. There is no single “optimal” decision once you have uncertainty.

If you want a single (x, y) that is feasible under all scenarios, then the problem is infeasible, because there is no such value (consider that ξ = -1 is a valid realization of the uncertainty, so the constraint c1 cannot hold in that case).

Perhaps to take a step back:

  • What is the problem you are trying to solve?
  • What is the desired output (ignore how you should compute it)?
  • What is the uncertainty and why is it important?

Admittedly, I was just playing around with a simplistic example in order to understand it and later adapt it to my application.

A simple version of my problem is finding the optimal network flow in a graph.
Some explanation below…
Given

  • a graph G(V,E) with vertices V and edges E (gr in the code below)
  • a demand matrix D (how much demand should be deployed between a node pair; dd)
  • a capacity C_e per graph edge (how much deployed demand an edge can tolerate; caps)
  • weights or costs W_e for using an edge, for e \in E (whs)

Find the optimal routing.

Following a link-flow formulation, it's pretty easy to solve this, assuming some candidate paths P_d for each demand d \in D and the corresponding path costs W_{d,p}.
The objective function, with \alpha and \beta being the coefficients:

\text{min } ~ \alpha \sum_{d \in D}\sum_{p \in P_d} W_{d,p} \, d \, r_{d,p} + \beta \sum_{d \in D}\Big(1-\sum_{p \in P_d} r_{d,p}\Big) d

given the constraints:

r_{d,p} \text{ is binary} \\ \sum_{d \in D} \sum_{p \in P_d} r_{d,p} \, d \, \delta_{e,p} \leq C_e ~~ \forall ~ e \in E \\ \sum_{p \in P_d} r_{d,p} \leq 1 ~~ \forall ~ d \in D

The \delta_{e,p} is 1 if e \in p and 0 otherwise.
(I was a bit hasty writing it down, so please forgive any inconsistencies)

The implementation can be as follows:

using Graphs, HiGHS, DataStructures, Random, JuMP
import LinearAlgebra:Symmetric


function netflowilp(gr::AbstractGraph, demdict, caps, allpaths, allpathcosts; costcoeff=1 , blockcoeff=100)
    model = Model(HiGHS.Optimizer)

    vecdemdict = collect(demdict)
    vecedges = collect(edges(gr))

	# decision variable: each demand in `vecdemdict` is routed in a path in `allpaths`
	@variable(model, 𝑟[d = eachindex(vecdemdict), p=eachindex(allpaths[d])], Bin)

    # do not surpass link capacity
    @constraint(model, c1[e in eachindex(vecedges)], sum(𝑟[d,p] * vecdemdict[d][2] for d in eachindex(vecdemdict) for p in eachindex(allpaths[d]) 
                                                         if vecedges[e] in edgeify(allpaths[d][p])) ≤ caps[e])

	# allow at most one path per demand (blocking demand allowed)
	@constraint(model, c2[d in eachindex(vecdemdict)], sum(𝑟[d, p] for p in eachindex(allpaths[d])) ≤ 1)

	# Charge cost per used edge proportional to capacity and edge weight
	# Charge blocking demand punishment
    @expressions(model, begin
                     edgecosts, sum(𝑟[d,p] * vecdemdict[d][2] * allpathcosts[d][p] for d in eachindex(vecdemdict) for p in eachindex(allpaths[d]))
                     blockeddemands, sum((1 - sum(𝑟[d,p] for p in eachindex(allpaths[d]))) * vecdemdict[d][2] for d in eachindex(vecdemdict) )
                 end)

    exs = [edgecosts, blockeddemands]
    coeffs = [costcoeff, blockcoeff]
    @expression(model, Φ, coeffs' * exs)

    @objective(model, Min, Φ);

    optimize!(model)

	return model
end


# helper functions
edgeify(path) = map(x -> Edge(minimum(x), maximum(x)) , zip(path, path[2:end]));
randdems(gr::AbstractGraph, rng=MersenneTwister(0)) = OrderedDict( ed => abs(5 + randn(rng)) for ed in edges(gr) )
randcaps(gr::AbstractGraph, rng=MersenneTwister(0)) = abs.(20 .+ 2 .* randn(rng, ne(gr)))
randweights(gr::AbstractGraph, rng=MersenneTwister(0)) = Symmetric(rand(rng, nv(gr), nv(gr)))


function candidatepaths(gr::AbstractGraph, dems::OrderedDict, whs=ones(nv(gr), nv(gr)), K=5)
	allyens = [yen_k_shortest_paths(gr, src(ed), dst(ed), whs, K) for ed in keys(dems)]
    return getfield.(allyens, :paths), getfield.(allyens, :dists)
end


function runilp()
	gr = barabasi_albert(6, 2; seed=42)
	dd = randdems(gr)
	caps = randcaps(gr)
	whs = randweights(gr)
	candpaths, candpathscosts = candidatepaths(gr, dd, whs)
	netflowilp(gr, dd, caps, candpaths, candpathscosts)
end

Run with

julia> runilp()

Now let’s say that the capacity of the edges C is uncertain, and I can describe it with a PDF.
I would still like to make a decision about the routing strategy r_{d,p} that is most likely to minimize my objective function. Since there is already separate machinery producing the PDF estimate, I simply would like to use stochastic programming to make a single decision (so not really a multi-stage problem). If the estimates change over time, I would just rerun the stochastic program with the updated parameters.

I think this is really similar to the example I pointed to in my post, since the stochasticity is present only in the constraints, so feel free to advise me with respect to that one if you find it simpler!

Now let’s say that the capacity of the edges C is uncertain

So should your solution be feasible with respect to all possible realizations of capacity? What happens if you pick a solution, then realize the capacity and the solution is now infeasible?

It seems like you might want to model this as a “chance constraint” (Google if you haven’t seen these before), where you choose to satisfy the constraint probabilistically, say 95% of the time.
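
On your toy LP, a sample-based version of a chance constraint would look something like the sketch below (illustrative only: z, M, and the 95% level are choices you'd have to make, with M a sufficiently large big-M constant):

using JuMP, HiGHS

function main_chance(samples::Vector; prob = 0.95, M = 1_000)
	N = length(samples)
	model = Model(HiGHS.Optimizer)
	@variable(model, x >= 0)
	@variable(model, 0 <= y <= 3)
	# z[i] = 1 means scenario i is allowed to violate the uncertain constraint
	@variable(model, z[1:N], Bin)
	@constraint(model, [i = 1:N], samples[i] * x + 8y >= 100 - M * z[i])
	# at most a fraction (1 - prob) of the scenarios may be violated
	@constraint(model, sum(z) <= floor(Int, (1 - prob) * N))
	@constraint(model, 7x + 12y >= 120)
	@objective(model, Min, 12x + 20y)
	optimize!(model)
	return value(x), value(y)
end

x, y = main_chance(rand(100) .+ 5.5)  # hypothetical samples around 6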

If you search for “maximum flow” and “stochastic arc capacity,” I assume there are a number of related papers. How is this usually modeled and solved?

So should your solution be feasible with respect to all possible realizations of capacity?

No, because then it would collapse down to the single strictest capacity constraint.

What happens if you pick a solution, then realize the capacity and the solution is now infeasible?

Some demands will be blocked during realization in the later stage.

It seems like you might want to model this as a “chance constraint”

That sounds about right. Thanks for the direction. Before I jump into this, is it something that can be set up by a user in JuMP?

This reminded me a bit of "soft constraints", where we introduce an extra variable in the constraint, e.g. s, such that

\sum_{d \in D} \sum_{p \in P_d} r_{d,p} \, d \, \delta_{e,p} \leq C_{e,i} + s_e ~~ \forall ~ e \in E, ~ \forall ~ i = 1,\dots,N

I guess if you feed in all N capacity samples C_{e,i} and also minimize the s_e, you end up with a single routing configuration r_{d,p} that minimizes the violation of the capacity constraints.
That looks okay, but it was my impression that stochastic programming could provide some out-of-the-box tooling for helping with such cases.

How is this usually modeled and solved?

There are probably many heuristics and approximation algorithms for these, but since my problem domain will be more complicated, I doubt they will be applicable. That's why I was searching for a more general methodology. That said, of course it's certainly useful to know.

but it was my impression that stochastic programming could provide some out-of-the-box tooling for helping with such cases.

Not in the case where the uncertainty impacts feasibility. Google “relatively complete recourse”

This reminded me a bit of "soft constraints", where we introduce an extra variable in the constraint

Ahhhhhh. Perhaps you want to minimize the expected violation of the constraints? (Or perhaps some other risk measure of the violation.) This is a reasonable approach.

It’d look something like:

using JuMP, HiGHS

function main(samples::Vector)
    N = length(samples)
	model = Model(HiGHS.Optimizer)
	@variable(model, x >= 0)
	@variable(model, 0 <= y <= 3)
	@variable(model, violation[1:N] >= 0)
	@constraint(model, [i = 1:N], samples[i] * x + 8y + violation[i] >= 100)
	@constraint(model, 7x + 12y >= 120)
    weight = 0.5  # Move between 0 and 1 to trade-off objective vs violation
	@objective(
        model, 
        Min, 
        weight * (12x + 20y) + (1 - weight) * (1 / N * sum(violation)),
    )
	optimize!(model)
	return value(x), value(y)
end
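
For example, with some hypothetical samples drawn around the nominal value of 6:

using Distributions
x, y = main(rand(Normal(6, 1), 100))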

You could also solve the problem as a bi-objective optimization problem, where you trade off the minimum cost and the feasibility violation:

using JuMP
import HiGHS
import MultiObjectiveAlgorithms as MOA

function main(samples::Vector)
    N = length(samples)
	model = Model(() -> MOA.Optimizer(HiGHS.Optimizer))
    set_silent(model)
    set_attribute(model, MOA.Algorithm(), MOA.Dichotomy())
    set_attribute(model, MOA.SolutionLimit(), 10)
	@variable(model, x >= 0)
	@variable(model, 0 <= y <= 3)
	@variable(model, violation[1:N] >= 0)
	@constraint(model, [i = 1:N], samples[i] * x + 8y + violation[i] >= 100)
	@constraint(model, 7x + 12y >= 120)
	@objective(model, Min, [12x + 20y, 1 / N * sum(violation)])
	optimize!(model)
    return [
        (value(x; result = i), value(y; result = i)) for
        i in 1:result_count(model)
    ]
end

samples = sort(rand(100))
solution = main(samples)

See the Multi-objective knapsack tutorial in the JuMP documentation for more details.
