How to represent delay between actions in SDDP?

Hi all. I am interested in building an SDDP.jl model where I want to incorporate some sort of minimum delay between consecutive actions.

If I was doing a standard MILP, I would say something like, I have a decision variable x_t \geq 0 for a series of some time points, you can think of it as the quantity of a product ordered at time t. Let’s say I want to enforce a minimum delay l between 2 placements of orders, then I could introduce binary variables y_t and two constraints to enforce this:

  • x_t \leq M * y_t \quad \forall t
  • \sum_{i=t}^{t+l-1} y_{t} \leq 1 \quad \forall t

I am having difficulty to think how to represent this in SDDP. Can someone help out?

As a further inquiry, and at risk of splitting the topic into two, is there some “library” of common (I assume) modeling motifs like this that are used across SDDP (or these sequential decision problems in general)?

To pass information between stages, you must add the information as a state variable.

See Access variables from a previous stage · SDDP.jl

You could do something like:

SDDP.LinearPolicyGraph() do sp, t
    l = 3
    @variable(sp, y_state[1:l], SDDP.State, Bin, initial_value = 0)
    @constraint(sp, [i in 2:l], y_state[i].out == y_state[i-1].in)
    @constraint(sp, sum(y_state[i].out for i in 1:l) <= 1)
    y_t = y_state[1].out
end

is there some “library” of common (I assume) modeling motifs like this that are used across SDDP (or these sequential decision problems in general)?

Nope. I’ve put some effort into the documentation at https://sddp.dev, but it leaves a lot to be desired.

1 Like

Thanks @odow!

Regarding documentation, I am wondering if we could work on a section somewhere that discusses how to talk about translation of these types of constraints from the typical deterministic formulation to the stagewise state variable formulation in a more generic way that might help ward off at least some % of questions like mine. For example in my case, leading a reader to see that the second constraint requires keeping a window of l orders for each time indexed node t, the elements of which (y_state[i]) giving whether or not an order was placed today, yesterday, etc, with the sum constraint obviously enforcing only 1 order/period. And that the “diagonal” (if drawn out horizontally) constraints between time nodes (your second constraint) moves information on orders between days (i.e. today’s order info becomes tomorrow’s info on yesterday, etc), which finally drops out after l days. And the implicit (added by SDDP) constraints that the out from t becomes the in for t+1 transfer the information across nodes.

Do you think some extension to the Access variables from a previous stage · SDDP.jl page with more description of this sort, perhaps with diagrams would be useful?

1 Like

Suggestions for improvements to the docs are always helpful. Feel free to open a PR.

p.s., I think you’re also massively overestimating how many people use SDDP.jl :smile: It’s somewhere in the 10^1 range, and definitely <<10^2. Also note that even if you model your problem like this, the policy we find might be quite suboptimal. There’s no guarantee we can find an optimal policy, or even a good one.

1 Like