How to solve an infinite horizon problem under a Markovian policy graph in SDDP.jl?

I am dealing with a multi-stage mixed-integer stochastic global supply chain optimization problem and want to solve infinite-horizon problem instances under a Markovian policy graph. I am using the Markovian policy graph because the constraint matrix contains uncertain parameters (in a linear programming model).

Can the infinite horizon problem instance be solved under a Markovian Policy graph using SDDP.jl?

Can the infinite horizon problem instance be solved under a Markovian Policy graph using SDDP.jl?

Yes.

See Create a general policy graph · SDDP.jl for instructions on how to create an arbitrary graph. I would start with an SDDP.MarkovianGraph and then add the cyclic edges.
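For example, a minimal sketch of that idea, assuming a two-state Markov chain, a 12-stage cycle, and a 0.9 discount on the cyclic edges (all placeholder values, not from the original question):

import SDDP

T = 12
M = [0.2 0.8; 0.8 0.2]  # hypothetical Markov transition matrix
# Build the acyclic Markovian part; the first matrix is the root transition.
graph = SDDP.MarkovianGraph([t == 1 ? [0.5 0.5] : M for t in 1:T])
# Add cyclic edges from the last stage back to the first. The 0.9 factor is an
# assumed discount so that each pass around the cycle terminates with
# probability 0.1.
for i in 1:2, j in 1:2
    SDDP.add_edge(graph, (T, i) => (1, j), 0.9 * M[i, j])
end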

Here’s the code for the model in https://doi.org/10.1002/net.21932:

Got it. May I ask: can I specify several root nodes, or do I have to specify a dummy root node to accurately implement the formulation?

Additionally, can I dynamically create and specify decision node and edge names in a for loop, for example, without having to enumerate each of them one by one?

Found the solution to my second question in the comment above: use Symbol("decision_node_name") to specify the symbols of nodes and edges dynamically. Thanks.

No. There must be only a single root node. (The circle node in a policy graph diagram.)

Additionally, can I dynamically create and specify decision node and edge names in a for loop, for example, without having to enumerate each of them one by one?

Yes, see Create a general policy graph · SDDP.jl. The graph is a regular Julia data structure. You can use for-loops etc to build it as you would any other data structure.

You might, for example, build a graph like this:

import SDDP
T = 12                      # number of stages per cycle
M = [0.2 0.8; 0.8 0.2]      # Markov transition matrix
graph = SDDP.Graph((0, 0))  # (0, 0) is the root node
# Add a node for each (stage, Markov state) pair.
for t in 1:T, i in 1:2
    SDDP.add_node(graph, (t, i))
end
# Root node transitions into the first stage.
for i in 1:2
    SDDP.add_edge(graph, (0, 0) => (1, i), 0.5)
end
# Markov transitions between consecutive stages.
for t in 2:T, i in 1:2, j in 1:2
    SDDP.add_edge(graph, (t-1, i) => (t, j), M[i, j])
end
# Cyclic edges from the final stage back to stage 1 make the graph infinite horizon.
for i in 1:2, j in 1:2
    SDDP.add_edge(graph, (T, i) => (1, j), M[i, j])
end
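Once the graph is built, you pass it to SDDP.PolicyGraph; the node argument of the subproblem builder is then the (t, i) tuple. A minimal sketch with a placeholder state variable and stage objective (assuming HiGHS as the solver):

using SDDP, HiGHS

model = SDDP.PolicyGraph(
    graph;  # the graph built above
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, node
    t, i = node  # the stage and Markov state of this node
    @variable(sp, x >= 0, SDDP.State, initial_value = 0.0)
    @stageobjective(sp, x.out)
end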

Got it, thanks.

With an infinite planning horizon and termination upon cycles, how can I access the stage number within a particular episode?

Could you clarify the question? Do you mean when simulating? See the :node_index field.

See also Create a general policy graph · SDDP.jl
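For example, assuming model is an already-trained policy graph, a sketch like this prints the node visited at each step of one simulated replication:

import SDDP

simulations = SDDP.simulate(model, 10)  # 10 replications
for (step, data) in enumerate(simulations[1])
    # data[:node_index] is the name of the node visited at this step,
    # e.g. the (t, i) tuple from the graph above.
    println("step = ", step, ", node = ", data[:node_index])
end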

I meant during training simulations: I need to access the stage number within a particular forward pass while using the general policy graph with an infinite horizon.

This is not possible. The model assumes that you do not need to know how many steps you are away from the root node. If your model really depends on the number of steps from the root node, then it is not cyclic.

Right, got it.

Does that mean a model that depends on this information cannot be solved over an infinite horizon by SDDP.jl, or by the SDDiP algorithm in general?

P.S.
There are stochastic variables in the constraint matrix, leading to the use of the Markovian and general policy graphs.

What is the problem you are trying to solve? Can you provide a reproducible example? Have you formulated the problem as a policy graph? What are the state variables? What are the control variables? What are the node-wise independent random variables?

You can have node-wise independent noise in the constraint matrix: Add noise in the constraint matrix · SDDP.jl. Just make it so that the noise depends on the node index, and not the number of steps from the root node.
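A minimal sketch along the lines of that tutorial, with made-up coefficients and HiGHS as the solver:

using SDDP, HiGHS
import JuMP

model = SDDP.LinearPolicyGraph(
    stages = 3,
    sense = :Min,
    lower_bound = -20.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, x, SDDP.State, initial_value = 0.0)
    @constraint(sp, capacity, 1.0 * x.out <= 1.0)
    # The sampled ω rewrites a coefficient in the constraint matrix. The
    # support here is the same at every node for simplicity, but it may differ
    # by node (for example, by Markov state); it just cannot depend on the
    # number of steps taken from the root node.
    SDDP.parameterize(sp, [0.2, 0.5, 1.0]) do ω
        JuMP.set_normalized_coefficient(capacity, x.out, ω)
    end
    @stageobjective(sp, -x.out)
end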

In my problem, the inventory quality deteriorates over time according to decay factors (which differ across time periods), but the physical amount of inventory does not deteriorate. The physical amount is used to calculate several costs in the objective. This means that the value of inventory from earlier time periods is discounted more than that of inventory from recent time periods.

This is where the time dependency comes from.

I will try my best to provide a reproducible example, but it will be difficult.

See Access variables from a previous stage · SDDP.jl

If you want to model quality decay, you must bucket your inventory into a small finite set of quality classes, model them in the state variable, and use constraints to model the transition of inventory between the quality buckets.

I used to have a tutorial on this: Update the perishable widgets example · Issue #591 · odow/SDDP.jl · GitHub, but I removed the perishable stuff when I wrote Example: the milk producer · SDDP.jl.
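For concreteness, here is a sketch of the bucketing idea (not the original model and not the milk-producer example): three quality classes, unsold stock ages one class per stage, and anything ageing out of the lowest class is discarded. All data is made up, and the uncertainty is omitted to keep the focus on the buckets.

using SDDP, HiGHS

Q, demand = 3, 5.0

model = SDDP.LinearPolicyGraph(
    stages = 12,
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    # One state variable per quality bucket; bucket 1 is the freshest.
    @variable(sp, inventory[q = 1:Q] >= 0, SDDP.State, initial_value = 10.0)
    @variable(sp, buy >= 0)            # fresh stock, enters bucket 1
    @variable(sp, sell[q = 1:Q] >= 0)  # stock used to meet demand
    @constraint(sp, [q = 1:Q], sell[q] <= inventory[q].in)
    @constraint(sp, sum(sell) >= demand)
    # Transition: unsold stock ages one bucket; leftovers in bucket Q are discarded.
    @constraint(sp, inventory[1].out == buy)
    @constraint(sp, [q = 2:Q], inventory[q].out == inventory[q-1].in - sell[q-1])
    # Purchase cost plus a small holding cost on end-of-stage inventory.
    @stageobjective(sp, 2.0 * buy + 0.1 * sum(inventory[q].out for q in 1:Q))
end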

But in the infinite-horizon case, there could be infinitely many quality buckets (we might approximate by aggregating the quality buckets for supplies from far back in the planning horizon).

And if we model it that way, there will be a huge increase in the number of binary variables, significantly increasing the intractability.

Is there any other way to solve the infinite horizon case using SDDiP algorithm?

If you want to use SDDP.jl, you must formulate the model as a policy graph. That means you'll need to make some approximations. You probably don't have infinitely many quality buckets: after some time the product gets chucked out, or the quality doesn't deteriorate further.

And yes, there will be more variables. But it’s not increasing the intractability. Your problem is intractable. You cannot ever hope to solve it to global optimality. You need to make some simplifying assumptions.

Is there any other way to solve the infinite horizon case using SDDiP algorithm?

Nope. You must model information from previous stages in the state variable.

Your alternative is to come up with some custom heuristic policy (not using SDDP.jl) and evaluate that. Warren Powell has a large literature on this. Do some cost-function approximation type algorithm. (Of course, the downside to CFA is that Powell can’t tell you what the right cost function is. You need to use domain knowledge. The upside to SDDP.jl is that if you can model it as a policy graph, then we can find a primal feasible policy.)

Got it.