How does SDDP provide a decision for an out-of-sample state?

Hello!

I’m new to SDDP and Julia, and I’m wondering how policy evaluation works (the math behind it) for states that lie outside the scenario lattice used to train the algorithm.

For context: I’m solving an asset-management problem (the control variable is just the size of the long or short position in the asset, which is bounded), and I’m optimizing a linear combination of the expectation of the position cost at the final stage and its CVaR.
I’m training the algorithm on a scenario lattice I constructed, which represents the cost of the asset on each day with corresponding probabilities.
Then I want to evaluate the policy on historical data, i.e., the real price path of the asset, whose prices can be out of sample (between nodes of the scenario lattice).

The note in the documentation (Theory intro: Implementation: evaluating the policy) states that it can be out-of-sample:
"The random variable can be out-of-sample, i.e., it doesn’t have to be in the vector Ω we created when defining the model! This is a notable difference to other multistage stochastic solution methods like progressive hedging or using the deterministic equivalent."

And it really does work when I implement it, but the documentation doesn’t explain how. Does it simply take the nearest node from the lattice, or can it evaluate a different policy for any continuous state? I’d like to understand the math behind it.

Could you please let me know where I can read more about this?

Hi @petr-a, welcome to the forum :smile:

Are you talking about an example like Example: the milk producer · SDDP.jl?

If so, yes, it takes the nearest node in the lattice. This is one reason why it is limited to univariate random variables for now. In theory we could extend it to more dimensions; it’s just more complicated to implement.
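To make the idea concrete, here is a minimal sketch of the nearest-neighbor lookup in plain Julia. This is illustrative only, not SDDP.jl’s actual internals, and the names (`node_prices`, `observed_price`) are made up for the example:

```julia
# Hypothetical lattice prices at the current stage, and an out-of-sample
# historical price observed when evaluating the policy.
node_prices = [95.0, 100.0, 105.0, 110.0]
observed_price = 103.2

# Pick the node whose price is closest to the observation; the policy is
# then evaluated using that node's value function.
nearest = argmin(abs.(node_prices .- observed_price))
println("Evaluate the policy at node $nearest (price $(node_prices[nearest]))")
```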

More generally:

  • We can trivially use any out-of-sample realization for the random variable within a node
  • Out-of-sample realizations for “off-chain” nodes are much harder
  • The simplest solution is to use the nearest neighbor
  • If your Markov state (e.g., price) is really continuous, then your value function will be a piecewise step function in the price dimension and piecewise linear convex in the state dimensions
  • You could imagine doing cleverer things than nearest neighbor. For example, you might want to linearly interpolate between adjacent nodes. That’s exactly what Objective states · SDDP.jl does (see the sketch after this list), but it’s much more computationally challenging to train.
  • You could potentially imagine training using the nearest neighbor approach, and then post-processing the value functions to the linear interpolation format for simulation and evaluation, but SDDP.jl doesn’t support that yet.
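Here is a hypothetical sketch of what linear interpolation between adjacent lattice nodes would look like, as opposed to snapping to the nearest one. Again, this is not SDDP.jl code; `V` stands in for the cost-to-go evaluated at each node, and all values are made up:

```julia
# Hypothetical lattice prices and cost-to-go values at the current stage.
node_prices = [95.0, 100.0, 105.0, 110.0]
V = [12.0, 10.5, 9.8, 9.1]
observed_price = 103.2

# Find the bracketing nodes and the interpolation weight λ ∈ [0, 1].
hi = searchsortedfirst(node_prices, observed_price)
lo = max(hi - 1, 1)
hi = min(hi, length(node_prices))
λ = hi == lo ? 1.0 : (observed_price - node_prices[lo]) / (node_prices[hi] - node_prices[lo])

# Linearly interpolated approximation of the cost-to-go at the observed price.
V_interp = (1 - λ) * V[lo] + λ * V[hi]
println("Interpolated cost-to-go ≈ $V_interp")
```

With nearest neighbor, the value function is a step function in the price dimension; interpolating like this makes it piecewise linear in price, which is exactly why it is harder to train.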

Hi @odow! Thank you for your response!

Yes, my task is similar to the milk producer example.
Thank you for the detailed explanation!
I’ll look into Objective states, but I think nearest neighbor is fine for me, so I’ll stick with it for now.
