I have a Markov decision process as follows:
states: S1 and S2. The increment of S1 follows a Gamma law, i.e., S1(t+d) - S1(t) ~ Gamma(alpha, beta); S1 ranges from zero up to a failure threshold FT, so the transition of S1 is stochastic.
S2 is a discrete variable with an increment step of d (d > 0), i.e., S2 = 0, d, 2d, ...; the value of S2 is linked to S1: every time S1 survives one step of length d (i.e., stays below FT), S2 increases by d.
The action takes the values 0 and 1, representing two discrete actions on the system.
The objective is to select a sequence of actions that minimizes the operating cost (equivalently, maximizes a reward).
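To make the dynamics concrete, here is a rough sketch of one way the continuous Gamma increment could be discretized so that tabular value iteration applies. This is not working solver code: the bin count, alpha, beta, FT, and the absorbing-failure encoding are all my own placeholder assumptions, and the action's effect on the increment is left unspecified.

```julia
using Distributions   # for Gamma and cdf

const FT    = 10.0   # failure threshold (placeholder value)
const alpha = 2.0    # Gamma shape (placeholder)
const beta  = 1.0    # Gamma rate (placeholder)

nbins = 50                                  # number of S1 bins
edges = range(0.0, FT; length = nbins + 1)  # bin edges on [0, FT]

# Distributions.jl parameterizes Gamma by (shape, scale), so scale = 1/rate.
inc = Gamma(alpha, 1 / beta)

# Probability that one Gamma increment moves S1 from bin i into bin j,
# with all mass at or beyond FT lumped into an absorbing "failed" state
# (index nbins + 1). The bin's lower edge is used as its representative.
function transition_probs(i)
    lo = edges[i]
    p = zeros(nbins + 1)
    for j in i:nbins
        p[j] = cdf(inc, edges[j + 1] - lo) - cdf(inc, edges[j] - lo)
    end
    p[end] = 1 - sum(p[1:nbins])   # remaining mass -> failure
    return p
end
```

Since S2 simply counts survived steps, it is deterministic given the S1 trajectory, so the tabular state might only need the S1 bin plus the elapsed step count. My understanding is that a model discretized this way could then be expressed with POMDPs.jl (e.g., via the QuickMDP constructor from QuickPOMDPs.jl) and solved with the ValueIterationSolver from DiscreteValueIteration.jl, but I have not tried this myself.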
Is there a suitable package in Julia to solve such an MDP (e.g., with value iteration)?
Sorry, I have only just started trying to solve this in Julia, so I don't have a minimal working example yet.