Tabular MDP/Q-learning solver: Reward Matrix R(s, a, s') vs. R(s, a)

Richard_Wright · September 19, 2022, 4:52am

Hello Folks,

I am using the POMDPs.jl package along with the TabularTDLearning package.

In the first part of the problem I worked out a 3D transition matrix T(s’, a, s) and a 2D reward matrix R(s, a). I then formulated the problem as a tabular MDP and solved it with the QLearningSolver from the TabularTDLearning package.

Next, suppose the reward also depends on the next state, so the R-matrix becomes 3D: R(s, a, s’). Can TabularMDP represent it? I have my doubts, because the documentation states that the R-matrix must be 2D.

How should I handle this kind of R-matrix? Please advise. Thank you.
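
For reference, one standard way to reduce a next-state-dependent reward to the 2D form is to take its expectation over next states, R(s, a) = Σ_s’ T(s’, a, s) · R(s, a, s’). The Bellman optimality equation depends on the reward only through this expectation, so the optimal Q-values are unchanged, although the individual rewards a learner observes along a trajectory will differ. The snippet below is a minimal sketch of that reduction in plain Python, not package code: the function name `expected_reward`, the nested-list layout, and the toy numbers are all assumptions made for illustration, and it says nothing about whether TabularMDP can take a 3D reward directly.

```python
def expected_reward(T, R3):
    """Collapse R3[s][a][s_next] into R2[s][a] by averaging over next states.

    T is indexed T[s_next][a][s], matching the T(s', a, s) layout in the post.
    """
    n_states = len(R3)
    n_actions = len(R3[0])
    return [
        [
            # Weight each next-state reward by its transition probability.
            sum(T[sp][a][s] * R3[s][a][sp] for sp in range(n_states))
            for a in range(n_actions)
        ]
        for s in range(n_states)
    ]


# Toy problem with 2 states and 1 action (made-up numbers).
T = [
    [[0.5, 0.0]],  # T[s'=0][a=0][s=0], T[s'=0][a=0][s=1]
    [[0.5, 1.0]],  # T[s'=1][a=0][s=0], T[s'=1][a=0][s=1]
]
R3 = [
    [[0.0, 2.0]],  # R3[s=0][a=0][s'=0], R3[s=0][a=0][s'=1]
    [[5.0, 3.0]],  # R3[s=1][a=0][s'=0], R3[s=1][a=0][s'=1]
]

R2 = expected_reward(T, R3)
print(R2)
```

The resulting 2D matrix has the same (state, action) shape as the R(s, a) used in the first part of the problem.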
