Impossible actions in POMDPs.jl

vamp · May 7, 2022, 12:23pm

Hello,

I am using POMDPs.jl with a large state space (more than 2,000 possible states). In certain states, some actions cannot be taken; that is, the set of available actions is state-dependent.

I have checked the documentation (link here). For TabularTDLearning, I assume I cannot supply a function that returns the possible actions for a given state, so I gave impossible actions a large negative reward and made the next state equal to the current state. However, because the state space is quite big, the solver doesn’t find the optimal policy.

Do you think this is the right approach?

mkitti · May 14, 2022, 10:54am

Bump, and I will ping @zsunberg for you.

vamp · May 14, 2022, 11:04am

Thanks

rejuvyesh · May 14, 2022, 5:37pm

The current implementations in TabularTDLearning don’t support action masking, and a PR would be very welcome. ReinforcementLearning.jl probably has better support there. POMDPs.jl (and other solvers in the ecosystem) does support state-dependent legal actions; you just need to define actions(mdp, s) for your problem to specify the legal action set for each state s.
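
A minimal sketch of defining actions(mdp, s), assuming a hypothetical toy problem: CorridorMDP, the masked_greedy helper, and the Q dictionary are made up for illustration and are not part of POMDPs.jl or TabularTDLearning. Only the POMDPs.actions methods are the real interface being extended.

```julia
using POMDPs

# Hypothetical toy problem: a 1-D corridor with cells 1..n.
# States are Int cell indices, actions are Symbols.
struct CorridorMDP <: MDP{Int, Symbol}
    n::Int
end

# Full action space, for code that needs to enumerate every action.
POMDPs.actions(m::CorridorMDP) = [:left, :right]

# State-dependent legal actions: no :left in the first cell,
# no :right in the last cell.
function POMDPs.actions(m::CorridorMDP, s::Int)
    legal = Symbol[]
    s > 1 && push!(legal, :left)
    s < m.n && push!(legal, :right)
    return legal
end

# Hand-rolled masking (an assumption, not a TabularTDLearning feature):
# pick the greedy action among the legal ones only.
# Q is a hypothetical Dict keyed by (state, action); missing entries count as 0.0.
function masked_greedy(m, Q, s)
    legal = actions(m, s)
    values = [get(Q, (s, a), 0.0) for a in legal]
    return legal[argmax(values)]
end

m = CorridorMDP(5)
actions(m, 1)  # [:right]
actions(m, 3)  # [:left, :right]
```

Solvers that query actions(mdp, s) would then only ever consider the legal moves. For a solver without masking, something like masked_greedy would have to be applied by hand both when selecting actions and when taking the max over next-state values in the update.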
