Using PPOPolicy with custom environment with action masking in ReinforcementLearning.jl

findmyway · October 12, 2021, 4:17pm

Could you share the code that initializes the trajectory?

It should look something like this:

julia> trajectory = MaskedPPOTrajectory(;
                   capacity = UPDATE_FREQ,
                   state = Matrix{Float32} => (ns, N_ENV),
                   action = Vector{Int} => (N_ENV,),
                   legal_actions_mask = Vector{Bool} => (na, N_ENV),
                   action_log_prob = Vector{Float32} => (N_ENV,),
                   reward = Vector{Float32} => (N_ENV,),
                   terminal = Vector{Bool} => (N_ENV,),
               )
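To make the `(na, N_ENV)` shape of `legal_actions_mask` concrete, here is a minimal plain-Julia sketch of how a mask of that shape could be applied to a matrix of action logits. It does not call ReinforcementLearning.jl and is not a description of how `PPOPolicy` handles the mask internally. The sizes, the `logits` values, and the `softmax_columns` helper are all made up for illustration.

```julia
# Hypothetical sizes, chosen only for illustration.
na, N_ENV = 4, 3   # number of actions, number of parallel environments

# One row per action, one column per environment; `true` marks a legal action.
legal_actions_mask = Bool[
    1 1 0
    0 1 1
    1 0 1
    0 1 0
]
@assert size(legal_actions_mask) == (na, N_ENV)

# Hypothetical actor output: one logit per action per environment.
logits = Float32[
    0.5 1.0 2.0
    1.5 0.0 0.1
    0.2 0.3 0.4
    3.0 0.7 0.6
]

# Push illegal logits to -Inf so they get zero probability after the softmax.
masked_logits = ifelse.(legal_actions_mask, logits, -Inf32)

# Column-wise softmax, written by hand to keep the sketch dependency-free.
# Assumes every environment has at least one legal action.
function softmax_columns(x)
    shifted = x .- maximum(x; dims = 1)   # subtract the column max for stability
    e = exp.(shifted)
    return e ./ sum(e; dims = 1)
end

probs = softmax_columns(masked_logits)
```

Each column of `probs` is a distribution over the legal actions of one environment, which is why the mask needs one column per environment, just like `state`.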
