Hi @mfg,
Could you share the code that initializes the trajectory?
It should look something like this:
julia> trajectory = MaskedPPOTrajectory(;
           capacity = UPDATE_FREQ,
           state = Matrix{Float32} => (ns, N_ENV),
           action = Vector{Int} => (N_ENV,),
           legal_actions_mask = Vector{Bool} => (na, N_ENV),
           action_log_prob = Vector{Float32} => (N_ENV,),
           reward = Vector{Float32} => (N_ENV,),
           terminal = Vector{Bool} => (N_ENV,),
       )