Hi,
I am implementing the DDPG algorithm for a 1-D vector action space, but at run time it fails with this error:
BoundsError: attempt to access 2-element Vector{Float64} at index []
When I looked at the DDPG policy on GitHub, I found this:
# TODO: handle Training/Testing mode
function (p::DDPGPolicy)(env)
    p.step += 1
    if p.step <= p.start_steps
        p.start_policy(env)
    else
        D = device(p.behavior_actor)
        s = state(env)
        s = Flux.unsqueeze(s, ndims(s) + 1)
        action = p.behavior_actor(send_to_device(D, s)) |> vec |> send_to_host
        clamp(action[] + randn(p.rng) * p.act_noise, -p.act_limit, p.act_limit)
    end
end
It uses action[], so is it expecting the action to be a scalar? In my case the action_space is a vector of intervals of the form [1..3, 2..3], so the actor outputs a 2-element vector.
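To make the failure concrete, here is a minimal sketch in plain Julia (no ReinforcementLearning.jl needed) of why action[] breaks on a 2-element vector, and of what an element-wise version of the noise-and-clamp step could look like. The helpers noisy_clamped_action and rescale_action are hypothetical names introduced for illustration; they are not part of the library, and the rescaling step is an assumption about how a symmetric limit could be mapped onto intervals like [1..3, 2..3].

```julia
using Random

# Indexing with no indices (`x[]`) only works when the array has exactly one element.
single = [0.5]
single[]            # fine, returns the only element

pair = [0.5, -0.2]
# pair[] would throw:
# BoundsError: attempt to access 2-element Vector{Float64} at index []

# Hypothetical helper (not library API): add independent Gaussian noise to each
# component, then clamp every component to the symmetric range [-act_limit, act_limit].
function noisy_clamped_action(rng::AbstractRNG, action::AbstractVector, act_noise, act_limit)
    noise = randn(rng, length(action)) .* act_noise
    return clamp.(action .+ noise, -act_limit, act_limit)
end

# Hypothetical helper (assumption): map an action in [-1, 1] per component
# onto per-dimension bounds [lows[i], highs[i]].
function rescale_action(action::AbstractVector, lows, highs)
    return lows .+ (action .+ 1) ./ 2 .* (highs .- lows)
end

rng = MersenneTwister(1)
a = noisy_clamped_action(rng, pair, 0.1, 1.0)
scaled = rescale_action(a, [1.0, 2.0], [3.0, 3.0])
```

Here a stays a 2-element vector inside [-1, 1], and scaled lands inside the per-dimension bounds. Whether the rescaling belongs in the policy or in the environment wrapper is a design choice this sketch does not settle.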
I am setting the start policy in the DDPG agent as:
start_policy = RandomPolicy(action_space(env);rng)
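For comparison, this is a rough sketch of what I would expect a start policy to return for a 2-dimensional continuous action space: one uniform sample per dimension. It uses only Base Julia; sample_box is a hypothetical helper, and the bounds are written as plain (low, high) tuples standing in for [1..3, 2..3]. It does not show how RandomPolicy itself samples.

```julia
using Random

# Hypothetical helper (not library API): draw one uniform sample per dimension
# from a collection of (low, high) bounds.
function sample_box(rng::AbstractRNG, bounds)
    return [lo + rand(rng) * (hi - lo) for (lo, hi) in bounds]
end

rng = MersenneTwister(0)
bounds = [(1.0, 3.0), (2.0, 3.0)]   # stands in for [1..3, 2..3]
a0 = sample_box(rng, bounds)        # 2-element Vector{Float64}
```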
Any help or suggestion on how this is usually done is appreciated!