Hi, I have a question about implementing a multidimensional (vector-valued) continuous action space.
I am trying to implement the DDPG algorithm for a problem whose action is a 1-D vector.
A = [ClosedInterval{Int64}(0,BUDGET)] #range of value of actions in 0..Budget
A_space = repeat(A,length(CHANNELS)) #range of values for diff channels
RLBase.action_space(env::Environment, p::DefaultPlayer) = Space(A_space)
I am confused about how to implement the above correctly, so that I get an action vector which I can later clamp to legal values during each step.
Any help or suggestion on how this is usually done is appreciated!
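For reference, here is a minimal sketch that makes the snippet above executable on its own. It assumes ClosedInterval comes from the IntervalSets package, and the values of BUDGET and CHANNELS are made up for illustration. It uses Float64 bounds rather than Int64, on the assumption that the actions are meant to be continuous. Whether Space accepts such a vector of intervals depends on the ReinforcementLearning.jl version and is not checked here.

```julia
using IntervalSets  # assumed source of ClosedInterval, leftendpoint, rightendpoint

# Hypothetical values; the post does not say what BUDGET and CHANNELS are.
const BUDGET = 100.0
const CHANNELS = ["tv", "radio", "web"]

# One closed interval per channel, each covering 0..BUDGET.
A_space = [ClosedInterval(0.0, BUDGET) for _ in CHANNELS]

# Per-component lower and upper bounds, handy for clamping later.
low = leftendpoint.(A_space)
high = rightendpoint.(A_space)

# An action is legal if it has one entry per channel and each entry
# lies inside the interval of its channel.
is_legal(a) = length(a) == length(A_space) && all(map(in, a, A_space))
```

With these definitions, is_legal([10.0, 0.0, 100.0]) holds, while a vector with an entry above BUDGET or with the wrong length is rejected.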
I am not sure I understand what you are asking. I would suggest giving an MWE (i.e., an excerpt of code that is executable; your code does not define ClosedInterval) and making clearer what you mean by “clamp”.
Thanks for the warm welcome!
So, the problem is as follows:
When we have a discrete action space, we use
action_space = Base.OneTo(n_actions)
where n_actions is the number of discrete actions (on, off, etc.).
For a continuous action space, so far I have only found code for a single continuous variable, i.e.,
action_space = -2.0..2.0
where the variable can take any value in the continuous range from -2 to 2.
But now, instead of a single continuous variable, I need a vector of 3 elements in continuous space, each with range -2 to 2.
So my question is: how should I implement a vector-valued continuous action space?
PS: by “clamp” I just meant constraining each element of the action vector to its allowed range.
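To make the clamping part concrete, here is a minimal sketch in plain Julia (no packages). The names N_ACTIONS, LOW, HIGH and clamp_action are hypothetical and only for illustration; the bounds are the -2 to 2 range described above, applied to each of the 3 elements.

```julia
# Hypothetical per-component bounds for a 3-element action vector.
const N_ACTIONS = 3
const LOW = fill(-2.0, N_ACTIONS)
const HIGH = fill(2.0, N_ACTIONS)

# Clamp each component of a raw action into [LOW[i], HIGH[i]]
# by broadcasting Base.clamp over the three vectors.
clamp_action(a::AbstractVector{<:Real}) = clamp.(a, LOW, HIGH)

raw = [-3.5, 0.7, 2.4]      # e.g. an actor output with exploration noise added
legal = clamp_action(raw)   # [-2.0, 0.7, 2.0]
```

Since clamp is broadcast element-wise, the same function also works when the bounds differ per component; only LOW and HIGH need to change.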