Thanks for the suggestion. I will penalise the control effort. The main problem is that learning takes too long to reduce the loss (that is, for the pendulum to balance).