I am a beginner in both Julia and RL. I'm trying to train a tabular Q-based policy with an off-policy strategy using ReinforcementLearning.jl. The idea is to let `behavior_policy` be a `RandomPolicy` (for example) that collects the experience, while `target_policy`, a `QBasedPolicy`, is the one being trained. I wrote the following code:
```julia
using ReinforcementLearning
using Flux   # for InvDecay

# wrapped_env is my (wrapped) environment, defined earlier

# greedy target policy whose Q-table should be learned
target_policy = QBasedPolicy(
    learner = MonteCarloLearner(;
        approximator = TabularQApproximator(;
            n_state = length(state_space(wrapped_env)),
            n_action = length(action_space(wrapped_env)),
            opt = InvDecay(1)
        )
    ),
    explorer = EpsilonGreedyExplorer(0)
)

# random behavior policy that generates the episodes
behavior_policy = RandomPolicy(action_space(wrapped_env))
p = OffPolicy(target_policy, behavior_policy)

agent = Agent(
    policy = p,
    trajectory = VectorSARTTrajectory()
)

hook = TotalRewardPerEpisode()
run(
    agent,
    wrapped_env,
    StopAfterEpisode(100),
    hook
)
```
But it doesn't work: after the run, `p.π_target.learner.approximator.table` is still all zeros.
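To be concrete, this is how I check it after the run finishes (the shape comment is only my assumption from the constructor arguments above):

```julia
# Inspect the target policy's Q-table after the 100 episodes.
# I assume it is an n_action × n_state matrix, per the TabularQApproximator arguments.
Q = p.π_target.learner.approximator.table
all(iszero, Q)   # true, i.e. nothing was learned
```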
Can anyone tell me where the problem lies? Do I misunderstand off-policy reinforcement learning, or am I not using `OffPolicy` correctly?
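For context, my understanding of off-policy Monte Carlo (following Sutton & Barto, Chapter 5, so not necessarily what `MonteCarloLearner` implements internally) is that the target policy's action values can still be estimated from episodes generated by the behavior policy by weighting the returns with importance-sampling ratios:

$$q_\pi(s, a) = \mathbb{E}_b\!\left[\rho_{t+1:T-1}\, G_t \mid S_t = s,\, A_t = a\right], \qquad \rho_{t+1:T-1} = \prod_{k=t+1}^{T-1} \frac{\pi(A_k \mid S_k)}{b(A_k \mid S_k)},$$

so even with a purely random `behavior_policy` I expected the Q-table of `target_policy` to end up with some non-zero entries.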
Thanks!