How to perform off-policy reinforcement learning using `ReinforcementLearning.jl`?

I am a beginner in both Julia and RL. I am trying to train a tabular Q-based policy with an off-policy strategy using `ReinforcementLearning.jl`: a `behavior_policy` (a `RandomPolicy`, for example) collects the experience, while the `target_policy`, a `QBasedPolicy`, is the one being trained. I wrote the following code:
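(For context, `wrapped_env` below is my tabular environment. For a self-contained reproduction you can assume something simple like `RandomWalk1D` from the library's examples; my real environment is just a wrapped tabular one.)

```julia
using ReinforcementLearning
using Flux: InvDecay

# stand-in for my actual wrapped environment; any tabular env should do
wrapped_env = RandomWalk1D()
```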

```julia
# target policy: a greedy tabular Q-learner, the one I want to train
target_policy = QBasedPolicy(
    learner = MonteCarloLearner(;
        approximator = TabularQApproximator(;
            n_state = length(state_space(wrapped_env)),
            n_action = length(action_space(wrapped_env)),
            opt = InvDecay(1.0),
        ),
    ),
    explorer = EpsilonGreedyExplorer(0.0),  # fully greedy: always exploit
)

# behavior policy: generates the experience by acting uniformly at random
behavior_policy = RandomPolicy(action_space(wrapped_env))

# learn target_policy from transitions collected by behavior_policy
p = OffPolicy(target_policy, behavior_policy)

agent = Agent(
    policy = p,
    trajectory = VectorSARTTrajectory(),
)

hook = TotalRewardPerEpisode()
run(agent, wrapped_env, StopAfterEpisode(100), hook)
```

But it doesn't work: after training, `p.π_target.learner.approximator.table` is still all zeros.
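Concretely, this is how I check after `run` returns:

```julia
# the table should hold nonzero Q estimates after 100 episodes,
# but in my run this still prints `false`
@show any(!iszero, p.π_target.learner.approximator.table)
```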

Can anyone tell me where the problem lies? Am I misunderstanding off-policy reinforcement learning, or am I using the `OffPolicy` wrapper incorrectly?
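For comparison, here is what I believe the plain on-policy counterpart looks like (the `QBasedPolicy` both acts and learns; the explorer value is just an illustrative choice). My expectation was that `OffPolicy` changes only which policy picks the actions, not whether the Q-table gets updated:

```julia
# on-policy counterpart of the same agent, for contrast:
# the ε-greedy QBasedPolicy both generates the actions and is updated
on_policy_agent = Agent(
    policy = QBasedPolicy(
        learner = MonteCarloLearner(;
            approximator = TabularQApproximator(;
                n_state = length(state_space(wrapped_env)),
                n_action = length(action_space(wrapped_env)),
                opt = InvDecay(1.0),
            ),
        ),
        explorer = EpsilonGreedyExplorer(0.1),  # needs some exploration when acting on-policy
    ),
    trajectory = VectorSARTTrajectory(),
)
run(on_policy_agent, wrapped_env, StopAfterEpisode(100), TotalRewardPerEpisode())
```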

Thanks!