Reinforcement Learning Custom Gym error

I’ve created a custom Gym environment using ReinforcementLearning.jl. I believe it adheres to the guidance in the documentation (“How to write a customised environment”), but the trivial case of running my gym with a random policy throws an error.

The project is here: GitHub - chunky/julia_gym_dll: Julia glue for a Gym written in C/C++

A notebook [committed, for my sins, with the error included], is here: julia_gym_dll/Test_DLLGym.ipynb at master · chunky/julia_gym_dll · GitHub
The defaults there should give an inverted-pendulum Gym, aligned with the pendulum example in ReinforcementLearning.jl.
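For context, the “How to write a customised environment” docs expect roughly the following RLBase interface. This is only a sketch with made-up names — `MyPendulumEnv`, its fields, and the toy dynamics are placeholders, not the actual DLLGymEnv code, which calls into the C library instead:

```julia
using ReinforcementLearning  # re-exports RLBase, AbstractEnv, Space, intervals

# Placeholder environment; DLLGymEnv's real state lives behind a C handle.
mutable struct MyPendulumEnv <: AbstractEnv
    state::Vector{Float64}
    done::Bool
end

RLBase.action_space(env::MyPendulumEnv) = Base.OneTo(3)        # e.g. left / none / right
RLBase.state(env::MyPendulumEnv) = env.state
RLBase.state_space(env::MyPendulumEnv) = Space(fill(-1.0 .. 1.0, 2))
RLBase.reward(env::MyPendulumEnv) = env.done ? 0.0 : 1.0
RLBase.is_terminated(env::MyPendulumEnv) = env.done            # must become true when an episode ends
RLBase.reset!(env::MyPendulumEnv) = (env.state .= 0.0; env.done = false; nothing)

# Acting on the environment; a real implementation would step the C gym here.
function (env::MyPendulumEnv)(action)
    env.state[1] += action - 2           # toy dynamics: action 1/2/3 -> -1/0/+1
    env.done = abs(env.state[1]) > 1.0   # toy termination condition
    nothing
end
```

The `is_terminated` method is the one that matters for the error below: `run` relies on it to detect episode boundaries.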

I’ll paste the error here:

env = DLLGymEnv();
run(RandomPolicy(), env, StopAfterStep(1000), TotalRewardPerEpisode())

MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer

  [1] reduce_empty(op::Base.MappingRF{Base.ExtremaMap{typeof(identity)}, typeof(Base._extrema_rf)}, #unused#::Type{Int64})
    @ Base ./reduce.jl:348
  [2] reduce_empty_iter
    @ ./reduce.jl:371 [inlined]
  [3] mapreduce_empty_iter(f::Function, op::Function, itr::Vector{Int64}, ItrEltype::Base.HasEltype)
    @ Base ./reduce.jl:367
  [4] _mapreduce
    @ ./reduce.jl:419 [inlined]
  [5] _mapreduce_dim
    @ ./reducedim.jl:365 [inlined]
  [6] #mapreduce#764
    @ ./reducedim.jl:357 [inlined]
  [7] mapreduce
    @ ./reducedim.jl:357 [inlined]
  [8] #_extrema#790
    @ ./reducedim.jl:999 [inlined]
  [9] _extrema
    @ ./reducedim.jl:999 [inlined]
 [10] #_extrema#789
    @ ./reducedim.jl:998 [inlined]
 [11] _extrema
    @ ./reducedim.jl:998 [inlined]
 [12] #extrema#787
    @ ./reducedim.jl:994 [inlined]
 [13] extrema
    @ ./reducedim.jl:994 [inlined]
 [14] extend_limits(vec::Vector{Int64}, limits::Tuple{Int64, Int64}, scale::typeof(identity))
    @ UnicodePlots ~/.julia/packages/UnicodePlots/Z7FG6/src/common.jl:371
 [15] UnicodePlots.Plot(x::Base.OneTo{Int64}, y::Vector{Float64}, z::Nothing, ::Type{UnicodePlots.BrailleCanvas}; title::String, xlabel::String, ylabel::String, zlabel::String, xscale::Symbol, yscale::Symbol, width::Nothing, height::Nothing, border::Symbol, compact::Bool, blend::Bool, xlim::Tuple{Int64, Int64}, ylim::Tuple{Int64, Int64}, margin::Int64, padding::Int64, labels::Bool, unicode_exponent::Bool, colorbar::Bool, colorbar_border::Symbol, colorbar_lim::Tuple{Int64, Int64}, colormap::Nothing, grid::Bool, xticks::Bool, yticks::Bool, min_width::Int64, min_height::Int64, projection::Nothing, axes3d::Bool, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ UnicodePlots ~/.julia/packages/UnicodePlots/Z7FG6/src/plot.jl:240
 [16] lineplot(x::Base.OneTo{Int64}, y::Vector{Float64}, z::Nothing; canvas::Type, color::Symbol, name::String, head_tail::Nothing, kw::Base.Pairs{Symbol, String, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:title, :xlabel, :ylabel), Tuple{String, String, String}}})
    @ UnicodePlots ~/.julia/packages/UnicodePlots/Z7FG6/src/interface/lineplot.jl:75
 [17] #lineplot#133
    @ ~/.julia/packages/UnicodePlots/Z7FG6/src/interface/lineplot.jl:79 [inlined]
 [18] (::TotalRewardPerEpisode)(#unused#::PostExperimentStage, agent::RandomPolicy{Nothing, Random._GLOBAL_RNG}, env::DLLGymEnv)
    @ ReinforcementLearningCore ~/.julia/packages/ReinforcementLearningCore/yeRLW/src/core/hooks.jl:160
 [19] _run(policy::RandomPolicy{Nothing, Random._GLOBAL_RNG}, env::DLLGymEnv, stop_condition::StopAfterStep{ProgressMeter.Progress}, hook::TotalRewardPerEpisode)
    @ ReinforcementLearningCore ~/.julia/packages/ReinforcementLearningCore/yeRLW/src/core/run.jl:48
 [20] run(policy::RandomPolicy{Nothing, Random._GLOBAL_RNG}, env::DLLGymEnv, stop_condition::StopAfterStep{ProgressMeter.Progress}, hook::TotalRewardPerEpisode)
    @ ReinforcementLearningCore ~/.julia/packages/ReinforcementLearningCore/yeRLW/src/core/run.jl:10
 [21] top-level scope
    @ In[10]:2
 [22] eval
    @ ./boot.jl:368 [inlined]
 [23] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1428

Any guidance would be appreciated

Just guessing based on the stack trace: maybe the simulation stopped before a single episode ended, so an empty vector is passed to UnicodePlots, which produces this error message (you can reproduce it with e.g. using UnicodePlots; Plot(Float64[], Float64[])). Did you try omitting the plot? Maybe just adding a semicolon after the run call would be enough to suppress the plot and the error message.
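One way to test that theory is to disable the hook’s automatic plot. This is a hedged sketch: it assumes the `is_display_on_exit` keyword of `TotalRewardPerEpisode` exists in the installed ReinforcementLearningCore version (check your version’s hooks.jl), and reuses the `env` from above:

```julia
# Workaround sketch: collect per-episode rewards without triggering UnicodePlots.
# Assumes TotalRewardPerEpisode accepts is_display_on_exit in this RLCore version.
hook = TotalRewardPerEpisode(is_display_on_exit = false)
run(RandomPolicy(), env, StopAfterStep(1000), hook)
hook.rewards   # inspect the collected totals directly; empty if no episode ever ended
```

If `hook.rewards` comes back empty after 1000 steps, that would confirm no episode terminated, which points at the environment’s done flag rather than at the plotting code.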

You’re exactly right! The C gym’s return value indicating whether it’s done is inverted compared to ReinforcementLearning.jl’s view.
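To illustrate the class of bug (the names and the C convention here are assumptions for illustration, not the project’s actual code): suppose the C side returns 1 while the episode is still running and 0 once it is over, while RLBase expects `is_terminated` to be true once the episode ends. Passing the flag through unchanged means no episode ever appears to terminate, so the hook’s rewards vector stays empty:

```julia
# Hypothetical C-side flag: 0 means "episode over", nonzero means "still running".
c_flag_episode_over = Cint(0)

is_terminated_buggy(flag) = flag != 0  # passes the C flag through unchanged
is_terminated_fixed(flag) = flag == 0  # negates it to match RLBase semantics

is_terminated_buggy(c_flag_episode_over)  # false: RL.jl never sees the episode end
is_terminated_fixed(c_flag_episode_over)  # true: the episode terminates as expected
```

With the negation in place, episodes actually finish, `TotalRewardPerEpisode` has data to plot, and the empty-collection `MethodError` from UnicodePlots goes away.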