ReinforcementLearning Gym spaces

chunky · September 11, 2022, 5:04am

I have a C++ Gym exposed as a C .dll/.so whose API was originally designed to be used from a Python OpenAI Gym via ctypes. It relies only on “Box”, the continuous, multidimensional space, for both action and observation. In Python, one creates a “Box” by passing two parallel arrays, one of “low” range values and one of “high”, and the C API aligns with this idea.

For a toy test case, I’ve ginned up an example. The C code is thus:

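#include <stdio.h>

/* Declared up front because get_action_len() calls it before its definition. */
int get_action_space(double obs_low[], double obs_high[], int len);

/* Number of dimensions in the action space. */
int get_action_len() {
        printf("In get_action_len\n");
        int len = get_action_space(NULL, NULL, 0);
        return len;
}

/* Fill the bounds when both buffers are given; always return the length. */
int get_action_space(double obs_low[], double obs_high[], int len) {
        printf("In get_action_space. length=%d\n", len);
        if(NULL != obs_low && NULL != obs_high && len >= 1) {
                obs_low[0] = -1.0;
                obs_high[0] = 1.0;
        }
        return 1;
}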
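
The toy function above implies a two-call convention: call once with NULL buffers to learn the length, then call again with buffers of that size. A minimal sketch of a caller, seen from the C side, is below. `fetch_action_bounds` is a hypothetical helper and not part of the original API, and the copy of `get_action_space` is a stand-in with the printf dropped and the parameters renamed to `low`/`high`.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the library function: a 1-D action space in [-1, 1]. */
int get_action_space(double low[], double high[], int len) {
    if (NULL != low && NULL != high && len >= 1) {
        low[0] = -1.0;
        high[0] = 1.0;
    }
    return 1;
}

/* Hypothetical helper (not in the original API): query the length,
 * allocate both bound arrays, then fill them with a second call.
 * Returns the length, or -1 on failure. On success the caller owns
 * *low_out and *high_out and must free() them. */
int fetch_action_bounds(double **low_out, double **high_out) {
    assert(low_out != NULL && high_out != NULL);

    /* First call: NULL buffers, so only the length comes back. */
    int len = get_action_space(NULL, NULL, 0);
    if (len <= 0) {
        return -1;
    }

    double *low = malloc((size_t)len * sizeof *low);
    double *high = malloc((size_t)len * sizeof *high);
    if (low == NULL || high == NULL) {
        free(low);
        free(high);
        return -1;
    }

    /* Second call: low array first, then high, matching the signature. */
    if (get_action_space(low, high, len) != len) {
        free(low);
        free(high);
        return -1;
    }

    *low_out = low;
    *high_out = high;
    return len;
}
```

Any binding (ctypes, Julia's ccall) has to follow the same two steps, and has to pass the low array before the high array.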

And as it stands, this is the Julia code I’ve created:

function RLBase.action_space(env::DLLGymEnv)
    action_len = ccall(("get_action_len", "../empty_gym"), Cint, ())
    arr_low = Array{Cdouble}(undef, action_len)
    arr_high = Array{Cdouble}(undef, action_len)
    # The C signature takes the low array first, then the high array
    new_action_len = ccall(("get_action_space", "../empty_gym"), Cint, (Ptr{Cdouble}, Ptr{Cdouble}, Cint),
        arr_low, arr_high, action_len)
    # I don't know what to do here to create a thing to return
end

What should I be doing to close the loop for the last line? [Obligatory “this is my first time using Julia; I’ve been using C for several decades, Python for about 3 years, and Julia for about an hour and a half”]

Thank you !
Gary

chunky · September 12, 2022, 2:40am

I figured it out by reading the pendulum code. It uses IntervalSets, so my new RLBase.action_space function is:

using IntervalSets

function RLBase.action_space(env::DLLGymEnv)
    action_len = ccall(("get_action_len", "../empty_gym"), Cint, ())
    arr_high = Array{Cdouble}(undef, action_len)
    arr_low = Array{Cdouble}(undef, action_len)
    new_action_len = ccall(("get_action_space", "../empty_gym"), Cint, (Ptr{Cdouble}, Ptr{Cdouble}, Cint),
        arr_low, arr_high, action_len)
    Space([(arr_low[i] .. arr_high[i]) for i in 1:action_len])
end
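
Each `arr_low[i] .. arr_high[i]` above is a closed interval, so the resulting space holds one [low, high] range per action dimension, just like a Box. As a rough sketch of what that means on the C side, the check below tests whether an action lies inside those bounds. `action_in_bounds` is a hypothetical helper for illustration and not part of the library's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when every action[i] lies in the closed
 * interval [low[i], high[i]]. The comparison is written so that a NaN
 * component is rejected rather than silently accepted. */
bool action_in_bounds(const double action[], const double low[],
                      const double high[], size_t len) {
    assert(action != NULL && low != NULL && high != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!(action[i] >= low[i] && action[i] <= high[i])) {
            return false;
        }
    }
    return true;
}
```

With the toy bounds of -1.0 and 1.0, an action of 0.5 or exactly 1.0 passes and 1.5 does not.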

Cheers
Gary

chunky · September 13, 2022, 6:46am

My code, as it stands, is working for my needs; I’ve pushed it here for anyone in the future with similar questions: GitHub - chunky/julia_gym_dll: Julia glue for a Gym written in C/C++

Cheers,
Gary