Question to the experts on distributed computing:
How should we design distributed reinforcement learning methods in Julia?
Ray has a nice approach here, with different optimizers for methods that move gradients or batches of experience around, either synchronously or asynchronously.
As a super simple toy example of a method where agents collect experience on their own worker processes, send it to the learner, and receive the updated policy, I wrote the following:
```julia
using Distributed

# Make sure there are worker processes (alternatively start Julia with `julia -p N`).
nprocs() == 1 && addprocs(4)

@everywhere function actor(policy_ch, result_ch)
    while true
        isdone, policy = take!(policy_ch)            # block until the learner sends a policy
        action = randn() + policy                    # sample an action around the policy mean
        reward = 2 < action < 3                      # reward 1 iff the action lands in (2, 3)
        put!(result_ch, (myid(), action, reward))    # send the experience back to the learner
        isdone && return
    end
end

function learner(; policyinit = 0.0, η = 1e-2, T = 10^4)
    n = nworkers()
    policy_chs = [RemoteChannel(() -> Channel{Tuple{Bool,Float64}}(1)) for _ in 1:n]
    result_chs = [RemoteChannel(() -> Channel{Tuple{Int,Float64,Bool}}(1)) for _ in 1:n]
    for (i, w) in enumerate(workers())
        remote_do(actor, w, policy_chs[i], result_chs[i])
    end
    policy = policyinit
    for step in 1:T
        for i in 1:n
            put!(policy_chs[i], (step == T, policy))   # broadcast the current policy
            id, action, reward = take!(result_chs[i])  # wait for one experience per actor
            policy += η * reward * (action - policy)   # toy update: move towards rewarded actions
        end
    end
    policy
end

learner()
```
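
For the asynchronous flavour, what I picture is roughly the untested sketch below: all actors push into one shared experience channel, read the latest policy from a one-element channel that serves as a blackboard, and the learner updates as soon as anything arrives (the names `async_actor`, `policy_box` and `experience` are just placeholders):

```julia
using Distributed
nprocs() == 1 && addprocs(4)

# Actors never wait for the learner: they peek at the latest policy and keep
# pushing experience into one shared channel. (They also never terminate in
# this sketch; a real version would need a stop signal like the isdone flag above.)
@everywhere function async_actor(policy_box, experience)
    while true
        policy = fetch(policy_box)                  # peek at the latest policy without removing it
        action = randn() + policy
        reward = 2 < action < 3
        put!(experience, (myid(), action, reward))  # push and immediately continue
    end
end

function async_learner(; policyinit = 0.0, η = 1e-2, T = 10^4)
    experience = RemoteChannel(() -> Channel{Tuple{Int,Float64,Bool}}(1024))
    policy_box = RemoteChannel(() -> Channel{Float64}(1))
    put!(policy_box, policyinit)
    foreach(w -> remote_do(async_actor, w, policy_box, experience), workers())
    policy = policyinit
    for _ in 1:T
        id, action, reward = take!(experience)       # consume whichever actor reports first
        policy += η * reward * (action - policy)
        take!(policy_box); put!(policy_box, policy)  # publish the new policy (not atomic, good enough here)
    end
    policy
end

async_learner()
```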
- Does this look like a reasonable approach?
- This works for CPUs. Can I follow a similar pattern with GPUs? (A rough sketch of what I have in mind is below.)
- I have no clue about the performance of `RemoteChannel`. Do you think it will be possible to get competitive performance with this approach, compared to e.g. Ray? (A naive round-trip timing sketch is below.)
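
Regarding the GPU question: what I have in mind (untested, and assuming CUDA.jl is installed and a GPU is visible on the worker) is that each actor would do a batched rollout on its device and ship plain `Array`s back to the learner, because I would not try to send `CuArray`s across processes. A minimal sketch with a single remote call instead of the channel loop:

```julia
using Distributed
nprocs() == 1 && addprocs(1)
@everywhere using CUDA          # assumes CUDA.jl works on the worker

# Batched rollout on the worker's GPU; results are copied back to host memory
# before they are serialized to the learner.
@everywhere function gpu_rollout(policy, batchsize)
    actions = policy .+ CUDA.randn(batchsize)    # batch of actions on the device
    rewards = (2 .< actions) .& (actions .< 3)   # elementwise reward
    return Array(actions), Array(rewards)        # back to plain Arrays on the host
end

actions, rewards = remotecall_fetch(gpu_rollout, workers()[1], 0.5f0, 1024)
```

In the real actor loop the put!/take! pattern would stay the same, just with the RemoteChannels carrying `Vector{Float32}`/`Vector{Bool}` batches instead of scalars.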
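
And regarding `RemoteChannel` performance, the naive round-trip timing I would start with looks like this (a ping-pong between the main process and one worker; nothing rigorous):

```julia
using Distributed, Statistics
nprocs() == 1 && addprocs(1)

# The worker simply echoes back whatever arrives (it never terminates in this sketch).
@everywhere function echo(ping, pong)
    while true
        put!(pong, take!(ping))
    end
end

ping = RemoteChannel(() -> Channel{Float64}(1))
pong = RemoteChannel(() -> Channel{Float64}(1))
remote_do(echo, workers()[1], ping, pong)

roundtrip(x) = (put!(ping, x); take!(pong))
roundtrip(0.0)                                   # warm-up / compilation
times = [@elapsed roundtrip(rand()) for _ in 1:1000]
println("median round trip: ", round(median(times) * 1e6, digits = 1), " μs")
```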