Sequencing for distributed for-loop

parallel

#1

Is it possible to make @distributed do the work and respect the original order?

using Distributed
addprocs(2)
process_dates = range(Date(2018, 12, 20), stop = Date(2018, 12, 1), step = -Dates.Day(1))
    @sync @distributed for i in process_dates
        println(i)
        sleep(1)
    end
end

It goes like this:

|      From worker 3:|2018-12-10|
|      From worker 2:|2018-12-20|
|      From worker 3:|2018-12-09|
|      From worker 2:|2018-12-19|
|      From worker 3:|2018-12-08|
|      From worker 2:|2018-12-18|
|      From worker 3:|2018-12-07|
|      From worker 2:|2018-12-17|
|      From worker 3:|2018-12-06|
|      From worker 2:|2018-12-16|
|      From worker 3:|2018-12-05|
|      From worker 2:|2018-12-15|
|      From worker 3:|2018-12-04|
|      From worker 2:|2018-12-14|
|      From worker 3:|2018-12-03|
|      From worker 2:|2018-12-13|
|      From worker 3:|2018-12-02|
|      From worker 2:|2018-12-12|
|      From worker 3:|2018-12-01|
|      From worker 2:|2018-12-11|

However, I would rather see it goes like this:

|      From worker 3:|2018-12-19|
|      From worker 2:|2018-12-20|
|      From worker 3:|2018-12-17|
|      From worker 2:|2018-12-18|
|      From worker 3:|2018-12-15|
|      From worker 2:|2018-12-16|
...

#2

I don’t think you can control this directly:

julia> @macroexpand @sync @distributed for i ∈ 1:10; print(i); end

begin
    #= task.jl:246 =#
    let ##sync#72 = (Base.Any)[]
        #= task.jl:247 =#
        #154#v = begin
                #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:337 =#
                local #155#ref = (Distributed.pfor)(begin
                                #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:289 =#
                                function (#156#R, #157#lo::Distributed.Int, #158#hi::Distributed.Int)
                                    #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:290 =#
                                    for i = #156#R[#157#lo:#158#hi]
                                        #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:291 =#
                                        begin
                                            #= C:\Users\Nils-Holger\Desktop\KoalaTest.jl:28 =#
                                            print(i)
                                        end
                                    end
                                end
                            end, 1:10)
                #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:338 =#
                if $(Expr(:isdefined, Symbol("##sync#72")))
                    #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:339 =#
                    (Distributed.push!)(##sync#72, #155#ref)
                end
                #= C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\macros.jl:341 =#
                #155#ref
            end
        #= task.jl:248 =#
        (Base.sync_end)(##sync#72)
        #= task.jl:249 =#
        #154#v
    end
end

So this uses Distributed.pfor, in stdlib\v0.7\Distributed\src\macros.jl:

function pfor(f, R)
    @async @sync for c in splitrange(length(R), nworkers())
        @spawn f(R, first(c), last(c))
    end
end

so in my example splitrange(length(1:10), 2) would be called, returning a vector holding [1:5, 6:10], and worker 2 then gets allocated 1:5, worker 3 gets 6:10 and the output therefore alternates 1,6,2,7,...

One way of “preserving” the natural order would be to make the range going into the loop such that it alternates, i.e. vcat(collect(1:2:9), collect(2:2:10)) instead of simply 1:10 - although I think you still might end up with worker 3 starting first, and therefore an output of 2,1,4,3,... rather than 1,2,3,4,...