Trying to write a parallel for loop in Julia

I’m new to both Julia and parallel computing and am currently trying to convert some MATLAB code that uses a parfor loop. I’m struggling with how to do this in Julia. I have read that pmap is what I need, as I have a parfor loop that does a large calculation a few times (say 150). I can make this work by adding Threads.@threads before the for loop, and this feels very familiar, but is it optimal? I suspect not. Can anybody give me some clue how to do this? I’m really surprised that I couldn’t find a simple explanation anywhere.

A simplified example of what I’m doing is below. In reality, I have more parameters than just ‘parameter’ that I need to pass, and the operation on the array is more complex than just making a random array.

function Test()
    Z = zeros(Float64, 150, 10, 10)
    for p = 1:150
        parameter = rand(Float64, 10, 10)
        Z[p, :, :] = Test0(p, parameter)
    end
    return Z
end
function Test0(index, parameter)
    Z0 = parameter
end

From memory (I haven’t used it since 2014b), parfor in MATLAB uses a worker pool, which means separate MATLAB processes rather than threads. The closer Julia equivalent would therefore be a @distributed for loop.
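
Since the question mentions pmap: the Julia manual recommends pmap when each call does a large amount of work, and @distributed when individual iterations are tiny, so both are worth trying for 150 expensive iterations. Below is a minimal sketch of the pmap route for the example in the question. It assumes two local workers (an arbitrary count), and TestPmap is a hypothetical wrapper name.

```julia
using Distributed

# Assumption: start two local worker processes if the session has none yet.
nprocs() == 1 && addprocs(2)

# Functions called on workers must be defined on every process.
@everywhere function Test0(index, parameter)
    return parameter
end

# Hypothetical wrapper: run n independent calls with pmap and collect the slices.
function TestPmap(n)
    # Each call builds its own random input and returns one 10x10 result.
    slices = pmap(p -> Test0(p, rand(Float64, 10, 10)), 1:n)
    Z = zeros(Float64, n, 10, 10)
    for p in 1:n
        Z[p, :, :] = slices[p]
    end
    return Z
end

Z = TestPmap(150)
```

Anything called inside the pmap body has to exist on the workers, which is why Test0 is wrapped in @everywhere here.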

I recently answered a related question on Stack Overflow here, which might be of interest. Generally, the most efficient way of parallelising will depend quite heavily on your actual problem (complexity of the calculations, uniformity of the time taken for individual chunks of the calculation, required memory access patterns, and the system that you’re looking to run the code on), so it’s hard to comment based on very abstract examples.
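
On the "system" side of that list, it can help to check what the current Julia session actually has available before comparing approaches. A small sketch using only standard calls; the numbers printed depend entirely on how Julia was started:

```julia
using Distributed

# Threads are fixed at startup (e.g. `julia -t 4` or the JULIA_NUM_THREADS variable).
println("threads:   ", Threads.nthreads())

# Processes are the master plus any workers added with `julia -p N` or addprocs(N).
println("processes: ", nprocs())

# With no workers added, nworkers() is 1 and the master process does the work itself.
println("workers:   ", nworkers())
```

If threads or workers report 1, the corresponding loop is effectively running serially, which is worth ruling out before reading too much into a timing.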

Thanks very much, that was really helpful. I tried it on the test problem below, and the @distributed version was much faster. Hopefully it will be much faster on my real code as well.

function Test()
    Z = zeros(Float64, 150, 10, 10)

    Threads.@threads for p = 1:150
        parameter = rand(Float64, 10, 10)
        Z[p, :, :] = Test0(parameter)
    end

    return Z
end

function Test0(parameter)
    Z0 = zeros(Float64,10,10)
    for k = 1:25000
        Z0 += parameter
    end
    return Z0
end

which is almost what I had before, with an extra loop inside to make it take more time, and

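using SharedArrays, Distributed
@everywhere begin
    # Defined on every process so that the workers can call it.
    function Test20(parameter)
        Z0 = zeros(Float64, 10, 10)
        # Plain serial loop: inside a nested @distributed loop, `Z0 += parameter`
        # would update copies of Z0 on the workers, not the Z0 returned here.
        for k = 1:25000
            Z0 += parameter
        end
        return Z0
    end
end

function Test2()

    # A SharedArray can be written to by all local worker processes.
    Z = SharedArray{Float64,3}(150, 10, 10)

    @sync @distributed for p = 1:150
        parameter = rand(Float64, 10, 10)
        Z[p, :, :] = Test20(parameter)
    end

    return Z
end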

which gives

julia> @time Z = Test2();
2.061500 seconds (3.76 M allocations: 3.130 GiB, 67.12% gc time)

julia> @time Z = Test();
16.502631 seconds (3.75 M allocations: 3.130 GiB, 95.21% gc time)
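
Both timings above report about 3.13 GiB of allocations and most of the run time in garbage collection. `Z0 += parameter` means `Z0 = Z0 + parameter`, which allocates a fresh 10x10 array on each of the 150 x 25000 iterations, roughly the 3.75 M allocations shown. Below is a sketch that updates Z0 in place with broadcasting instead. Test0InPlace and TestInPlace are hypothetical names, and no timing is claimed for it.

```julia
# Same accumulation as Test0, but reusing one array instead of allocating per iteration.
function Test0InPlace(parameter)
    Z0 = zeros(Float64, 10, 10)
    for k = 1:25000
        Z0 .+= parameter   # in-place broadcast: writes into the existing Z0
    end
    return Z0
end

function TestInPlace()
    Z = zeros(Float64, 150, 10, 10)
    # Each iteration writes to its own slice of Z, so threads do not overlap.
    Threads.@threads for p = 1:150
        parameter = rand(Float64, 10, 10)
        Z[p, :, :] = Test0InPlace(parameter)
    end
    return Z
end

Z = TestInPlace()
```

It may be worth rerunning @time on both the threaded and distributed versions with this change, since the comparison could look quite different once the garbage collector is no longer doing most of the work.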
