I’m doing some cheap operations on a set of huge SharedArrays, i.e. computing the distance from x,y,z.

```
using Distributed
addprocs(40);
@everywhere using SharedArrays

function compute_distance(result::SharedArray, pos::SharedArray)
    @sync @distributed for i = 1:size(pos, 2)
        result[i] = sqrt(pos[1,i]^2 + pos[2,i]^2 + pos[3,i]^2)
    end
end
```

Is it always better to write this with a local variable, which I assume is copied to each worker process, to avoid allocations inside the loop? For example,

```
function compute_distance(result::SharedArray, pos::SharedArray)
    temp = 0.0
    @sync @distributed for i = 1:size(pos, 2)
        result[i] = pos[1,i]^2
        temp = pos[2,i]^2
        result[i] += temp
        temp = pos[3,i]^2
        result[i] += temp
        result[i] = sqrt(result[i])
    end
end
```

This seems sort of ugly!

Why are you using Distributed for such a cheap calculation? I doubt the benefit exceeds the overhead.

Threads are more appropriate if you really want to run the distance calculation in parallel. (It may be worth trying the 1.3 alpha branch, as there has been a lot of work on multithreading recently.)
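Note that the thread count is fixed when Julia starts (via the `JULIA_NUM_THREADS` environment variable), so `Threads.@threads` silently runs serially in a default single-threaded session. A minimal sketch of checking this, with a toy loop to confirm the iterations are actually split across threads:

```julia
# Launch with e.g. `JULIA_NUM_THREADS=8 julia`, then verify in the session:
using Base.Threads
@show nthreads()   # prints 1 unless the environment variable was set

# Each thread tallies its own iterations in a separate slot,
# so there are no write conflicts between threads.
acc = zeros(Int, nthreads())
@threads for i in 1:1000
    acc[threadid()] += 1
end
@assert sum(acc) == 1000   # all iterations ran, across however many threads
```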

EDIT:

Here’s an example, first using broadcasting to get rid of allocations, and then using 8 threads (no broadcasting needed there, since it uses an explicit loop):

```
julia> using BenchmarkTools

julia> function f1(p, r)
           r .= sqrt.(view(p,:,1).^2 .+ view(p,:,2).^2 .+ view(p,:,3).^2)
       end
f1 (generic function with 1 method)

julia> function f2(p, r)
           Threads.@threads for i in 1:length(r)
               r[i] = sqrt(p[i,1]^2 + p[i,2]^2 + p[i,3]^2)
           end
       end
f2 (generic function with 1 method)

julia> @btime f1(p, r) evals=1 setup=(p=rand(10^8, 3); r=Vector{Float64}(undef, 10^8));
  248.940 ms (3 allocations: 144 bytes)

julia> @btime f2(p, r) evals=1 setup=(p=rand(10^8, 3); r=Vector{Float64}(undef, 10^8));
  117.666 ms (60 allocations: 6.03 KiB)
```
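As a side note, the question stores points as a 3×N matrix while the benchmark above uses N×3. Julia arrays are column-major, so the 3×N layout keeps the three coordinates of each point contiguous in memory. A hedged sketch of the same threaded kernel adapted to that layout (function name is my own, not from the thread):

```julia
# Threaded distance kernel for a 3×N (column-major friendly) layout.
# Assumes Julia was started with JULIA_NUM_THREADS > 1 for actual parallelism;
# the result is identical either way.
function distances_3xN!(r::AbstractVector, p::AbstractMatrix)
    @assert size(p, 1) == 3 && length(r) == size(p, 2)
    Threads.@threads for i in 1:size(p, 2)
        # All three coordinates of point i sit in one contiguous column.
        r[i] = sqrt(p[1,i]^2 + p[2,i]^2 + p[3,i]^2)
    end
    return r
end

p = [3.0 0.0; 4.0 5.0; 0.0 12.0]   # two points: (3,4,0) and (0,5,12)
r = Vector{Float64}(undef, 2)
distances_3xN!(r, p)               # → [5.0, 13.0]
```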