Hello,
I am trying to produce an example that shows the subtle differences in writing code in a distributed setting. Here is a contrived example:
function bla1(rho,nc1)
    t1=@fetchfrom workers()[1] Sys.free_memory()
    pids= workers()[1:nc1]
    ss=zeros(nc1)
    @sync for i::Int in 1:nc1
        @async ss[i]=@fetchfrom pids[i] s=sum(view(rho,1:1,1:1)).*myid()
    end
    t2=@fetchfrom workers()[1] Sys.free_memory()
    @show (t1-t2)/2^30
    sum(ss)
end
function bla2(rho,nc1)
    t1=@fetchfrom workers()[1] Sys.free_memory()
    ss=zeros(nc1)
    pids= workers()[1:nc1]
    @sync for i::Int in 1:nc1
        rhoI=view(rho,1:1,1:1)
        @async ss[i]=@fetchfrom pids[i] sum(rhoI).*myid()
    end
    t2=@fetchfrom workers()[1] Sys.free_memory()
    @show (t1-t2)/2^30
    sum(ss)
end
The punchline is that bla1 sends the whole array rho, and bla2 only sends one entry. I can show that effect with:
@everywhere GC.gc_enable(false) #scary
rho=rand(10^7)
@time bla1(rho,3)
@time bla2(rho,3)
(t1 - t2) / 2 ^ 30 = 0.22303009033203125
0.131072 seconds (506 allocations: 32.297 KiB)
(t1 - t2) / 2 ^ 30 = 0.0
0.002824 seconds (526 allocations: 32.016 KiB)
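One possible cross-check that needs neither workers nor a disabled GC is to serialize the two kinds of closure locally and compare their sizes. This is only a sketch: it assumes the Serialization stdlib treats captured variables the way Distributed does when it ships a closure to a worker, and the names serialized_bytes, make_whole and make_view are hypothetical helpers, not part of any library.

```julia
using Serialization

# Hypothetical helper: number of bytes produced when serializing x.
function serialized_bytes(x)
    io = IOBuffer()
    serialize(io, x)
    return length(take!(io))
end

# Like bla1: the closure captures the whole array and takes the view inside.
make_whole(rho) = () -> sum(view(rho, 1:1))

# Like bla2: the view is built first, so the closure captures only the SubArray.
function make_view(rho)
    rhoI = view(rho, 1:1)
    return () -> sum(rhoI)
end

rho = rand(10^6)
bytes_whole = serialized_bytes(make_whole(rho))
bytes_view = serialized_bytes(make_view(rho))
@show bytes_whole bytes_view
```

If the assumption holds, bytes_whole should be at least the size of rho's data (8 bytes per Float64), while bytes_view should stay small, because a view of an Array is serialized with only the elements it covers.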
I feel that my solution is rather hacky, especially turning off garbage collection. What is the “Julian” way of doing it? I couldn’t figure it out with @time or @allocated.
…
Thanks!