 # Distributed loop and possible memory leak

Hello, I’m trying to distribute the computation inside a for loop across my worker processes. Here is what I tried:

```julia
using Distributed
@everywhere begin
    using LinearAlgebra
    using SharedArrays
    using BenchmarkTools
    using Infiltrator
end

# Distributed version: the rows of H are updated by the workers.
function testpar(r, K, H, Hold, WtX, WtW, LH)
    for k = 1:K
        beta = 1/k
        @sync @distributed for i = 1:r
            Hi = H[i,:]'
            Hextra = beta*(Hi - Hold[i,:]')
            Hold[i,:] = Hi
            H[i,:] = max.(Hi + Hextra + (WtX[i,:]' - WtW[i,:]'*H)/LH[i], 1e-16)
        end
    end
end

# Serial version: same update, one row after the other.
function testnopar(r, K, H, Hold, WtX, WtW, LH)
    for k = 1:K
        beta = 1/k
        for i = 1:r
            Hi = H[i,:]'
            Hextra = beta*(Hi - Hold[i,:]')
            Hold[i,:] = Hi
            H[i,:] = max.(Hi + Hextra + (WtX[i,:]' - WtW[i,:]'*H)/LH[i], 1e-16)
        end
    end
end

function main()
    m = 162
    n = 307*307
    r = 6
    K = 20
    W = rand(m, r)
    Hinit = rand(r, n)
    H = copy(Hinit)
    Hold = copy(H)
    X = rand(m, n)
    WtX = W'*X
    WtW = W'*W
    LH = diag(WtW)
    @btime testnopar($r, $K, $H, $Hold, $WtX, $WtW, $LH)

    # Shared arrays so that every worker can read and write H and Hold.
    H = SharedArray(copy(Hinit))
    Hold = SharedArray(copy(H))
    @btime testpar($r, $K, $H, $Hold, $WtX, $WtW, $LH)
end

main()
```

and here is the output:

```
  250.167 ms (2760 allocations: 949.35 MiB)
  259.098 ms (18263 allocations: 1.05 MiB)
```

In my original code, the line `H[i,:] = max.( … )` is the bottleneck because `n` is large, so I thought it would be worth distributing. In this simple test, however, the distributed version is no faster, and I don’t understand why the non-distributed version allocates so much memory (about 1 GB). I’m probably doing a lot of things wrong (I’m coming from MATLAB), but I can’t tell what.
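One way to probe the memory part of the question is to isolate a single row update and measure it with `@allocated`. The sketch below is only a minimal illustration under stated assumptions: the helper names `update_row_copy!` and `update_row_inplace!` and the preallocated buffer `buf` are hypothetical (they are not in the code above), and the problem sizes are much smaller than in the post. It relies on the fact that in Julia a slice such as `H[i,:]` makes a copy, and that un-dotted array arithmetic creates a new array at each step.

```julia
using LinearAlgebra

# Hypothetical helper: one row update written as in the post
# (row slices and un-fused array arithmetic).
function update_row_copy!(H, Hold, WtX, WtW, LH, beta, i)
    Hi = H[i, :]'                          # H[i, :] copies the row
    Hextra = beta * (Hi - Hold[i, :]')     # another copy plus two temporaries
    Hold[i, :] = Hi
    H[i, :] = max.(Hi + Hextra + (WtX[i, :]' - WtW[i, :]' * H) / LH[i], 1e-16)
    return H
end

# Hypothetical helper: the same update with a preallocated buffer
# and an explicit loop over the columns.
function update_row_inplace!(H, Hold, WtX, WtW, LH, beta, i, buf)
    mul!(buf, H', view(WtW, i, :))         # buf[j] = dot(WtW[i, :], H[:, j])
    @inbounds for j in eachindex(buf)
        hij = H[i, j]
        hnew = max(hij + beta * (hij - Hold[i, j]) + (WtX[i, j] - buf[j]) / LH[i], 1e-16)
        Hold[i, j] = hij
        H[i, j] = hnew
    end
    return H
end

# Small random problem (illustrative sizes, much smaller than in the post).
m, n, r = 20, 1_000, 6
W = rand(m, r); X = rand(m, n)
WtX = W' * X; WtW = W' * W; LH = diag(WtW)
H1 = rand(r, n); Hold1 = copy(H1)
H2 = copy(H1);   Hold2 = copy(H1)
buf = zeros(n)

# Warm up both versions so that compilation is not counted.
update_row_copy!(H1, Hold1, WtX, WtW, LH, 0.5, 1)
update_row_inplace!(H2, Hold2, WtX, WtW, LH, 0.5, 1, buf)

bytes_copy    = @allocated update_row_copy!(H1, Hold1, WtX, WtW, LH, 0.5, 2)
bytes_inplace = @allocated update_row_inplace!(H2, Hold2, WtX, WtW, LH, 0.5, 2, buf)
println("slices: ", bytes_copy, " bytes, in place: ", bytes_inplace, " bytes")
println("same result: ", H1 ≈ H2)
```

For scale: with `n = 307*307`, one row of `Float64` takes about 0.72 MiB, and the serial test performs `r*K = 120` row updates. The reported 949.35 MiB therefore works out to roughly 7.9 MiB per update, which is in line with about eleven row-sized temporaries being created each time.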