I’m new to parallel computing (and to Julia, for that matter). I have a very simple program that works perfectly when run serially, but not when distributed across several workers. That is, the program runs without raising any errors, but the result is wrong (more precisely, the iteration blows up instead of converging). The mathematical problem is simple: I run a Monte Carlo simulation, on a grid, of a function that depends on a set of weights (the array ν). Depending on the result of this simulation, I update the weights, and repeat until the weights converge. Here’s the code:
using Distributed
@everywhere begin
    using LinearAlgebra
    using SharedArrays
    const N = 5
    const β = 1.0
    const M = 20
    const Δ = N/(2M)
    ν = SharedArray{Float64,1}(M)
    W(r) = - (r >= 0.5) * r + (r <= -0.5) * r + (r >= -0.5)*(r <= 0.5)*(-r^2 - 1/4)
    U(r) = Δ^2 * dot(ν, map(t -> W(r/Δ - (t - 1/2)) + W(r/Δ + (t - 1/2)), 1:M))
end
function ρ_ν(r, niter = 15_000)
    res = @sync @distributed (+) for _ = 1:niter
        sample = N * (rand(N)-1/2*ones(N)); sample[1] = r
        ρ = prod(map(t->exp(β*U(t)),sample))
        for k = 1:N-1
            for q = k+1:N
                ρ *= exp(β * abs(sample[k] - sample[q]))
            end
        end
        ρ
    end
    N^(N-1) * res/niter
end
function PGD(ε)
    err = ε + 1
    iter = 1
    λ(t) = 10/t^(1/4)
    while err > ε
        err = 0
        ρ = map(t->ρ_ν((t-1/2)*Δ),1:M)
        ρ = ρ./(2*Δ*sum(ρ))
        for i = 1:M
            η = Δ^3 * sum(map(t->W((t - 1/2) - (i - 1/2)) + W(-(t-1/2)-(i-1/2)), 1:M).*(ones(M).- N*ρ))
            print("\n")
            print("eta =", η)
            print("\n")
            ν[i] += λ(iter) * η
            err += abs(η)
        end
        iter += 1
        print(ν, '\n')
    end
end
When I run the optimizing function PGD in Juno (without adding any workers), everything works fine. When I launch the program in a parallel setting, it doesn’t work (even with just one worker, i.e. “julia -p 1 …”). In the parallel setting, η is always positive, which makes the weights blow up…
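A minimal sketch of a check that might help narrow this down. It rests on an unconfirmed assumption: that the array ν each worker reads inside U may not be the same array that PGD updates on the master process, since ν is created inside the @everywhere block. The array length 4, the value 42.0 and the call to addprocs(1) (standing in for “julia -p 1”) are illustrative choices, not part of the program above.

```julia
using Distributed
addprocs(1)                      # assumption: one extra worker, mirroring `julia -p 1`

@everywhere using SharedArrays

# Same pattern as in the program above: the constructor runs on every process.
@everywhere ν = SharedArray{Float64,1}(4)

# Write to the array from the master process only, as PGD does with `ν[i] += ...`.
ν[1] = 42.0

# Each process evaluates this in its own Main and prints the value it sees.
# Lines coming from a worker are prefixed with "From worker <id>:".
@everywhere println("process ", myid(), " sees ν[1] = ", ν[1])
```

If the worker reports a different value from the master, the processes hold distinct arrays, and the weight updates made in PGD would never reach the Monte Carlo loop. If all processes report the same value, this assumption can be ruled out.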