Hi everyone,
This is an edit of a previous question, as I sense that I wasn’t clear enough. I’m taking examples from the distributed computing documentation for version 1.7.3: "Multi-processing and Distributed Computing", The Julia Language.
As someone who learned distributed programming with MPI, I learned in particular that it is crucial to minimize communication between processes. That’s why I’m confused by Julia’s one-sided, high-level philosophy.
In the following example, the tutorial shows how to create a random matrix on process 2 and how to add 1 to every entry.
```
$ julia -p 2

julia> r = remotecall(rand, 2, 2, 2)
Future(2, 1, 4, nothing)

julia> s = @spawnat 2 1 .+ fetch(r)
Future(2, 1, 5, nothing)

julia> fetch(s)
2×2 Array{Float64,2}:
 1.18526  1.50912
 1.16296  1.60607
```
I’m incredibly confused by this example. To me, it seems like:
- Process 2 has no identifier referring to the matrix that was created on it. The only reference to this matrix is the Future on the master process.
- Process 2 sends that matrix to the master process
- Process 1 adds one to every entry of that matrix
- The result lives on process 1; it doesn’t exist on process 2
To me, it would have been more logical to:
- Declare a matrix on process 2
- Tell process 2 to add one to every entry of that matrix
Can you tell me exactly what is going on, in terms of message passing, in that example? If it does what I think it does, then is Julia suited to writing communication-optimized programs? If it does what I think it should do, then I would desperately need some explanation of how the master process works. Is it a virtual process, consisting of a high-level layer of references, that avoids useless communication so that we don’t have to worry about it?
Another example, of my own making. Consider the following code:
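To make the question concrete, here is a minimal sketch of how I think one could probe where the addition runs: the remote expression returns `myid()` alongside the result. The `addprocs` guard and the names `pid` and `B` are my own additions, not part of the documentation's example.

```julia
using Distributed

# Assumption: mimic `julia -p 2` when the session was started without workers.
nworkers() < 2 && addprocs(2)

# Same first step as the documentation: a 2x2 random matrix created on process 2.
r = remotecall(rand, 2, 2, 2)

# Return the id of the process that evaluates the expression, together with the result.
s = @spawnat(2, (myid(), 1 .+ fetch(r)))
pid, B = fetch(s)

println("the addition was evaluated on process ", pid)
```

If the printed id is 2, the addition happened on the worker, and only the final `fetch(s)` moved data to the master. This still does not tell me which messages are exchanged underneath, which is what I would like to understand.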
```
$ julia -p 2
A = randn(16,16)
x = randn(16)
y = randn(16)
horizontalSlices = [1:8, 9:16]
Ax = fetch.( [@spawnat i+1 A[horizontalSlices[i], :] * x for i = 1:2 ] )
Ay = fetch.( [@spawnat i+1 A[horizontalSlices[i], :] * y for i = 1:2 ] )
```
To me, it seems that this code will scatter A to the workers twice (once for Ax and once for Ay), unless the master process has such a high-level layer.
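For comparison, this is roughly the pattern I would expect to write coming from MPI: ship each slice to its worker once, keep only a Future to it on the master, and reuse that Future for both products. It is only a sketch under my own assumptions. The names `slices`, `Ax_parts` and `Ay_parts` are hypothetical, and I am not claiming this is the recommended Julia idiom.

```julia
using Distributed

# Assumption: mimic `julia -p 2` when the session was started without workers.
nworkers() < 2 && addprocs(2)

A = randn(16, 16)
x = randn(16)
y = randn(16)
horizontalSlices = [1:8, 9:16]

# Slice on the master, then send only that slice to worker i+1.
# Each Future refers to the copy stored on the worker.
slices = [remotecall(identity, i + 1, A[horizontalSlices[i], :]) for i = 1:2]

# Reuse the stored slices: only the Future, the vector and the result travel.
Ax_parts = [remotecall_fetch((s, v) -> fetch(s) * v, i + 1, slices[i], x) for i = 1:2]
Ay_parts = [remotecall_fetch((s, v) -> fetch(s) * v, i + 1, slices[i], y) for i = 1:2]

Ax = vcat(Ax_parts...)
Ay = vcat(Ay_parts...)
```

Here `fetch(s)` is called on the worker that owns the Future, so my understanding is that the slice itself is not sent again. Is this kind of explicit bookkeeping needed, or does my original code already avoid the second transfer?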
Thank you in advance
Best regards