Hi,
I read the section Data Movement and Global variables to understand when data is being transferred between processes. Still, I need some help here.
I try to define a global data on the main process, send this data to remote processes via remotecall_retch
, update data on the main process, and finally use remotecall_fetch
to see if data is resent.
The example code and output on my machine:
Example 1: Reassign the entire array object causes data movement
using Distributed
A = [1,2,3] # [1, 2, 3]
remotecall_fetch(()->sum(A), 2) # 6
A = [4,5,6] # [4, 5, 6]
remotecall_fetch(()->sum(A), 2) # 15
This is the same as expected: data is captured and sent to worker 2.
Example 2: Resent also happens if the value of array rather than its pointer is changed:
using Distributed
A = [1,2,3] # [1, 2, 3]
remotecall_fetch(()->sum(A), 2) # 6
println(pointer_from_objref(A)) # Ptr{Nothing} @0x00007f9ef00de380
@. A += 5 # [6, 7, 8]
println(pointer_from_objref(A)) # Ptr{Nothing} @0x00007f9ef00de380
remotecall_fetch(()->sum(A), 2) # 21
According to the doc “Globals are re-sent to a destination worker only in the context of a remote call, and then only if its value has changed. Also, the cluster does not synchronize global bindings across nodes.” still makes sense if I interpret “value has changed” as “any changes including the value of array”.
Example 3: Change the value of a member of a user-defined object does not cause the resent of data:
using Distributed
@everywhere mutable struct X
a
end
x = X([1,2,3]) # X([1, 2, 3])
remotecall_fetch(()->sum(x.a), 2) # 6
@. x.a += 5 # [6, 7, 8]
remotecall_fetch(()->sum(x.a), 2) # 6
sum(x.a) # 21
A clear statement of my problem : what does it mean by “value has changed” in the doc that causes data resent?
My julia version is 1.1.1
If this is a duplication please let me know too.