What causes data movement when using remotecall_fetch?

Hi,

I read the section Data Movement and Global variables to understand when data is being transferred between processes. Still, I need some help here.

I try to define a global data on the main process, send this data to remote processes via remotecall_retch, update data on the main process, and finally use remotecall_fetch to see if data is resent.

The example code and output on my machine:

Example 1: Reassign the entire array object causes data movement

using Distributed
A = [1,2,3]                     # [1, 2, 3]
remotecall_fetch(()->sum(A), 2) # 6
A = [4,5,6]                     # [4, 5, 6]
remotecall_fetch(()->sum(A), 2) # 15

This is the same as expected: data is captured and sent to worker 2.

Example 2: Resent also happens if the value of array rather than its pointer is changed:

using Distributed
A = [1,2,3]                     # [1, 2, 3]
remotecall_fetch(()->sum(A), 2) # 6
println(pointer_from_objref(A)) # Ptr{Nothing} @0x00007f9ef00de380
@. A += 5                       # [6, 7, 8]
println(pointer_from_objref(A)) # Ptr{Nothing} @0x00007f9ef00de380
remotecall_fetch(()->sum(A), 2) # 21

According to the doc “Globals are re-sent to a destination worker only in the context of a remote call, and then only if its value has changed. Also, the cluster does not synchronize global bindings across nodes.” still makes sense if I interpret “value has changed” as “any changes including the value of array”.

Example 3: Change the value of a member of a user-defined object does not cause the resent of data:

using Distributed
@everywhere mutable struct X
    a
end
x = X([1,2,3])                    # X([1, 2, 3])
remotecall_fetch(()->sum(x.a), 2) # 6
@. x.a += 5                       # [6, 7, 8]
remotecall_fetch(()->sum(x.a), 2) # 6
sum(x.a)                          # 21

A clear statement of my problem : what does it mean by “value has changed” in the doc that causes data resent?

My julia version is 1.1.1

If this is a duplication please let me know too.

1 Like

The answer is that Julia uses hash to determine whether the value has changed, and the default hash doesn’t recurse into your object’s fields, it just uses objectid(x). I think there’s some discussion of this in various threads, but I agree, its mildly surprising.

In any case, you can fix it by defining your own hash which does involve the contents, e.g. something like:

Base.hash(x::X, h::UInt) = hash(X, hash(x.a, h))

After that, your last example works as expected.

4 Likes

Thank you. This is exactly what I am looking for.