Julia has served me well for parallel computing, but I recently ran into an issue. Let’s say I have an array `a` and I would like a `@view` as a shortcut to modify or refer to part of `a`, so I wrap both in a data structure, e.g.
@everywhere struct MyArray
    a   # the full array
    b   # a view into the first two elements of `a`
    MyArray(a::Vector) = new(a, @view a[1:2])
end
Now I would like to create such an object, run a computation with it on a worker, and compare the result with the same computation done locally:
m = MyArray([1., 2., 3., 4.])
future = @spawnat 2 (m.b .= 0; sum(m.a))
remoteResult = fetch(future)
localResult = (m.b .= 0; sum(m.a))
But when I run this code I get `remoteResult = 10.0` and `localResult = 7.0`. It seems that the relation between a `SubArray` and its parent `Array` is lost after they are transferred to a worker.
I would like to know whether (1) this is an inconsistency in Julia's `Distributed`, or (2) I am using the parallel features improperly. If (2), what should I do instead?
I don’t have the answer, but conceptually, modifying a portion of an array on a worker and expecting the local array to reflect what the remote machine did doesn’t make sense to me.
What you should do instead is split up your work, return the sub-result from each worker, and then reduce the results together at the end. If the result is a sum, compute the sum of the first half on one machine and the sum of the second half on the other; when the results come back, add them together.
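The split-and-reduce idea above could look roughly like this. It is only a sketch: the two-way split, the use of `pmap`, and the variable names are assumptions for illustration, not something taken from the original code.

```julia
using Distributed

# Assumption: two local worker processes are enough for the illustration.
nprocs() == 1 && addprocs(2)

a = [1.0, 2.0, 3.0, 4.0]

# Split the data into two halves; each slice is an independent copy.
half = length(a) ÷ 2
chunks = [a[1:half], a[half+1:end]]

# Each worker sums its own chunk and sends back only the partial result.
partials = pmap(sum, chunks)

# Reduce the partial results on the calling process.
total = sum(partials)
```

Nothing is mutated remotely here, so there is no question of which copy of the array a view refers to.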
It appears that views do not get propagated to workers.
julia> m = MyArray([1., 2., 3., 4.])
MyArray([1.0, 2.0, 3.0, 4.0], [1.0, 2.0])
julia> @fetchfrom 2 m
MyArray([1.0, 2.0, 3.0, 4.0], [0.0, 0.0])
This may be because the array `m.a` gets copied when it is passed to a worker, and the view on the worker does not reference that copy.
julia> m.a === @fetchfrom 2 m.a
julia> @fetchfrom 2 parent(m.b)
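The copy hypothesis can also be probed without any workers by round-tripping the object through the `Serialization` stdlib. This is a hedged sketch: it assumes that `Distributed` sends objects through the same serializer, and `ViewHolder` is a hypothetical stand-in for the struct in the question.

```julia
using Serialization

# Hypothetical stand-in for the struct in the question (untyped fields).
struct ViewHolder
    a
    b
    ViewHolder(a::Vector) = new(a, @view a[1:2])
end

m = ViewHolder([1.0, 2.0, 3.0, 4.0])

# Round-trip through the serializer, roughly what a transfer to a worker does.
io = IOBuffer()
serialize(io, m)
seekstart(io)
m2 = deserialize(io)

# Does the view still alias the array stored next to it?
local_linked = parent(m.b) === m.a     # original object
copy_linked  = parent(m2.b) === m2.a   # deserialized object

# Write through the deserialized view and see whether `m2.a` notices.
m2.b .= 0
println((local_linked, copy_linked, sum(m2.a)))
```

If `copy_linked` comes out `false`, the view was given its own parent during serialization, which would explain why zeroing `m.b` on the worker leaves `sum(m.a)` untouched there.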
One way around this would be to create the view on the worker, from the copy of the array that lives there.
julia> remoteResult = @fetchfrom 2 (p = MyArray(m.a); p.b .= 0; sum(p.a))
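The same workaround can be written as a function that receives the plain array and builds the view only after the data has arrived on the worker. A minimal sketch, assuming a single local worker; `zero_head_and_sum` is a hypothetical helper name, not part of `Distributed`.

```julia
using Distributed

# Assumption: one local worker process is enough for the illustration.
nprocs() == 1 && addprocs(1)

# Hypothetical helper, defined on every process so the worker can call it.
@everywhere function zero_head_and_sum(a::Vector{Float64})
    b = @view a[1:2]   # view is created on the process holding this copy of `a`
    b .= 0             # so writing through it modifies that same copy
    return sum(a)
end

a = [1.0, 2.0, 3.0, 4.0]
w = first(workers())

# Only the plain array is shipped; the view never crosses the process boundary.
remote_result = remotecall_fetch(zero_head_and_sum, w, a)

# Same computation locally, on a copy so that `a` itself stays intact.
local_result = zero_head_and_sum(copy(a))
```

With this arrangement the remote and local results should agree, and the array on the calling process is left unmodified by the remote call.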
I certainly do not expect a modification made remotely to affect a local array. What is surprising here is that when I copy both an array and its view to a remote machine, the relation between them is broken.
This workaround seems perfect. The rule of thumb here is going to be: create and use a view on the same machine.
Although I think the Julia stdlib should handle this in a more consistent way…