Hello, I have a question about distributed programming with callable mutable types and how workers handle them inside a @distributed loop. In particular, if I have an instance of a callable mutable type (a mutable “functor”, as the docs call it) and a function that passes it into a distributed loop, each worker receives a copy of the instance, along with the method tied to it, when the function is called. If I then call the function again with the same instance (possibly after mutating the values it contains), will a new copy of the instance and a new function definition be sent to each worker, or will the ones sent in the previous call be updated? In other words, will my workers end up with two copies of the object or just one? The situation I have in mind is the following:
using Distributed
addprocs(9)

# Define the callable mutable type on every process
@everywhere mutable struct myStruct
    x::Array{Float64}
end
@everywhere (S::myStruct)(i::Int) = S.x[i] + 1

function addone_distributed(s::myStruct)
    @distributed for i = 1:9
        ans = s(i)
    end
end

sinst = myStruct(zeros(3,3))
addone_distributed(sinst)
sinst.x = ones(3,3)  # mutate the instance, then pass it in again
addone_distributed(sinst)
I’m a little unsure how to check which variables exist in the namespace of each worker process, so any help is much appreciated.
@distributed does not create global bindings for you automatically, so every time @distributed is called, it serializes a fresh copy of s to each worker, and that copy is what will be operated on.
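One way to see this for yourself is a minimal sketch like the one below. It assumes two local workers and changes the loop body to mutate s, which the original example does not do; bump_distributed and has_global are hypothetical helper names. If each worker only ever gets a serialized copy, the mutation should not show up on the master, and no global named sinst should appear on the workers.

```julia
using Distributed
addprocs(2)  # assumption: two local workers are enough for the demo

@everywhere mutable struct myStruct
    x::Matrix{Float64}
end

# Hypothetical helper: does a global with this name exist in Main on this process?
@everywhere has_global(name::Symbol) = isdefined(Main, name)

function bump_distributed(s::myStruct)
    # `s` is captured by the loop body and serialized to the workers;
    # the increment below touches the workers' copies only.
    @sync @distributed for i in 1:9
        s.x[i] += 1
    end
end

sinst = myStruct(zeros(3, 3))
bump_distributed(sinst)

# The master's instance is untouched by the workers' mutations.
master_unchanged = all(iszero, sinst.x)

# No global binding called `sinst` was created on any worker.
worker_has_binding = any(remotecall_fetch(has_global, w, :sinst) for w in workers())

# To list what is defined in a worker's Main namespace:
worker_names = remotecall_fetch(names, first(workers()), Main)
```

The last line is also a rough answer to the question about inspecting worker namespaces: names(Main) evaluated on a worker returns the globals defined there.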
If you want a single myStruct object per-worker, you’ll need to do something like @everywhere const s = myStruct(zeros(3,3)), and then do something like global s; ans = s(i) in your @distributed loop.
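A minimal sketch of this per-worker pattern, under some assumptions: two local workers, a global named WORKER_S instead of s, and two hypothetical helpers (bump! and local_total). It is a variation on the suggestion above: the loop body goes through a function defined with @everywhere rather than naming the global directly, so the loop closure has nothing from the master to capture and serialize.

```julia
using Distributed
addprocs(2)  # assumption: two local workers

@everywhere mutable struct myStruct
    x::Matrix{Float64}
end

# One long-lived instance per process, constructed locally on each process.
@everywhere const WORKER_S = myStruct(zeros(3, 3))

# Hypothetical helpers; each one operates on the instance local to the
# process it runs on.
@everywhere bump!(i::Int) = (WORKER_S.x[i] += 1; nothing)
@everywhere local_total() = sum(WORKER_S.x)

function bump_all()
    # Only the index i and a reference to bump! travel to the workers.
    @sync @distributed for i in 1:9
        bump!(i)
    end
end

bump_all()
bump_all()  # second call reuses the same per-worker instances

# Each of the 9 indices was incremented twice, on whichever worker handled it.
total = sum(remotecall_fetch(local_total, w) for w in workers())
master_total = local_total()  # the master's own instance was never touched
```

Here total should come out to 18.0 and master_total to 0.0, which shows that state persists on the workers across calls instead of being re-sent each time.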
Alternatively, if you don’t want to deal with these annoying remote global-bindings that Distributed encourages you to use, you can use Dagger and do s = Dagger.@shard myStruct(zeros(3,3)) to create a myStruct everywhere, and then operate on it asynchronously. We don’t have an equivalent to @distributed, but you can quite easily write your own.
Thanks for the response, that definitely gives me some insight into how @distributed works on the backend. As a follow-up question, would this also be the case for other variables defined within the loop? That is, if an array is defined inside the loop, would multiple calls of the addone_distributed function result in multiple arrays getting serialized and pushed to the workers? Could I expect the garbage collector to clean up these objects on the workers after the call is finished?
Thanks for the Dagger.jl suggestion as well. My problems all arise in a program that runs computations on very high-dimensional matrices on an HPC system, and it seems like we’re running into a number of issues with global variables and a lack of garbage collection. It looks like tasks like mine are the kind Dagger.jl had in mind when it was designed.
If the arrays are defined within the @distributed loop, they will be allocated directly on the workers (and thus not serialized to the workers), so each call creates new arrays that the GC (garbage collector) will later clean up, assuming you do not keep a reference to those arrays somewhere else.
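As a small illustration, here is a sketch with an array created inside the loop body. The function name sum_of_temporaries and the sizes are made up for the example. Only the scalar partial sums travel back to the master; the temporary arrays live and die on the workers.

```julia
using Distributed
addprocs(2)  # assumption: two local workers

function sum_of_temporaries(n::Int)
    # With a reducer, @distributed blocks until every iteration is done
    # and returns the reduced value.
    @distributed (+) for i in 1:4
        tmp = fill(Float64(i), n)  # allocated on the worker running this iteration
        sum(tmp)                   # only this scalar is sent back
    end
end

result = sum_of_temporaries(10)  # 10 * (1 + 2 + 3 + 4) = 100.0

# Nothing references the temporaries once the loop has finished, so they are
# eligible for collection. A collection can also be requested on every process:
@everywhere GC.gc()
```

Calling GC.gc() explicitly is normally unnecessary; it is shown here only as a way to request collection on the workers if memory pressure is a concern.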
Feel free to reach out directly or start another Discourse thread to discuss how Dagger might be able to help you solve this problem - probably Dagger’s DArray would be what you want to use, but we’ll have to see how it behaves with higher-dimensional arrays.