Remote SharedArray problems

parallel
cluster

#1

I am trying to run this example but I get the following:

julia> remotes = addprocs([("user@remote_host", 4)])
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> S = SharedArray{Int}((3,4), init = S -> S[localindexes(S)] = myid(), pids=remotes)
3×4 SharedArray{Int64,2}:
 2  3  4  5
 2  3  4  5
 2  3  4  5

julia> procs(S)
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> r = @spawnat remotes[1] S*eye(4)
ERROR: BoundsError: attempt to access 0×0 Array{Int64,2} at index [1]
Stacktrace:
 [1] hash(::SharedArray{Int64,2}, ::UInt64) at ./abstractarray.jl:1950
 [2] hash(::SharedArray{Int64,2}) at ./hashing.jl:5
 [3] serialize_global_from_main(::Base.Distributed.ClusterSerializer{TCPSocket}, ::Symbol) at ./distributed/clusterserialize.jl:145
 [4] foreach(::Base.Distributed.##4#6{Base.Distributed.ClusterSerializer{TCPSocket}}, ::Array{Symbol,1}) at ./abstractarray.jl:1731
 [5] serialize(::Base.Distributed.ClusterSerializer{TCPSocket}, ::TypeName) at ./distributed/clusterserialize.jl:83
 [6] serialize_type_data(::Base.Distributed.ClusterSerializer{TCPSocket}, ::DataType) at ./serialize.jl:511
 [7] serialize_type(::Base.Distributed.ClusterSerializer{TCPSocket}, ::DataType, ::Bool) at ./serialize.jl:554
 [8] serialize_any(::Base.Distributed.ClusterSerializer{TCPSocket}, ::Any) at ./serialize.jl:615
 [9] serialize_msg(::Base.Distributed.ClusterSerializer{TCPSocket}, ::Base.Distributed.CallMsg{:call}) at ./distributed/messages.jl:89
 [10] send_msg_(::Base.Distributed.Worker, ::Base.Distributed.MsgHeader, ::Base.Distributed.CallMsg{:call}, ::Bool) at ./distributed/messages.jl:181
 [11] #remotecall#138 at ./distributed/remotecall.jl:325 [inlined]
 [12] remotecall(::Function, ::Base.Distributed.Worker) at ./distributed/remotecall.jl:324
 [13] #remotecall#139(::Array{Any,1}, ::Function, ::Function, ::Int64) at ./distributed/remotecall.jl:336
 [14] spawnat(::Int64, ::Function) at ./distributed/macros.jl:15

What am I doing wrong here?


#2

A SharedArray is a shared-memory array, and is typically used when the workers are on the same machine and have access to the same shared memory.
If you’re trying this across machines, they almost certainly don’t share memory, so it’s not going to work.

https://docs.julialang.org/en/stable/manual/parallel-computing/#man-shared-arrays-1

Depending on your use case, you might look at Distributed Arrays: https://github.com/JuliaParallel/DistributedArrays.jl
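For example, a hypothetical sketch of the DistributedArrays.jl alternative (untested here; the host spec is the same placeholder as in the original post, and this assumes the package is installed on all hosts):

```julia
# Hypothetical sketch: distribute an array across remote workers instead of
# sharing memory between machines. Assumes DistributedArrays.jl is installed
# on every host (Julia 0.6-era API, where addprocs lives in Base).
remotes = addprocs([("user@remote_host", 4)])   # placeholder host as above
@everywhere using DistributedArrays

d = dzeros((3, 4), remotes)   # chunks of d live on the remote workers
total = sum(d)                # reductions dispatch to the owning processes
```

Unlike a SharedArray, each worker owns only its local chunk, so the workers do not need to share memory at all.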


#3

Well sure, but I am not trying to have a SharedArray shared between different machines. I am trying to have a SharedArray backed by shared memory on a single machine; that machine just happens to be remote. Reading the example in the link I provided, it seems this should be possible. In particular:

“I take it your suggestion means that the SharedArray code is designed to be invoked from the master process…”
“The previous invocation, while a bit inefficient should have worked too.”

I interpret this to mean that it should be perfectly OK to construct a SharedArray with a call from the local (master) process, where the array will live on a remote machine, as long as all the pids in the call are on that same remote machine. I am basically just running the commands provided by Amit Murthy, who is one of the main contributors to DistributedArrays.jl and to parallel Julia in general.


#4

Ah, I can see what you’re trying to do now.

I can confirm that for me, this works on v0.5.1, but not on v0.6.0. Seems like a regression.

The error message suggests that serialize is trying to hash the shared array locally; it is the same error you get when accessing S from the master process, say by executing S[1].
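One possible workaround (an untested sketch): pass S explicitly as an argument instead of referencing it as a global, which may sidestep the serialize_global_from_main path visible in the stack trace:

```julia
# Untested sketch: passing S as an explicit remotecall argument avoids
# capturing the global S from Main, which is the code path that fails
# in the stack trace above (serialize_global_from_main -> hash).
r = remotecall(A -> A * eye(4), remotes[1], S)
fetch(r)
```

No guarantee this works around the regression, but it exercises a different serialization path than the closure over the global.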

Maybe worth raising an issue?


#5

Yes, seems like it is actually a bug then. Thanks for confirming!