addprocs() to different architectures


#1

Hi,

I’m experimenting with remote calls to machines with different architectures (Beaglebone Black and Raspberry Pi). I have successfully done a remotecall_fetch() to a UDOO x86 running Mint Linux from my Dell Optiplex 755 desktop. The UDOO and the Dell both have x86 architectures. I have passwordless ssh set up on all the machines.

Here are the results of a remote call to the UDOO x86:

julia> addprocs(["julia-user@NODE-UDOOX86"],dir="/home/julia-user/julia-0.6.0/bin/")
1-element Array{Int64,1}:
2

julia> remotecall_fetch(rand,2,20)
20-element Array{Float64,1}:
0.483415
0.0938163
0.287159
0.574149
0.0432192
0.842411
0.872008
0.413705
0.0872259
0.393199
0.655173
0.800151
0.842092
0.0810006
0.753906
0.234955
0.738402
0.316464
0.856606
0.854614

I have Julia 0.6.0 running on a Beaglebone Black, but when I try an addprocs() to reach the Beaglebone remotely, here’s what I get:

julia> addprocs(["julia-user@NODE-BBB"],dir="/home/julia-user/julia-0.6.0/bin/")

BoundsError(Any[Symbol, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Int128, UInt128, Float16, Float32, Float64, Char, DataType, Union, UnionAll, TypeName, Tuple, Array, Expr, LineNumberNode, LabelNode, GotoNode, QuoteNode, CodeInfo, TypeVar, Core.Box, Core.MethodInstance, Module, Task, String, SimpleVector, Method, GlobalRef, SlotNumber, TypedSlot, NewvarNode, SSAValue, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, (), Bool, Any, Union{}, Core.TypeofBottom, Type, svec(), Tuple{}, false, true, nothing, :Any, :Array, :TypeVar, :Box, :Tuple, :Ptr, :return, :call, :(::), :Function, :(=), :(==), :(===), :gotoifnot, :A, :B, :C, :M, :N, :T, :S, :X, :Y, :a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o, :p, :q, :r, :s, :t, :u, :v, :w, :x, :y, :z, :add_int, :sub_int, :mul_int, :add_float, :sub_float, :new, :mul_float, :bitcast, :start, :done, :next, :indexed_next, :getfield, :meta, :eq_int, :slt_int, :sle_int, :ne_int, :push_loc, :pop_loc, :pop, :arrayset, :arrayref, :apply_type, :inbounds, :getindex, :setindex!, :Core, :!, :+, :Base, :static_parameter, :convert, :colon, Symbol("#self#"), Symbol("#temp#"), :tuple, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], (0,))CapturedException(BoundsError(Any[Symbol, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Int128, UInt128, Float16, Float32, Float64, Char, DataType, Union, UnionAll, TypeName, Tuple, Array, Expr, LineNumberNode, LabelNode, GotoNode, QuoteNode, CodeInfo, TypeVar, Core.Box, Core.MethodInstance, Module, Task, String, SimpleVector, Method, GlobalRef, SlotNumber, TypedSlot, NewvarNode, SSAValue, 
Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, (), Bool, Any, Union{}, Core.TypeofBottom, Type, svec(), Tuple{}, false, true, nothing, :Any, :Array, :TypeVar, :Box, :Tuple, :Ptr, :return, :call, :(::), :Function, :(=), :(==), :(===), :gotoifnot, :A, :B, :C, :M, :N, :T, :S, :X, :Y, :a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o, :p, :q, :r, :s, :t, :u, :v, :w, :x, :y, :z, :add_int, :sub_int, :mul_int, :add_float, :sub_float, :new, :mul_float, :bitcast, :start, :done, :next, :indexed_next, :getfield, :meta, :eq_int, :slt_int, :sle_int, :ne_int, :push_loc, :pop_loc, :pop, :arrayset, :arrayref, :apply_type, :inbounds, :getindex, :setindex!, :Core, :!, :+, :Base, :static_parameter, :convert, :colon, Symbol("#self#"), Symbol("#temp#"), :tuple, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], (0,)), Any[(handle_deserialize(::Base.Distributed.ClusterSerializer{TCPSocket}, ::Int32) at serialize.jl:659, 1), (deserialize_msg(::Base.Distributed.ClusterSerializer{TCPSocket}) at messages.jl:98, 1), (message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at process_messages.jl:148, 1), (process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at process_messages.jl:118, 1), ((::Base.Distributed.##99#100{TCPSocket,TCPSocket,Bool})() at event.jl:73, 1)])
Process(1) - Unknown remote, closing connection.
Worker 3 terminated.
ERROR (unhandled task failure): Version read failed. Connection closed by peer.
Stacktrace:
[1] process_hdr(::TCPSocket, ::Bool) at ./distributed/process_messages.jl:257
[2] message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:143
[3] process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:118
[4] (::Base.Distributed.##99#100{TCPSocket,TCPSocket,Bool})() at ./event.jl:73
Master process (id 1) could not connect within 60.0 seconds.
exiting.

The directory structures are identical on both the dev boards (BBB and Udoo x86) and I can passwordless ssh into both boards.

Does anyone have any ideas? I thought you could do this in Julia regardless of the architecture.

Frank

added: 8/18/17 14:42

I just finished trying to do the addprocs() to a Raspberry Pi 2 and got the same error as I did for the Beaglebone Black.

julia> addprocs(["julia-user@NODE-RPI2"],dir="/home/julia-user/julia-0.6.0/bin/")
BoundsError(Any[Symbol, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Int128, UInt128, Float16, Float32, Float64, Char, DataType, Union, UnionAll, TypeName, Tuple, Array, Expr, LineNumberNode, LabelNode, GotoNode, QuoteNode, CodeInfo, TypeVar, Core.Box, Core.MethodInstance, Module, Task, String, SimpleVector, Method, GlobalRef, SlotNumber, TypedSlot, NewvarNode, SSAValue, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, (), Bool, Any, Union{}, Core.TypeofBottom, Type, svec(), Tuple{}, false, true, nothing, :Any, :Array, :TypeVar, :Box, :Tuple, :Ptr, :return, :call, :(::), :Function, :(=), :(==), :(===), :gotoifnot, :A, :B, :C, :M, :N, :T, :S, :X, :Y, :a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o, :p, :q, :r, :s, :t, :u, :v, :w, :x, :y, :z, :add_int, :sub_int, :mul_int, :add_float, :sub_float, :new, :mul_float, :bitcast, :start, :done, :next, :indexed_next, :getfield, :meta, :eq_int, :slt_int, :sle_int, :ne_int, :push_loc, :pop_loc, :pop, :arrayset, :arrayref, :apply_type, :inbounds, :getindex, :setindex!, :Core, :!, :+, :Base, :static_parameter, :convert, :colon, Symbol("#self#"), Symbol("#temp#"), :tuple, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], (0,))CapturedException(BoundsError(Any[Symbol, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Int128, UInt128, Float16, Float32, Float64, Char, DataType, Union, UnionAll, TypeName, Tuple, Array, Expr, LineNumberNode, LabelNode, GotoNode, QuoteNode, CodeInfo, TypeVar, Core.Box, Core.MethodInstance, Module, Task, String, SimpleVector, Method, GlobalRef, SlotNumber, TypedSlot, NewvarNode, SSAValue, 
Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, Symbol, (), Bool, Any, Union{}, Core.TypeofBottom, Type, svec(), Tuple{}, false, true, nothing, :Any, :Array, :TypeVar, :Box, :Tuple, :Ptr, :return, :call, :(::), :Function, :(=), :(==), :(===), :gotoifnot, :A, :B, :C, :M, :N, :T, :S, :X, :Y, :a, :b, :c, :d, :e, :f, :g, :h, :i, :j, :k, :l, :m, :n, :o, :p, :q, :r, :s, :t, :u, :v, :w, :x, :y, :z, :add_int, :sub_int, :mul_int, :add_float, :sub_float, :new, :mul_float, :bitcast, :start, :done, :next, :indexed_next, :getfield, :meta, :eq_int, :slt_int, :sle_int, :ne_int, :push_loc, :pop_loc, :pop, :arrayset, :arrayref, :apply_type, :inbounds, :getindex, :setindex!, :Core, :!, :+, :Base, :static_parameter, :convert, :colon, Symbol("#self#"), Symbol("#temp#"), :tuple, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, :reserved, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], (0,)), Any[(handle_deserialize(::Base.Distributed.ClusterSerializer{TCPSocket}, ::Int32) at serialize.jl:659, 1), (deserialize_msg(::Base.Distributed.ClusterSerializer{TCPSocket}) at messages.jl:98, 1), (message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at process_messages.jl:148, 1), (process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at process_messages.jl:118, 1), ((::Base.Distributed.##99#100{TCPSocket,TCPSocket,Bool})() at event.jl:73, 1)])
Process(1) - Unknown remote, closing connection.
Worker 2 terminated.
ERROR (unhandled task failure): Version read failed. Connection closed by peer.
Stacktrace:
[1] process_hdr(::TCPSocket, ::Bool) at ./distributed/process_messages.jl:257
[2] message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:143
[3] process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:118
[4] (::Base.Distributed.##99#100{TCPSocket,TCPSocket,Bool})() at ./event.jl:73
Master process (id 1) could not connect within 60.0 seconds.
exiting.

Here’s some info. about the RPi2:

uname -a
Linux NODE-RPI2 4.9.35-v7+ #1014 SMP Fri Jun 30 14:47:43 BST 2017 armv7l GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 8.0 (jessie)
Release: 8.0
Codename: jessie

Info. for Beaglebone Black:

uname -a
Linux NODE-BBB 4.4.80-ti-r116 #1 SMP Wed Aug 9 15:33:47 UTC 2017 armv7l GNU/Linux

lsb_release -r
Release: 9.1 [debian]


#2

On a hunch, and seeing things like UInt64 and Float64 in the error message, I wondered if it had to do with 32-bit versus 64-bit installations. The ARM installations were 32-bit and the x86 installations were 64-bit. So I downloaded the 32-bit version of Julia and installed it on my Dell (it’s the master, so to speak). I started up the 32-bit version of Julia and here’s what I got!!

For the Raspberry Pi 2:

julia> addprocs(["julia-user@NODE-RPI2"],dir="/home/julia-user/julia-0.6.0/bin/")
1-element Array{Int32,1}:
2

julia> remotecall_fetch(rand, 2, 20)
20-element Array{Float64,1}:
0.438085
0.904493
0.973361
0.679264
0.738671
0.822397
0.175776
0.638466
0.109121
0.234268
0.194145
0.00515439
0.46107
0.0896276
0.367764
0.642768
0.88751
0.835329
0.980536
0.60624

And for the Beaglebone Black:

julia> addprocs(["julia-user@NODE-BBB"],dir="/home/julia-user/julia-0.6.0/bin/")

1-element Array{Int32,1}:
3

julia> remotecall_fetch(rand, 3, 20)
20-element Array{Float64,1}:
0.421401
0.697546
0.92379
0.901786
0.754601
0.237044
0.687448
0.223037
0.637141
0.300794
0.526395
0.582173
0.019924
0.584588
0.470795
0.024546
0.995812
0.0406491
0.928998
0.922926
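
For anyone hitting the same BoundsError: a quick check, as a minimal sketch, is to print the word size of the Julia build on each machine before calling addprocs(). This assumes, based only on the experiment above and not on anything I have confirmed in the Julia source, that the master and the workers need builds with the same word size. The variable name wordsize is just for illustration.

```julia
# Report the word size of the Julia build this is run in (32 or 64).
# Sys.WORD_SIZE and the Int alias are standard Julia; nothing here touches the network.
wordsize = Sys.WORD_SIZE
println("Julia word size: ", wordsize, "-bit")

# Int is Int32 on a 32-bit build and Int64 on a 64-bit build,
# which is why addprocs() returned Array{Int64,1} in one session and Array{Int32,1} in the other.
println("Int is an alias for ", Int)
```

Run it once in the Julia session on the master and once in the Julia installed on each board. If the numbers differ, that matches the failing setup above; if they agree, the word size is probably not the problem.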