I have a very weird bug that has just popped up. After adding my workers (through the SLURM workload manager; this part works fine), I can't seem to execute @everywhere using [packagename].
This never happened before. My code used to run fine just a month ago, and now it has suddenly stopped working. I don't think anything has changed in the server configuration or the Julia version, although I am not 100% sure about this.
Here are some code snippets.
Here are the workers:
julia> workers()
16-element Array{Int64,1}:
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Here are the errors:
julia> @everywhere using DataFrames
ERROR: On worker 2:
BoundsError: attempt to access 229-element Array{Any,1} at index [0]
handle_deserialize at ./serialize.jl:662
deserialize_msg at ./distributed/messages.jl:98
message_handler_loop at ./distributed/process_messages.jl:161
process_tcp_streams at ./distributed/process_messages.jl:118
#99 at ./event.jl:73
#remotecall_fetch#141(::Array{Any,1}, ::Function, ::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:354
remotecall_fetch(::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:346
#remotecall_fetch#144(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:367
remotecall_fetch(::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:367
(::##14#16)() at ./distributed/macros.jl:102
...and 15 more exception(s).
Stacktrace:
[1] sync_end() at ./task.jl:287
[2] macro expansion at ./distributed/macros.jl:112 [inlined]
[3] anonymous at ./<missing>:?
[4] macro expansion at ./REPL.jl:97 [inlined]
[5] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73
or
julia> @everywhere using Parameters
ERROR: On worker 2:
BoundsError: attempt to access 229-element Array{Any,1} at index [0]
handle_deserialize at ./serialize.jl:662
deserialize_msg at ./distributed/messages.jl:98
message_handler_loop at ./distributed/process_messages.jl:161
process_tcp_streams at ./distributed/process_messages.jl:118
#99 at ./event.jl:73
#remotecall_fetch#141(::Array{Any,1}, ::Function, ::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:354
remotecall_fetch(::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:346
#remotecall_fetch#144(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:367
remotecall_fetch(::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at ./distributed/remotecall.jl:367
(::##22#24)() at ./distributed/macros.jl:102
...and 15 more exception(s).
Stacktrace:
[1] sync_end() at ./task.jl:287
[2] macro expansion at ./distributed/macros.jl:112 [inlined]
[3] anonymous at ./<missing>:?
[4] macro expansion at ./REPL.jl:97 [inlined]
[5] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73
But basic statements seem to work fine:
julia> @everywhere a = b = 10
julia>
I've gone through the Julia source code to debug this, but I cannot figure out which 229-element Array it is failing on.
Working with Julia 0.6.2.
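Since I am not fully sure that the Julia version is the same on every node, a quick check along these lines might show whether the master and the workers agree. This is only a sketch for Julia 0.6, where @everywhere lives in Base. It assumes that simple expressions still get through to the workers, as the a = b = 10 statement above suggests.

```julia
# Julia 0.6: @everywhere is available from Base without any import
# (on Julia 1.x one would need `using Distributed` first).
# Each process, master included, prints the Julia version it is running.
@everywhere println(VERSION)
```

If any worker reported a version different from the master's, that would be the first thing I would look into, though I have not confirmed that this is the cause.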