Error using GLM and Distributed

Dear all,

I am trying to parallelise my code and have run into the following problem.

My serial (non-parallel) code can be represented as:

using DataFrames, GLM

function foo(df::DataFrame)
    ols = lm(@formula(A ~ 1.0), df)
    return coef(ols)
end

df = DataFrame(A = 1.0:50.0)
foo(df)

I parallelise by doing:

using Distributed
@everywhere function foo(df::DataFrame)
    ols = lm(@formula(A ~ 1.0), df)
    return coef(ols)
end

I get this error:

On worker 2:
LoadError: UndefVarError: @formula not defined
top-level scope
eval at ./boot.jl:331
#103 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356
in expression starting at /home/arnau/Dropbox/black_white_transitions/revision/New_Model/Tests.jl:40

...and 3 more exception(s).

sync_end(::Channel{Any}) at task.jl:314
macro expansion at task.jl:333 [inlined]
remotecall_eval(::Module, ::Array{Int64,1}, ::Expr) at macros.jl:218
top-level scope at macros.jl:202
include_string(::Function, ::Module, ::String, ::String) at loading.jl:1088

Any ideas? Thank you.

@everywhere using GLM

I’ve tried that. The problem is still there.

julia> using Distributed

julia> addprocs()
12-element Array{Int64,1}:
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15

julia> using DataFrames, GLM

julia> @everywhere function foo(df::DataFrame)
           ols = lm(@formula(A ~ 1.0), df)
           return coef(ols)
       end
ERROR: On worker 2:
LoadError: UndefVarError: @formula not defined
...

julia> @everywhere using DataFrames, GLM

julia> @everywhere function foo(df::DataFrame)
           ols = lm(@formula(A ~ 1.0), df)
           return coef(ols)
       end

julia>

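Working. The packages have to be loaded on every worker (after addprocs) before the function is defined there, because @everywhere evaluates the definition on each worker and the @formula macro must already exist on that worker when it is expanded.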
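For anyone landing here later, here is a minimal sketch of the whole sequence in one script, in the order that seems to matter: start the workers, load the packages everywhere, then define the function everywhere. The function foo is the one from the original post. The worker count of 2, the names dfs and results, and the pmap call are my own assumptions for illustration and are not part of the original question.

```julia
using Distributed

# Start the workers first, so the @everywhere lines below reach them.
# The count of 2 is an arbitrary choice for this sketch.
addprocs(2)

# Load the packages on every process BEFORE defining anything that uses
# their macros; @formula is expanded on each worker when foo is defined there.
@everywhere using DataFrames, GLM

# Same function as in the original post, now defined on all processes.
@everywhere function foo(df::DataFrame)
    ols = lm(@formula(A ~ 1.0), df)
    return coef(ols)
end

# Hypothetical usage: fit the intercept-only model on several data frames
# in parallel. Each data frame is sent to a worker, which returns coef(ols).
dfs = [DataFrame(A = 1.0:50.0), DataFrame(A = 1.0:10.0)]
results = pmap(foo, dfs)
```

If addprocs is called after the @everywhere using line, the newly added workers will not have the packages loaded, so the order above is probably the safest default.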
