I have a rather extensive and hard-to-use algorithm that relies on distributed computing. To make it more accessible, I want to wrap as much of it as possible in the functions of my package.
I manage to initialize my variables on a worker but cannot access them for subsequent processing (see the example below). I understand that x is not defined in the right scope, but is there a smart way to fix this?
using Distributed
addprocs(1)
@everywhere module myPackage
    using Distributed
    # function to create variable on worker
    function createData()
        @everywhere 2 begin
            x = rand(10)
        end
    end
    # function to execute on worker
    function processData(y)
        return x .* y
    end
    # function to be wrapped
    function runProcessData(y)
        return fetch(@spawnat 2 processData(y))
    end
    export createData, runProcessData
end
@everywhere using .myPackage
createData()
fetch(@spawnat 2 eval(:x))  # works: x exists in Main on worker 2
runProcessData(10)          # errors: x is not defined in myPackage on worker 2
The only workaround for me was to add x as an input to runProcessData, fetch it from the worker, and pass it again, which is extremely inefficient in my case.
The @everywhere macro evaluates its expression in Main, so the x assigned in createData ends up in the Main module of process 2, but processData looks for x in myPackage.
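One way to keep the data inside the package's own scope is to store it in a module-level container instead of a global in Main. The following is only a minimal sketch of that idea, not a drop-in fix: the names `data` and `initData!`, the worker-id argument `w`, and the keyword `n` are assumptions made for illustration and do not appear in the original code.

```julia
using Distributed
addprocs(1)

@everywhere module myPackage
    using Distributed

    # Module-level slot for the data; every process has its own, initially empty.
    const data = Ref{Vector{Float64}}()

    # Runs on the worker: fill the slot and return nothing, so no data is sent back.
    function initData!(n)
        data[] = rand(n)
        return nothing
    end

    # Runs on the worker: reads the slot from this module, not from Main.
    processData(y) = data[] .* y

    # Called on the master; `w` is the worker id (hypothetical parameter).
    createData(w; n = 10) = remotecall_wait(initData!, w, n)
    runProcessData(w, y) = remotecall_fetch(processData, w, y)

    export createData, runProcessData
end
@everywhere using .myPackage

w = first(workers())
createData(w)
result = runProcessData(w, 10)
```

Because `initData!` and `processData` are both defined in myPackage and both run on the worker, they see the same `data` slot, and only `y` and the result cross the process boundary.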
Thanks for the reply! I think I got that part, though. My train of thought was that if I could fetch x from the worker and pass it again when calling runProcessData (see below), there should also be a way to avoid this inefficient communication.
using Distributed
addprocs(1)
@everywhere module myPackage
    using Distributed
    # function to create variable on worker
    function createData()
        @everywhere 2 begin
            x = rand(10)
        end
    end
    # function to execute on worker
    function processData(x,y)
        return x .* y
    end
    # function to be wrapped
    function runProcessData(x,y)
        return fetch(@spawnat 2 processData(x,y))
    end
    export createData, runProcessData
end
@everywhere using .myPackage
createData()
runProcessData(fetch(@spawnat 2 eval(:x)),10)
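The round trip in the last line above (fetch x to the master, then send it back) can possibly be avoided by passing around a Future that points at the data rather than the data itself. This is a rough sketch under that assumption; `processRef`, the worker-id argument `w`, the keyword `n`, and the changed signatures of `createData` and `runProcessData` are hypothetical and not part of the original code.

```julia
using Distributed
addprocs(1)

@everywhere module myPackage
    using Distributed

    processData(x, y) = x .* y

    # Runs on the worker that owns `xref`, so `fetch` is a local lookup there.
    processRef(xref, y) = processData(fetch(xref), y)

    # Returns a Future immediately; the array is created and kept on worker `w`.
    createData(w; n = 10) = remotecall(rand, w, n)

    # Only the small handle and `y` go to the worker; only the result comes back.
    runProcessData(w, xref::Future, y) = remotecall_fetch(processRef, w, xref, y)

    export createData, runProcessData
end
@everywhere using .myPackage

w = first(workers())
xref = createData(w)
result = runProcessData(w, xref, 10)
```

Note that calling `fetch(xref)` on the master would still copy the array over, so the handle should only be dereferenced on the worker that owns it.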
As a solution, I defined the function that spawns the remote call outside the package, in Main, where x is visible on the worker, and now pass that function as an argument.
using Distributed
addprocs(1)
@everywhere module myPackage
    using Distributed
    # function to create variable on worker
    function createData()
        @everywhere 2 begin
            x = rand(10)
        end
    end
    # function to execute on worker
    function processData(x,y)
        return x .* y
    end
    # function to be wrapped
    function runProcessData(y::Float64,spawn::Function)
        fut_obj::Future = spawn(2,y)
        out_arr::Vector{Float64} = fetch(fut_obj)
        return out_arr
    end
    export createData, processData, runProcessData
end
@everywhere begin
    using .myPackage
    spawnProcessData(i::Int64, y::Float64) = @spawnat i processData(x,y)
end
createData()
runProcessData(10.0,spawnProcessData)