I have a loop that is embarrassingly parallel and in which there is no data movement. What is the best way nowadays to parallelize it in Julia v0.6?
Assume I have an iterator type:
struct MyIterator
    length::Int
end
Base.start(itr::MyIterator) = 1
Base.next(itr::MyIterator, state) = state, state + 1
Base.done(itr::MyIterator, state) = state == itr.length + 1
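To make explicit what these three methods provide, here is a minimal sketch of roughly what a serial for loop does with them in Julia v0.6. The names itr, xs and state are only illustrative, and the iterator definition is repeated so the snippet runs on its own:

```julia
# Julia v0.6 sketch: the start/next/done protocol was replaced by iterate in v0.7+.
struct MyIterator
    length::Int
end
Base.start(itr::MyIterator) = 1
Base.next(itr::MyIterator, state) = state, state + 1
Base.done(itr::MyIterator, state) = state == itr.length + 1

itr = MyIterator(3)   # illustrative small instance
xs = Int[]            # collects the values the iterator yields

# Roughly what `for x in itr ... end` does under this protocol
state = start(itr)
while !done(itr, state)
    x, state = next(itr, state)
    push!(xs, x)
end
```

Note that nothing here needs length: serial iteration only relies on start, next and done.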
The loop that I want to parallelize looks as follows:
ys = []
for x in MyIterator(10)
    y = f(x)
    # save into an array of results
    push!(ys, y)
end
where f is a complicated function. I tried simply adding the @parallel macro in front of the loop:
ys = @parallel for x in MyIterator(10)
    f(x)
end
and fetching the results
ys = fetch(ys)
but it throws an error saying that length(::MyIterator) is not defined. That made sense to me: to parallelize the loop, we need to know the length beforehand. I then defined the length:
Base.length(itr::MyIterator) = itr.length
and the loop ran without errors. However, the result is an array of Future, and when I run fetch on it I get the same array back. I then tried running fetch.(ys) with the extra dot, and it returned a list of RemoteException, so clearly something is still missing.
I would appreciate it if someone could elaborate on the interface an iterator has to implement in order to be parallelizable. Also, if you have suggestions for solving this issue without the @parallel macro, I am willing to learn.
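For instance, one alternative I am considering is pmap over the collected items. The following is only a minimal sketch under several assumptions: it targets Julia v0.6, the worker count of 2 is arbitrary, f(x) = x^2 is a hypothetical stand-in for my real function, and the iterator is small enough to materialize with collect. I do not know whether this is the recommended approach.

```julia
# Julia v0.6 sketch; in v0.7+ addprocs, @everywhere and pmap live in the
# Distributed standard library.
addprocs(2)  # assumption: two local worker processes

# A function called on workers must be defined on them, hence @everywhere.
@everywhere f(x) = x^2  # hypothetical stand-in for the real, expensive f

struct MyIterator
    length::Int
end
Base.start(itr::MyIterator) = 1
Base.next(itr::MyIterator, state) = state, state + 1
Base.done(itr::MyIterator, state) = state == itr.length + 1
Base.length(itr::MyIterator) = itr.length

# Materialize the items on the master process, then hand them to the workers.
xs = collect(MyIterator(10))
ys = pmap(f, xs)  # results come back in the same order as xs
```

In this sketch only plain integers are sent to the workers, so MyIterator itself does not have to be defined on them. That would change if the iterator object were shipped to the workers instead of its collected items.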