Parallel for loop without reduction


I'm having trouble using a parallel for loop without a reduction operator/function. I tried the following code, hoping to get pmap-like behaviour:

a = @parallel for i in 1:100
    i^2
end

But this was the result:

1-element Array{Future,1}:

Also fetch(a) gives me the same thing:

1-element Array{Future,1}:

And fetch(a[1]) sometimes freezes the Julia session and sometimes returns nothing. Please let me know how to use the parallel for loop in a way that mimics pmap.
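For reference, the direct pmap equivalent of a loop like the one above would be the following (a minimal sketch; `i^2` stands in for whatever the loop body actually computes, and on Julia ≥ 0.7 the parallel machinery lives in the `Distributed` standard library):

```julia
using Distributed
# addprocs(4)  # add worker processes first in a real session

# pmap applies the function to each element, farming calls out to the
# workers (or running serially on the master if there are none), and
# collects the results in input order.
results = pmap(i -> i^2, 1:100)
# results == [1, 4, 9, ..., 10000]
```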

Also, is there a similar abstraction to this for GPUs, i.e. a GPU for loop? I know ArrayFire in C++ offers a gfor loop, but is there something similar which uses CUDAnative.jl?


Reduce with (vcat).
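Concretely, something like this (a sketch with `i^2` as a placeholder body; on Julia ≥ 0.7 the macro is spelled `@distributed` and comes from the `Distributed` stdlib):

```julia
using Distributed

# Each worker computes a chunk of iterations; the partial results are
# reduced pairwise with vcat, yielding one array in iteration order.
a = @distributed (vcat) for i in 1:100
    i^2
end
```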

Why not just use pmap with a larger batch size?
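pmap takes a `batch_size` keyword, so cheap per-iteration work can be grouped to amortize communication overhead (sketch; the body `i^2` is a placeholder):

```julia
using Distributed

# batch_size groups iterations so each message to a worker carries
# several inputs instead of one, which helps when the body is cheap
# relative to the cost of a round trip.
results = pmap(i -> i^2, 1:100; batch_size = 10)
```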


Right, I could. I tried reducing with vcat; it worked, but it was super slow. I could use pmap, but I found this example in the documentation and thought it was often more convenient when programming, so I wondered why it doesn't work.

I think rewriting this behind the scenes as a pmap over an anonymous function could make it faster than reducing with vcat; maybe that's worth a change to the source code of @parallel.


Maybe look at @parallel with SharedArray?

a = SharedArray(Int, 100);
@parallel for i in 1:100
    a[i] = i^2
end
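One caveat: `@parallel` (renamed `@distributed` in Julia ≥ 0.7, in the `Distributed` stdlib) without a reducer returns immediately, so you need `@sync` to wait for the workers before reading the shared array (sketch):

```julia
using Distributed, SharedArrays
# addprocs(2)  # in a real session, add workers first
#              # (plus `@everywhere using SharedArrays`)

a = SharedVector{Int}(100)

# Without @sync the loop below returns before the writes finish;
# @sync blocks until every chunk of iterations has completed.
@sync @distributed for i in 1:100
    a[i] = i^2
end
```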