Parallel code seems slow



I have some code that I want to run multiple times with different random inputs to compute an average over all iterations.

To get more familiar with parallel programming in Julia, I tried to play around with a small example:

n = 10^3

A = SharedArray{Int}(n)

@time begin
    @sync @parallel for i in 1:n
        A[i] = i
    end
end

I ran it a few times and I got some disappointing results.

For n = 10^3, I got

For n = 10^5, I got

For n = 10^7, I got

It seems a bit faster to use julia -p 2 than julia -p 1, but plain julia, with no -p flag, is definitely the fastest.

I also tried without the parallel macros and obtained faster times. More precisely, for n = 10^3, 10^5, 10^7, I get

Finally, I also ran the code on a cluster and got similar results.

Am I doing something wrong or is there a problem with parallel programming in Julia?


If you’re only running the code once, then you’re probably mostly just timing compilation. At the very least, I would suggest putting the code in a function and then using a benchmarking tool to get a reliable time estimate.
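To make the compilation point concrete, here is a minimal sketch that times the same function twice. It uses a plain serial loop rather than the parallel one from the question, and `fill_serial!` is a made-up name for illustration. On most machines the first call is noticeably slower because it includes JIT compilation, but the exact numbers will vary.

```julia
# Hypothetical serial stand-in for the loop in the question.
function fill_serial!(A)
    for i in eachindex(A)
        A[i] = i
    end
    return A
end

A = zeros(Int, 10^3)

t_first  = @elapsed fill_serial!(A)   # first call: includes compilation
t_second = @elapsed fill_serial!(A)   # second call: already compiled

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

Only the second number says anything about how fast the loop itself runs.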

Also, I’m far from an expert in parallel computing, but in an example like the one you’ve posted, in which every operation is trivially cheap, I would suspect that the overhead of sending data between processes vastly outweighs the performance gain from computing A[i] = i in parallel at each iteration.
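For the use case described in the question (averaging many runs with random inputs), parallelism is more likely to pay off when each iteration does real work. The following is only a sketch under some assumptions: it targets Julia 1.x, where `@parallel` was renamed `@distributed` and lives in the `Distributed` standard library, and `simulate` and `average_runs` are hypothetical names standing in for the actual computation.

```julia
using Distributed

# Hypothetical stand-in for one expensive run with random input.
# @everywhere defines it on every process (only process 1 if no workers exist).
@everywhere function simulate(n)
    s = 0.0
    for _ in 1:n
        s += rand()^2
    end
    return s / n
end

# Sum the per-run results with a (+) reduction, then divide by the run count.
# With a reducer, @distributed waits for the result, so no @sync is needed.
function average_runs(runs, n)
    total = @distributed (+) for k in 1:runs
        simulate(n)
    end
    return total / runs
end

average_runs(2, 10)             # warm-up call so compilation is not timed
@time average_runs(100, 10^6)   # each iteration now does 10^6 operations
```

Started with `julia -p 2`, the iterations are split across the workers; started with plain `julia`, the same code runs on the main process, which makes it easy to compare the two timings.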


You will need to have your code inside a function and compile it first:

n = 10^7

A = SharedArray{Int}(n)

function myfill!(A,n)
    @sync @parallel for i in 1:n
        A[i] = i
    end
end

myfill!(A,n)        # first call compiles the function
@time myfill!(A,n)  # second call measures only the run time

Try timing with the code above.


Yeah, that was exactly the problem. I thought that @parallel and @sync would be precompiled. I was wrong.

Thanks!