@threads vs @spawn

Hi, would appreciate some feedback about this issue. I’m trying to understand the difference between @threads and @spawn.

I think the former is easy to understand. My experiment shuffles a deck of cards many, many times and checks whether a royal flush appears. I start Julia with --threads=8 on the command line, so the following loop processes 8 batches in parallel at any one time and finishes only when all 20 batches are complete.

    batches = 20
    @threads for i in 1:batches
        batch!(...)
    end

Indeed, that’s what it does.

But I read about @spawn before I understood the simpler syntax of @threads, so at first I had written the loop this way:

    @sync for i in 1:batches
        Threads.@spawn batch!(...)
    end

I thought it did the same thing as the loop above. Are they equivalent? This spawns 20 tasks at once, and I guess 8 of them will run at any one time. @sync should wait until they have all completed.

Apparently they are not the same performance-wise. The second one doesn’t behave like the first. In “top” I can see the julia process start at 800% CPU (as expected for 8 threads). But after two batches completed, CPU dropped to 400%. After a much longer time a couple more batches completed, and CPU dropped to 200%. Eventually CPU dropped to 100% without any further batches completing. That is obviously not what I intended, and the experiment ran so long that I had to Ctrl-C it.

How come?

Thanks

Happy to supply source code (about 100 lines). Let me know if anyone wants to see it.


In order to see what happens, I suggest two little test functions returning the thread load:

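using Base.Threads

# Count how many iterations each thread executes under @threads.
function threaded(batches)
    ret = zeros(Int, nthreads())
    @threads for i in 1:batches
        ret[threadid()] += 1
    end
    return ret
end

# Same count, but with one spawned task per iteration;
# the scheduler decides which thread runs each task.
function spawned(batches)
    ret = zeros(Int, nthreads())
    @sync for i in 1:batches
        Threads.@spawn ret[threadid()] += 1
    end
    return ret
end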

then:

julia> threaded(20)
8-element Array{Int64,1}:
 3
 3
 3
 3
 2
 2
 2
 2

julia> spawned(20)
8-element Array{Int64,1}:
 0
 8
 9
 1
 2
 0
 0
 0

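If you want uniformly loaded threads for parallel computation, use @threads for ... . It divides the iteration range into roughly equal chunks, one per thread, which is why the counts above differ by at most one.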
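For the card experiment, a minimal sketch of the @threads version could look like the block below. The names simulate_batch and run_batches are hypothetical stand-ins (the original batch!(...) was not posted), and the "hit" test is a toy condition rather than a real royal-flush check. The point it illustrates is that each iteration writes only to its own slot results[i], so the loop body does not depend on which thread runs it.

```julia
using Base.Threads
using Random

# Hypothetical stand-in for the unposted batch!(...): counts how often the
# first five cards of a shuffled 52-card deck are all among cards 1 to 5.
function simulate_batch(rng, shuffles)
    deck = collect(1:52)
    hits = 0
    for _ in 1:shuffles
        shuffle!(rng, deck)
        hits += all(c -> c <= 5, view(deck, 1:5))
    end
    return hits
end

function run_batches(batches, shuffles)
    results = zeros(Int, batches)        # one slot per batch
    @threads for i in 1:batches
        rng = MersenneTwister(i)         # assumption: a separate RNG per batch, seeded by its index
        results[i] = simulate_batch(rng, shuffles)
    end
    return results
end
```

Calling run_batches(20, 100_000) returns a 20-element vector with one count per batch; sum it to get the total number of hits.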

@spawn is meant more for launching individual concurrent tasks as they come up. Its docstring says:

Create and run a Task on any available thread.

There is no guarantee of a balanced load. In a bad case you can even get:

julia> spawned(20)
8-element Array{Int64,1}:
  1
 13
  1
  1
  1
  1
  1
  1
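If you still prefer @spawn, one possible approach is to spawn one task per chunk of iterations instead of one per iteration, and combine the results with fetch. This is only a sketch: chunks and spawned_chunked are hypothetical helper names, work is any function of the batch index that returns a number, and batches is assumed to be at least 1. The scheduler still decides where each task runs, so this equalizes the amount of work per task but does not guarantee how tasks are placed on threads.

```julia
using Base.Threads

# Split 1:batches into at most `nchunks` contiguous ranges of near-equal length.
function chunks(batches, nchunks)
    n = min(batches, nchunks)
    return [(1 + div((k - 1) * batches, n)):div(k * batches, n) for k in 1:n]
end

# Spawn one task per chunk; each task sums work(i) over its own range.
function spawned_chunked(work, batches; nchunks = nthreads())
    tasks = map(chunks(batches, nchunks)) do r
        Threads.@spawn sum(work, r)
    end
    return sum(fetch, tasks)             # fetch waits for each task and returns its result
end
```

For example, spawned_chunked(i -> i, 20) adds up the indices 1 to 20 across the chunks. Because every task is fetched, no separate @sync is needed here.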

Thanks a lot. I knew someone would know right away. This is an awesome language. So fast, so easy to write parallel threads, so many experts willing to help.

Thanks again.
