Different spawning behavior of @spawn and Channel(..., spawn=true)

Channel(..., spawn=true) does not seem to distribute its tasks across all available threads. An MWE:

using .Threads

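# Wait for a reply channel, then report the id of the thread this task runs on.
function tinfo(ch::Channel)
    from = take!(ch)          # channel to reply on
    put!(from, threadid())
end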

me = Channel(10)

If I use Threads.@spawn to start the tasks …

for i in 1:nthreads()
    ch = Channel(1)
    Threads.@spawn tinfo(ch)
    yield()
    put!(ch, me)
end

julia> map(x->take!(me), 1:nthreads())
8-element Array{Int64,1}:
 2
 3
 7
 1
 6
 4
 5
 8

… I always get all threads, as one would expect.

If I do the same with Channel(..., spawn=true), I don’t get all threads:

for i in 1:nthreads()
    ch = Channel(tinfo, 1, spawn=true)
    yield()
    put!(ch, me)
end

julia> map(x->take!(me), 1:nthreads())
8-element Array{Int64,1}:
 3
 2
 8
 3
 4
 8
 8
 8

This happens most of the time. Only once, in a fresh REPL, did I get all threads.
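To check how often this happens, the experiment can be wrapped in a function and repeated. The following is only a minimal sketch: distinct_threads is a hypothetical helper name (not part of Base), and wrapping the loop in a function moves it into a local scope, which may itself influence scheduling.

```julia
using Base.Threads

# Same reporter function as in the MWE
function tinfo(ch::Channel)
    from = take!(ch)
    put!(from, threadid())
end

# Hypothetical helper: start n tasks via Channel(..., spawn=true) and
# return how many distinct threads they reported.
function distinct_threads(n = nthreads())
    me = Channel(n)
    for i in 1:n
        ch = Channel(tinfo, 1, spawn=true)
        yield()
        put!(ch, me)
    end
    ids = [take!(me) for _ in 1:n]
    return length(unique(ids))
end

# Repeat the trial a few times; a value equal to nthreads() means
# every thread was used in that trial.
results = [distinct_threads() for _ in 1:20]
println(results)
```

The counts will vary from run to run, so no particular output should be expected.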

Can you confirm this on your machines? Is this behavior expected? Can I change it? Should I open an issue?

I’m on a Mac, Julia 1.5.2:

julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8
  JULIA_EDITOR = "/Applications/Visual Studio Code.app/Contents/Resources/app/bin/code"

I have seen similar issues when starting tasks with @spawn that begin their life by waiting for something.

I guess your issue may be related: the scheduler thinks that the threads are available and reuses them instead of distributing tasks round-robin. To me this did not look like a bug, although it was very annoying. I ended up using your solution from here:

This time it is a bit different, since I’m developing an actor library and want parallel actors to use all available threads.

I could use the solution you mentioned to start tasks on predetermined threads. But then I would have to do the load balancing myself, which I don’t want to do. As a first approach, Threads.@spawn seems to work for me, but Channel(..., spawn=true) does not.
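For comparison, here is one way to get exactly one piece of work per thread using only the public API. This is not the linked solution, which is not reproduced here; it is just a sketch that assumes Julia 1.5 or later (for the :static schedule option) and that the loop is started from the main thread at top level.

```julia
using Base.Threads

# One slot per thread; each iteration writes only to its own slot.
ids = zeros(Int, nthreads())

# With the :static schedule, the iterations are divided evenly among the
# threads, so with nthreads() iterations each thread runs exactly one.
@threads :static for i in 1:nthreads()
    ids[i] = threadid()
end

println(ids)
```

The price of such a fixed assignment is the one mentioned above: the scheduler no longer balances the load, so that becomes the caller's job.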

But according to the documentation for Channel:

If spawn = true, the Task created for func may be scheduled on another thread in parallel, equivalent to creating a task via Threads.@spawn.

there should be no difference between the two.
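One way to test that claim is to build the documented equivalent by hand and compare it with Channel(..., spawn=true). The sketch below assumes a hypothetical helper named spawned_channel (not part of Base); it mirrors the description in the docstring, not necessarily how Base implements it.

```julia
using Base.Threads

# Same reporter function as in the MWE
function tinfo(ch::Channel)
    from = take!(ch)
    put!(from, threadid())
end

# Hypothetical helper: create the channel, start func with Threads.@spawn,
# and bind the channel's lifetime to the task.
function spawned_channel(func, size::Integer)
    ch = Channel(size)
    task = Threads.@spawn func(ch)
    bind(ch, task)   # close ch when the task finishes
    return ch
end

reply = Channel(1)
ch = spawned_channel(tinfo, 1)
put!(ch, reply)
tid = take!(reply)
println("task ran on thread ", tid)
```

Substituting spawned_channel(tinfo, 1) for Channel(tinfo, 1, spawn=true) in the loop above would show whether the two really behave alike on a given machine.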

Yes, there does seem to be a difference (I can confirm it on Linux, v1.5.2).

It may be worth opening an issue. I would also be happy to see more deterministic scheduling behavior when there are only a few threaded tasks.

On the other hand, I am not sure that filling all threads with tasks as soon as possible is the best strategy for every situation, so promising it in Base would not be wise. Maybe that’s why the docs of @spawn say “any available thread”:

After looking at the Threads.@spawn macro, I guessed that the spawning behavior may depend on the scope in which the macro is executed. If I run the same for loop in a local scope, I get the same behavior as observed for Channel(..., spawn=true):

function dospawn()
    for i in 1:nthreads()
        ch = Channel(1)
        Threads.@spawn tinfo(ch)
        yield()
        put!(ch, me)
    end
end

dospawn()

julia> map(x->take!(me), 1:nthreads())
8-element Array{Int64,1}:
 2
 3
 7
 6
 4
 2
 7
 3
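To make the three results in this thread easier to compare, the number of distinct threads in each output can be counted. A small sketch, with the ids copied from the outputs above; distinct is a hypothetical helper, not a Base function.

```julia
# Thread ids copied from the three outputs in this thread
toplevel_spawn = [2, 3, 7, 1, 6, 4, 5, 8]   # Threads.@spawn at top level
channel_spawn  = [3, 2, 8, 3, 4, 8, 8, 8]   # Channel(..., spawn=true)
local_spawn    = [2, 3, 7, 6, 4, 2, 7, 3]   # Threads.@spawn inside dospawn()

# Hypothetical helper: number of distinct threads that were used
distinct(ids) = length(unique(ids))

for (name, ids) in [("top-level @spawn", toplevel_spawn),
                    ("Channel(spawn=true)", channel_spawn),
                    ("@spawn in local scope", local_spawn)]
    println(rpad(name, 24), distinct(ids), " of ", length(ids), " threads")
end
```

For these particular runs that gives 8, 4 and 5 distinct threads out of 8, respectively.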