Looking for code to solve (surely common) 'embarrassing parallelism' multithreading use-case

I would say the cut-off is even lower than this.
Don’t worry about grouping tasks if each one generally takes over 1 ms.
@spawn is fast.
It’s not so fast that more careful approaches like ThreadPools (or @threads for) can’t do better, for sure.
But it is fast enough that my cut-off for worrying about grouping them is <1 ms.
Depending on exactly how much you care about squeezing out the last few bits of performance, you might start worrying earlier (or later).

julia> @btime Threads.@spawn 1;
  140.650 ns (4 allocations: 352 bytes)

julia> @btime fetch(Threads.@spawn 1);
  13.604 μs (4 allocations: 352 bytes)
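
If your work items do fall below that cut-off, grouping them could look roughly like the sketch below. This is only an illustration: chunked_map and chunk_size are names I made up, not part of Base or any package, and it assumes items is a non-empty vector and that f is safe to call from several threads at once.

```julia
using Base.Threads: @spawn

# Hypothetical helper (not from Base or any package): apply `f` to every item,
# spawning one task per chunk instead of one task per item, so the per-task
# overhead is shared by `chunk_size` items.
# Assumes `items` is non-empty and `f` is thread-safe.
function chunked_map(f, items::AbstractVector; chunk_size::Integer = 64)
    chunks = Iterators.partition(items, chunk_size)
    # One task per chunk; each task maps `f` over its own slice.
    tasks = map(chunks) do chunk
        @spawn map(f, chunk)
    end
    # Wait for every task and concatenate the per-chunk results in order.
    return reduce(vcat, fetch.(tasks))
end

squares = chunked_map(x -> x^2, collect(1:1000); chunk_size = 100)
```

Pick chunk_size so that one chunk takes comfortably longer than the spawn-and-fetch overhead measured above. If each item already takes over 1 ms, a plain @spawn per item should be fine.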