Maybe I don’t understand how @spawn works, but here is what I am observing: when I run a parallel task with it, the performance is significantly worse if more threads are available (i.e. starting Julia with -tX for a larger X), even if I am not spawning more tasks.
Example code is below, but essentially what I see is:
- Start Julia with julia -t32 and execute the code spawning 32 tasks: 216 μs
- Start Julia with julia -t64 and execute the code spawning 32 tasks: 3.323 ms
How should I understand this? The consequence is that in parallel code, even when I know it is not worth spawning more tasks because of limited scalability, the performance gets worse simply because more threads are available.
(FWIW, this is on a computer with 128 real cores)
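For reference, the thread count of each session can be confirmed before running anything. This is a minimal sketch and is not part of the test script; it only uses `Threads.nthreads()` from Base.

```julia
using Base.Threads: nthreads

# Number of threads this Julia session was started with
# (set by the -t flag or the JULIA_NUM_THREADS environment variable).
println("threads available: ", nthreads())
```

The value printed should match the number passed to -t in each session.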
Test code:
using Base.Threads: @spawn
using LinearAlgebra: norm
using StaticArrays
using Test
using BenchmarkTools # provides @btime, used in test() below
# Parallel code for sum(norm.( x - y for x in x, y in y ))
function sumd(x,y,nbatches;aux=zeros(nbatches))
    @assert length(x)%nbatches == 0
    batchsize = length(x) ÷ nbatches
    aux .= 0.
    @sync for ibatch in 1:nbatches
        ifirst = (ibatch-1)*batchsize + 1
        ilast = ibatch*batchsize
        @spawn begin
            for i in ifirst:ilast
                for j in 1:length(y)
                    @inbounds aux[ibatch] += norm(y[j]-x[i])
                end
            end
        end
    end
    return sum(aux)
end
function test(nbatches)
    x = [ rand(SVector{3,Float64}) for _ in 1:6400 ]
    y = [ rand(SVector{3,Float64}) for _ in 1:10 ]
    @test sumd(x,y,nbatches) ≈ sum(norm.( x - y for x in x, y in y ))
    aux = zeros(nbatches)
    @btime sumd($x,$y,$nbatches;aux=$aux)
end
Here is the step-by-step:
Start Julia with 32 threads, and spawn 32 tasks:
[lmartine@adano58 old]$ JULIA_EXCLUSIVE=1 julia -t32
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> include("./test.jl")
test (generic function with 1 method)
julia> test(32)
  216.054 μs (205 allocations: 17.83 KiB)
42508.465422704496
Start Julia with 64 threads, and again spawn 32 tasks:
[lmartine@adano58 old]$ JULIA_EXCLUSIVE=1 julia -t64
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> include("./test.jl")
test (generic function with 1 method)
julia> test(32)
  3.323 ms (205 allocations: 17.83 KiB)
40862.74149814549
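One comparison that might help narrow this down is a variant in which each task accumulates into a local variable and returns it, instead of updating `aux[ibatch]` inside the inner loop. The sketch below is hypothetical: `sumd_local` is a name introduced here for illustration, it has not been benchmarked in this post, and whether the shared `aux` array has anything to do with the slowdown is only an assumption to be tested.

```julia
using Base.Threads: @spawn
using LinearAlgebra: norm

# Hypothetical variant of sumd: every task sums into its own local
# variable and returns the partial result, so no task writes to a
# shared array inside the hot loop.
function sumd_local(x, y, nbatches)
    @assert length(x) % nbatches == 0
    batchsize = length(x) ÷ nbatches
    tasks = map(1:nbatches) do ibatch
        ifirst = (ibatch - 1) * batchsize + 1
        ilast = ibatch * batchsize
        @spawn begin
            s = 0.0 # task-local accumulator
            for i in ifirst:ilast, j in eachindex(y)
                s += norm(y[j] - x[i])
            end
            s # value returned by the task
        end
    end
    # fetch waits for each task and returns its partial sum
    return sum(fetch, tasks)
end
```

It can be called as `sumd_local(x, y, 32)` with the same `x` and `y` built in `test`, and timed with `@btime` in both the -t32 and -t64 sessions to see whether the gap remains.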