Is there any limit on the number of threads for a CPU without SMT?

I’m using a 9600KF with 6 cores and 6 threads, and the following confuses me:
If I start Julia with 6 threads, the cost of schedule(t::Task) = enq_work(t) sometimes increases sharply, while everything seems fine with fewer than 6 threads.

using FFTW, LinearAlgebra  # LinearAlgebra provides mul!

FFTW.fftw_vendor # :mkl
FFTW.set_num_threads(5)
# initialization
a = randn(ComplexF32, 512, 6)
b = randn(ComplexF32, 512, 6)
c = similar(a)
# benchmark function
function test(a, b, c)
    plan = FFTW.plan_fft(a, 1)  # FFT along the first dimension
    for k = 1:1000
        mul!(b, plan, a)  # b = plan * a
        # even if I start only 2 tasks
        tasks = ntuple(1) do i
            Threads.@spawn @views c[:, i] .= a[:, i] .* b[:, i]
        end
        @views c[:, 6] .= a[:, 6] .* b[:, 6]
        map(wait, tasks)
    end
end
# profile with Juno.@profiler
@profiler test(a, b, c)

Here’s the profile result:

Is this a bug or something else? I also found that the default number of threads in Juno for the 9600KF is 3, although the setting says that auto means the number of CPU cores.
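
For reference, the thread count a session actually got can be checked from the REPL. This is only a minimal sketch: `thread_report` is a made-up helper name, and it relies only on `Threads.nthreads()`, `Sys.CPU_THREADS`, and the `JULIA_NUM_THREADS` environment variable. It says nothing about how many threads MKL or FFTW use internally.

```julia
using Base.Threads: nthreads

# Hypothetical helper: collect the thread-related settings of the current session.
function thread_report()
    return (
        julia_threads = nthreads(),       # threads this Julia session was started with
        cpu_threads = Sys.CPU_THREADS,    # logical CPU threads as seen by Julia
        # value of the environment variable, if the launcher set one
        env_setting = get(ENV, "JULIA_NUM_THREADS", "unset"),
    )
end

println(thread_report())
```

If `julia_threads` plus the value passed to `FFTW.set_num_threads` is larger than `cpu_threads`, the machine may be oversubscribed, which is one possible (unconfirmed) explanation worth ruling out.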

Well, leaving one CPU core for the operating system is often a good idea…

In fact, I only see this when @spawn is used with MKL’s FFTW (MKL’s BLAS is fine). :sweat_smile:
And the default FFTW3 is fine too… Maybe I should open an issue on FFTW.jl? :thinking: