Threads are not speeding up computation a lot

It seems like a solved problem now.

$ killall -SIGSTOP firefox-bin

$ julia +1.11 -t 8 --project=. main.jl
  0.084158 seconds (45 allocations: 6.108 MiB)

$ killall -SIGCONT firefox-bin

I get a 7.47x speedup with 8 threads, 93% of the ideal, but going from 8 to 16 threads gives little extra speed, most likely because of hyper-threading. A second hardware thread on the same core is known to be worth less than a full core, at least for CPU-bound work.

That is with the improved code from @yolhan_mannes (Threads are not speeding up computation a lot - #8 by yolhan_mannes).
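For reference, the 93% figure is simply the measured speedup divided by the thread count. A minimal sketch of that arithmetic in Julia; the helper names `speedup` and `parallel_efficiency` are made up for illustration and are not part of the code in this thread:

```julia
# Speedup from two wall-clock timings (hypothetical helper).
speedup(t_serial, t_parallel) = t_serial / t_parallel

# Parallel efficiency: measured speedup relative to the ideal (linear) speedup,
# which equals the number of threads.
parallel_efficiency(s, nthreads) = s / nthreads

# Numbers quoted above: 7.47x with 8 threads.
eff = parallel_efficiency(7.47, 8)
println("efficiency: ", round(100 * eff; digits = 1), "% of ideal")

# Useful context when judging scaling: threads Julia was started with
# versus logical CPUs (hyper-threads included) reported by the system.
println("Julia threads: ", Threads.nthreads())
println("logical CPUs:  ", Sys.CPU_THREADS)
```

If `Sys.CPU_THREADS` is twice the number of physical cores, running with more threads than physical cores means two threads share each core's execution units.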

Is this as fast as possible?

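using ThreadPinning
# GC.gc(); GC.gc(); GC.gc()   # optionally force garbage collection before timing
# pinthreads(:cores)          # pin threads to cores; helped minimally at best
# GC.enable(false)            # optionally disable GC while timing
@time test_gen_keys_range();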

I thought hyper-threading wasn’t just “not as good”, but that it gave zero benefit when the CPU is fully utilized.

Yes, but it’s pretty hard to “fully utilize” a core (always have all execution units busy, never branch mispredict, never wait on memory).

To be cheeky, I’d say that workloads that fully utilize a core are so homogeneous and non-branchy that they should run on a GPU :wink: