It seems like a solved problem now.
$ killall -SIGSTOP firefox-bin
$ julia +1.11 -t 8 --project=. main.jl
0.084158 seconds (45 allocations: 6.108 MiB)
$ killall -SIGCONT firefox-bin
I get a 7.47x speedup with 8 threads, 93% of the ideal, but going from 8 to 16 threads adds little extra speed, most likely because of hyper-threading: a hyper-thread is known to deliver less than a full core, at least for CPU-bound work.
That is with the improved code from @yolhan_mannes (Threads are not speeding up computation a lot - #8 by yolhan_mannes).
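For reference, the 93% figure is just the measured speedup divided by the thread count. A minimal sketch of that arithmetic (the helper name `parallel_efficiency` is made up for illustration, not part of any package):

```julia
# Hypothetical helper: fraction of the ideal (linear) speedup actually achieved.
parallel_efficiency(speedup, nthreads) = speedup / nthreads

speedup_8 = 7.47                      # speedup reported above for 8 threads
eff = parallel_efficiency(speedup_8, 8)

println("efficiency: ", round(100 * eff; digits = 1), " % of ideal")
```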
Is this as fast as possible?
using ThreadPinning
# GC.gc(); GC.gc(); GC.gc()
# pinthreads(:cores) # at best minimally helping
# GC.enable(false)
@time test_gen_keys_range();
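A single `@time` call is sensitive to background load (hence stopping Firefox above), so one option is to take the best of several runs. This is only a sketch under assumptions: `dummy_work` is a made-up stand-in for `test_gen_keys_range`, and `best_time` is a hypothetical helper, not something from the original code.

```julia
# Hypothetical stand-in workload; replace with the real function being timed.
function dummy_work(n)
    out = Vector{Float64}(undef, n)
    Threads.@threads for i in 1:n
        out[i] = sqrt(i)              # each iteration writes its own slot
    end
    return sum(out)
end

# Hypothetical helper: run `f` once to compile it, then return the
# smallest wall-clock time (in seconds) over `samples` further runs.
function best_time(f; samples = 5)
    f()                               # warm-up so compilation is not timed
    return minimum(@elapsed(f()) for _ in 1:samples)
end

t = best_time(() -> dummy_work(1_000_000))
println("best of 5: ", t, " s on ", Threads.nthreads(), " threads")
```

The minimum is used rather than the mean because interference from other processes can only make a run slower, never faster.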