Any way to reduce/change the number of threads (for the Benchmarks Game)?

A.
Using 4 threads adds 20.3 ms to startup going by the minimum times (37 ms going by the mean, but I trust the minimum more). Would there be a way for Julia to delay creating the threads until they are actually used?

The obvious solution is not to ask for 4 threads, since this is a single-threaded program. But the guy behind the Benchmarks Game declined and wants the same settings for all programs, multi-threaded or not. A solution might be to start every program with only 1 thread and ask the multi-threaded ones to add 4 threads themselves. Is there an easy way to add as many threads as “-tauto” does from within the program?
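
To make the question concrete, here is a minimal sketch of what a program can see from the inside. As far as I know the thread count is fixed when Julia starts, so the sketch only reads it and does not add threads. I'm assuming `Sys.CPU_THREADS` is roughly what “-tauto” would pick, and `workers` is just a name I made up for illustration.

```julia
# Threads this session was started with (set by -t or JULIA_NUM_THREADS).
started_with = Threads.nthreads()

# Logical CPUs on this machine; assumed to be close to what `-t auto` picks.
logical_cpus = Sys.CPU_THREADS

println("threads in this session: ", started_with)
println("logical CPUs:            ", logical_cpus)

# The program can only decide how many of the existing threads to use.
# `workers` is a hypothetical name for that choice, capped at 4 here.
workers = min(started_with, 4)
println("workers used:            ", workers)
```

So a single-threaded program can ignore the extra threads, but that does not avoid the cost of starting them.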

$ hyperfine 'julia -t4 -O0 --cpu-target=core2 --startup-file=no pidigits.jl 1000 >/dev/null'
Benchmark #1: julia -t4 -O0 --cpu-target=core2 --startup-file=no pidigits.jl 1000 >/dev/null
  Time (mean ± σ):     307.6 ms ±  25.6 ms    [User: 593.6 ms, System: 393.8 ms]
  Range (min … max):   270.0 ms … 341.5 ms    10 runs

$ hyperfine 'julia -O0 --cpu-target=core2 --startup-file=no pidigits.jl 1000 >/dev/null'
Benchmark #1: julia -O0 --cpu-target=core2 --startup-file=no pidigits.jl 1000 >/dev/null
  Time (mean ± σ):     270.5 ms ±  24.9 ms    [User: 536.8 ms, System: 366.7 ms]
  Range (min … max):   249.7 ms … 317.8 ms    10 runs
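
For reference, this is how I got the overhead figures from the two runs above. The numbers are copied from the hyperfine output; the variable names are mine. Note that the reported σ (about 25 ms) is of the same order as the difference, which is part of why I look at the minimum.

```julia
# Wall-clock times in ms, copied from the two hyperfine runs above.
with_t4         = (min = 270.0, mean = 307.6, sigma = 25.6)
default_threads = (min = 249.7, mean = 270.5, sigma = 24.9)

# Overhead of -t4, rounded to the precision hyperfine reports.
min_overhead  = round(with_t4.min  - default_threads.min;  digits = 1)
mean_overhead = round(with_t4.mean - default_threads.mean; digits = 1)

println("overhead by minimum: ", min_overhead, " ms")
println("overhead by mean:    ", mean_overhead, " ms")
```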

Go starts with as many procs as there are cores by default, but it has a call to change that, which their pidigits program uses:

runtime.GOMAXPROCS(1)

For us the cost seems to be only at startup, so we can’t get that time back. But does anyone know if there’s a reason to lower the thread count afterward? Does it help with GC? I think that might be the reason Go does it.

B.
Chapel uses “yield”, and while I’m not sure it’s better for performance, their program is the fastest:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/pidigits-chapel-4.html

Is it worth looking into (as you can’t use a package):
https://github.com/BenLauwens/ResumableFunctions.jl
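
Without a package, the closest thing in Base that I know of is a `Channel`, which lets a function hand out values one at a time, a bit like “yield”. Below is a minimal sketch using the spigot algorithm commonly used for this benchmark. It is not the pidigits.jl being timed in this post, and `pi_digits` is a name I made up. Each `put!` involves a task switch, so I would expect this to be slower than a plain loop, not faster.

```julia
# Generator-style pi digits using only Base: the closure runs in its own
# task and hands each digit to the consumer through the channel.
function pi_digits(n::Integer)
    Channel{Int}() do ch
        numer, accum, denom = big(1), big(0), big(1)
        k = 0
        produced = 0
        while produced < n
            # Fold the next term of the series into the state.
            k += 1
            k2 = 2k + 1
            accum = (accum + 2numer) * k2
            denom *= k2
            numer *= k
            numer > accum && continue

            # A digit is safe to emit once both estimates agree.
            d = (3numer + accum) ÷ denom
            d == (4numer + accum) ÷ denom || continue

            put!(ch, Int(d))            # the "yield" step
            produced += 1

            # Remove the emitted digit from the state.
            accum = (accum - denom * d) * 10
            numer *= 10
        end
    end
end

println(join(pi_digits(10)))  # prints 3141592653
```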

$ time julia -O2 --startup-file=no pidigits.jl 10000 >/dev/null

real	0m1,233s
user	0m1,628s
sys	0m0,477s

julia> GC.gc(); GC.gc(); GC.gc(); GC.gc(); GC.enable(false); @benchmark (GC.enable(false); pidigits(10000, devnull); GC.enable(true);) gcsample=true
BenchmarkTools.Trial: 
  memory estimate:  859.20 MiB
  allocs estimate:  75584
  --------------
  minimum time:     920.721 ms (0.00% GC)
  median time:      930.629 ms (0.00% GC)
  mean time:        930.308 ms (0.00% GC)
  maximum time:     939.252 ms (0.00% GC)
  --------------
  samples:          4
  evals/sample:     1
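
Since BenchmarkTools is itself a package, here is a rough package-free sketch of the same idea: time a call with the GC switched off, using only Base. `time_without_gc` is a hypothetical helper name, and a single `@elapsed` measurement is of course much noisier than what `@benchmark` reports.

```julia
# Time one call to `f` with garbage collection disabled, then restore it.
function time_without_gc(f)
    GC.gc()              # start from a clean heap
    GC.enable(false)     # no collections during the timed call
    try
        return @elapsed f()
    finally
        GC.enable(true)  # re-enable GC even if `f` throws
    end
end

# Example with a small stand-in workload (not pidigits itself).
t = time_without_gc(() -> sum(rand(1_000)))
println("elapsed: ", t, " s")
```

In the REPL session above the equivalent call would be `time_without_gc(() -> pidigits(10000, devnull))`, assuming `pidigits` is already defined there.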