I think I have answered my own question, which is related to the discussion here: Thread affinitization: pinning Julia threads to cores
If I use ThreadPinning.jl to run only 48 threads, all pinned to one of the two CPUs, the Octavian.jl time drops to 1.5-1.6 s.
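
For anyone who wants to try the same thing, here is a rough sketch of what I mean. It assumes Julia was started with `julia -t 48` and that CPU IDs 0 to 47 all sit on the first socket, which may not be true on your machine, so check the output of `threadinfo()` first. The matrix size is a made-up placeholder, not the one from my benchmark.

```julia
using ThreadPinning  # pinthreads, threadinfo
using Octavian       # matmul!

# Assumption: CPU IDs 0:47 are the 48 cores of the first socket.
# Verify against the layout printed by threadinfo() and adjust if needed.
pinthreads(0:47)
threadinfo()

# Hypothetical problem size; replace with your own.
n = 4_000
A = rand(n, n)
B = rand(n, n)
C = similar(A)

Octavian.matmul!(C, A, B)        # warm-up run (includes compilation)
@time Octavian.matmul!(C, A, B)  # timed run
```

Keeping all threads on one socket avoids traffic between the two CPUs, which seems to be what was slowing things down in my case.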