I was a bit confused by the docs:
In a multi-threaded program, you should generally use different RNG objects from different threads or tasks in order to be thread-safe. However, the default RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter).
The docs recommend using a separate explicit RNG for each task, but they don’t say why. Since the default RNG is now thread-safe, what is the reason? Performance?
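For concreteness, here is a minimal sketch of what I understand "different RNG objects from different tasks" to mean. The helper name `per_task_sums` and the `seed + i` seeding scheme are my own illustrative assumptions, not something the docs prescribe, and the `Xoshiro(seed)` constructor assumes Julia 1.7 or later.

```julia
using Random

# Hypothetical helper: run `ntasks` tasks, each summing `n` uniform draws
# from its own explicitly seeded Xoshiro. No RNG object is shared between tasks.
function per_task_sums(seed::Integer, ntasks::Integer, n::Integer)
    tasks = map(1:ntasks) do i
        Threads.@spawn begin
            rng = Xoshiro(seed + i)  # illustrative seeding scheme (an assumption)
            sum(rand(rng) for _ in 1:n)
        end
    end
    return fetch.(tasks)
end

a = per_task_sums(1234, 4, 1000)
b = per_task_sums(1234, 4, 1000)
println(a == b)
```

Because each task's stream depends only on its own seed, the two calls should produce identical results regardless of how the tasks are scheduled.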
I ran a simple test on a 1.8 nightly of Julia and was also confused by the results. Using an explicit Xoshiro RNG is slower on average than using an explicit TaskLocalRNG. (The explicit TaskLocalRNG performs the same as not passing any RNG and using the default.)
julia> using BenchmarkTools, Random

julia> function f(rng_call, n)
           Threads.@threads for _ in 1:Threads.nthreads()
               rng = rng_call()
               sum(rand(rng) for _ in 1:n)
           end
       end
f (generic function with 1 method)
julia> Threads.nthreads()
8
julia> @benchmark f(Random.Xoshiro, 10^6)
BenchmarkTools.Trial: 2151 samples with 1 evaluation.
Range (min … max): 981.468 μs … 15.590 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.306 ms ┊ GC (median): 0.00%
Time (mean ± σ): 2.313 ms ± 1.369 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
▇▃▂▁▁ ▁█▆▃▅▁▁▂ ▁▄▃ ▁
████████████████████▆▄▅▄▆▃▄▃▁▃▃▃▁▃▃▁▃▁▄▁▁▄▄▄▁▃▃▃▁▄▄▃▃▄▄▃▄▃▁▄ █
981 μs Histogram: log(frequency) by time 9.15 ms <
Memory estimate: 5.00 KiB, allocs estimate: 81.
julia> @benchmark f(Random.TaskLocalRNG, 10^6)
BenchmarkTools.Trial: 2902 samples with 1 evaluation.
Range (min … max): 1.026 ms … 16.076 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.088 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.716 ms ± 998.001 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
██▁ ▂▅▅▃▃ ▁▃▂▁ ▁ ▁
███▆▅▃▃▄▄▄▄▁▃▄▅▃▃▃▄▁▁▁▁█▇▇▄▆████████████▅▆▆▃▄▅▆▆▇▄▁▃▆▅▆▆▇██ █
1.03 ms Histogram: log(frequency) by time 3.76 ms <
Memory estimate: 4.50 KiB, allocs estimate: 65.
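As a sanity check on the parenthetical above, the following sketch compares the stream produced by the default RNG with the one produced by an explicit `TaskLocalRNG()`. It assumes Julia 1.7 or later, where the default RNG is task-local, and it only checks that the two produce the same values, not that they run at the same speed.

```julia
using Random

# Seed the current task's RNG, then draw through the implicit default.
Random.seed!(42)
x = rand(3)

# Reseed identically, then draw through an explicit TaskLocalRNG handle.
Random.seed!(42)
y = rand(TaskLocalRNG(), 3)

println(x == y)
println(Random.default_rng() isa TaskLocalRNG)
```

If both lines print `true`, then passing `TaskLocalRNG()` explicitly is just another way of reaching the same per-task state that `rand()` uses by default.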