I was a bit confused by the docs:
In a multi-threaded program, you should generally use different RNG objects from different threads or tasks in order to be thread-safe. However, the default RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter).
They recommend generally using explicit RNGs for each task, but don't give a reason why. Since the default RNG is now thread-safe, what is the reason? Performance?
I ran a simple test on a 1.8 nightly of Julia and was also confused by the results. Using an explicit Xoshiro RNG is slower on average than an explicit TaskLocalRNG. (The explicit TaskLocalRNG performs the same as not passing any RNG and using the default.)
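julia> using Random, BenchmarkTools

julia> function f(rng_call, n)
           # One iteration per thread; each iteration builds its own RNG via rng_call()
           Threads.@threads for _ in 1:Threads.nthreads()
               rng = rng_call()
               sum(rand(rng) for _ in 1:n)
           end
       end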
f (generic function with 1 method)
julia> Threads.nthreads()
8
julia> @benchmark f(Random.Xoshiro, 10^6)
BenchmarkTools.Trial: 2151 samples with 1 evaluation.
 Range (min … max):  981.468 μs … 15.590 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.306 ms                ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.313 ms ±  1.369 ms    ┊ GC (mean ± σ):  0.00% ± 0.00%
 981 μs       Histogram: log(frequency) by time       9.15 ms <
Memory estimate: 5.00 KiB, allocs estimate: 81.
julia> @benchmark f(Random.TaskLocalRNG, 10^6)
BenchmarkTools.Trial: 2902 samples with 1 evaluation.
 Range (min … max):  1.026 ms … 16.076 ms    ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.088 ms                ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.716 ms ± 998.001 μs   ┊ GC (mean ± σ):  0.00% ± 0.00%
 1.03 ms      Histogram: log(frequency) by time       3.76 ms <
Memory estimate: 4.50 KiB, allocs estimate: 65.
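
For reference, here is a minimal sketch of what I understand "a different RNG object per task" to mean in practice. It is not taken from the docs: `per_task_sums` is a hypothetical helper name, and seeding each task with `seed + i` is only an illustrative choice, not a recommendation for producing independent streams. The one thing it shows is that explicitly seeded per-task RNGs give reproducible results regardless of how the tasks are scheduled.

```julia
using Random

# Hypothetical helper (not from the Julia docs): spawn `ntasks` tasks, each
# owning its own explicitly seeded Xoshiro, and return the per-task sums.
function per_task_sums(seed::Integer, ntasks::Integer, n::Integer)
    tasks = map(1:ntasks) do i
        Threads.@spawn begin
            rng = Xoshiro(seed + i)  # assumption: simple offset seeding, for illustration only
            s = 0.0
            for _ in 1:n
                s += rand(rng)       # only this task ever touches `rng`
            end
            s
        end
    end
    return fetch.(tasks)
end

a = per_task_sums(1234, 4, 10^5)
b = per_task_sums(1234, 4, 10^5)
a == b  # the same seeds produce the same per-task sums on every run
```

Unlike `f` above, this returns the sums, so the work is actually observable from outside the loop. It says nothing about which RNG is faster.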