I was a bit confused by the docs:
"In a multi-threaded program, you should generally use different RNG objects from different threads or tasks in order to be thread-safe. However, the default RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter)."
They recommend generally using explicit RNGs for each task, but don't give a reason why. Since the default RNG is now thread-safe, what is the reason? Performance?
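If I read the docs correctly, the recommended pattern would look something like the sketch below: each task builds its own `Xoshiro` and never shares it. The helper name `per_task_sums`, the `seed` keyword, and the seeding scheme (base seed plus task index) are my own illustrative assumptions, not something the docs prescribe.

```julia
using Random

# Hypothetical helper: one explicit RNG per task, never shared between tasks.
function per_task_sums(ntasks::Integer, n::Integer; seed::Integer = 1234)
    tasks = map(1:ntasks) do i
        Threads.@spawn begin
            # Assumption: a simple per-task seed offset, for illustration only.
            rng = Xoshiro(seed + i)
            sum(rand(rng) for _ in 1:n)
        end
    end
    return fetch.(tasks)  # wait for every task and collect its sum
end

sums = per_task_sums(4, 1_000)
```

With explicit seeds like this, each task's stream should be reproducible regardless of how the tasks are scheduled, which may be part of the motivation. Whether it also helps performance is what I tried to measure.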
I ran a simple test on a 1.8 nightly of Julia and was also confused by the results. Using an explicit Xoshiro RNG is slower on average than an explicit TaskLocalRNG. (The explicit TaskLocalRNG performs the same as not passing any RNG and using the default.)
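One thing that may matter for the comparison: the two constructors I pass in do different amounts of work. As far as I can tell, `Xoshiro()` creates and seeds a fresh generator on every call, while `TaskLocalRNG()` is only a handle to state that already lives in the current task. A quick check, using nothing beyond `Random` itself (the variable names are just for illustration):

```julia
using Random

# Two calls to Xoshiro() give two separate mutable generator objects.
a, b = Xoshiro(), Xoshiro()
distinct_xoshiro = a !== b

# TaskLocalRNG() carries no state of its own, so every instance is identical.
same_handle = TaskLocalRNG() === TaskLocalRNG()

# On Julia 1.7 and later, the default RNG is that same task-local handle.
default_is_task_local = Random.default_rng() isa TaskLocalRNG
```

If that is right, the benchmark also times the construction and seeding of a new `Xoshiro` in every task, not only the calls to `rand`.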
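julia> using BenchmarkTools, Random

julia> function f(rng_call, n)
           # One loop iteration per thread; each builds its RNG via rng_call().
           Threads.@threads for _ in 1:Threads.nthreads()
               rng = rng_call()
               sum(rand(rng) for _ in 1:n)  # result is discarded; only timing matters
           end
       end
f (generic function with 1 method)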
julia> Threads.nthreads()
8
julia> @benchmark f(Random.Xoshiro, 10^6)
BenchmarkTools.Trial: 2151 samples with 1 evaluation.
 Range (min … max):  981.468 μs … 15.590 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.306 ms                ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.313 ms ±  1.369 ms    ┊ GC (mean ± σ):  0.00% ± 0.00%
 981 μs  Histogram: log(frequency) by time  9.15 ms <
Memory estimate: 5.00 KiB, allocs estimate: 81.
julia> @benchmark f(Random.TaskLocalRNG, 10^6)
BenchmarkTools.Trial: 2902 samples with 1 evaluation.
 Range (min … max):  1.026 ms … 16.076 ms    ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.088 ms                ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.716 ms ± 998.001 μs   ┊ GC (mean ± σ):  0.00% ± 0.00%
 1.03 ms  Histogram: log(frequency) by time  3.76 ms <
Memory estimate: 4.50 KiB, allocs estimate: 65.