rand(RGB{Float32},1080,1980) is now faster than everything else recommended in this thread.
It’s been fixed in a (not yet released) commit of ColorTypes.jl: Use samplerbased Random API (#222), JuliaGraphics/ColorTypes.jl@5e3aeb5.
Consolidated benchmarks of every method from this thread, run after ]develop ColorTypes and a Julia restart (or presumably ]add ColorTypes once 0.12 is released):
using RandomNumbers, BenchmarkTools, Random, Colors
rng = Random.default_rng()
rng_xor = RandomNumbers.Xorshifts.Xoroshiro128Star()
f1() = rand(RGB{Float32},1080,1980)
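# Explicit loop; every rand() call uses the default generator implicitly.
# rand() yields Float64, which is converted to Float32 when stored in M.
function f2()
    M = Matrix{RGB{Float32}}(undef, 1080, 1980)
    @inbounds for j in 1:1980
        for i in 1:1080
            M[i,j] = RGB(rand(), rand(), rand())
        end
    end
    return M
end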
# Same loop as f2, but the generator is passed in explicitly.
function f2_rng(rng)
    M = Matrix{RGB{Float32}}(undef, 1080, 1980)
    @inbounds for j in 1:1980
        for i in 1:1080
            M[i,j] = RGB(rand(rng), rand(rng), rand(rng))
        end
    end
    return M
end
# Broadcast version of f2; rand() is evaluated anew for each element.
function f3()
    M = Matrix{RGB{Float32}}(undef, 1080, 1980)
    @. M = RGB(rand(), rand(), rand())
    return M
end
function f4(rng)
    M = Matrix{RGB{Float32}}(undef, 1080, 1980)
    @. M = RGB(rand(rng), rand(rng), rand(rng))
    return M
end
@btime rand(RGB{Float32},1080,1980) seconds=1
# 9.539 ms (4 allocations: 24.47 MiB)
@btime rand($rng, RGB{Float32},1080,1980) seconds=1
# 9.704 ms (4 allocations: 24.47 MiB)
@btime rand($rng_xor, RGB{Float32},1080,1980) seconds=1
# 12.198 ms (4 allocations: 24.47 MiB)
@btime f1() seconds=1
# 9.622 ms (4 allocations: 24.47 MiB)
@btime f2() seconds=1
# 26.670 ms (2 allocations: 24.47 MiB)
@btime f2_rng($rng) seconds=1
# 11.479 ms (2 allocations: 24.47 MiB)
@btime f2_rng($rng_xor) seconds=1
# 11.054 ms (2 allocations: 24.47 MiB)
@btime f3() seconds=1
# 27.571 ms (2 allocations: 24.47 MiB)
@btime f2_rng($rng) seconds=1
# 11.446 ms (2 allocations: 24.47 MiB)
@btime f2_rng($rng_xor) seconds=1
# 11.022 ms (2 allocations: 24.47 MiB)
How we got here:
Calls to rand() are thread-safe because each thread is assigned its own MersenneTwister, which comes at a slight performance penalty whenever the thread’s generator is selected.
Vectorized rand(1080, 1980) calls look up the thread’s generator first with default_rng(), and pass that generator along so that the lookup happens only once.
rand(RGB{Float32}, 1080, 1980) used to be slow because ColorTypes.jl did not explicitly pass random number generators around internally, resulting in 1080*1980 (about 2.1 million) unnecessary calls to default_rng().
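
To make the difference concrete, here is a minimal sketch of the two patterns using only the Random standard library. The helper names fill_lookup_each! and fill_lookup_once! are made up for illustration and are not part of ColorTypes.jl or Random; the actual fix in ColorTypes.jl may look quite different. Whether the gap is measurable will likely depend on the Julia version, since the cost of default_rng() has changed over time.

```julia
using Random

# Hypothetical helper: fetches the default generator again for every element,
# which is the pattern described above as the cause of the slowdown.
function fill_lookup_each!(A::AbstractArray{Float32})
    for i in eachindex(A)
        A[i] = rand(Random.default_rng(), Float32)
    end
    return A
end

# Hypothetical helper: fetches the generator once (or takes one from the
# caller) and passes it along to every rand call.
function fill_lookup_once!(A::AbstractArray{Float32},
                           rng::AbstractRNG = Random.default_rng())
    for i in eachindex(A)
        A[i] = rand(rng, Float32)
    end
    return A
end

A = Matrix{Float32}(undef, 1080, 1980)
fill_lookup_each!(A)   # one generator lookup per element
fill_lookup_once!(A)   # a single generator lookup for the whole array
```

Both helpers can be timed with @btime in the same way as the functions above. Accepting the generator as an argument also lets the caller pass a seeded generator for reproducible output.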