Suppose I have an array of type e.g. Array{Float64} and I want to fill it with random bits as fast as possible. Note that that’s different from filling it with random floats between 0 and 1. A straight forward way to do it is
rand!(reinterpret(UInt64, a))
which is fine, but the fastest way I’ve found to do it is
GC.@preserve a rand!(Random.UnsafeView{UInt64}(pointer(a), length(a)))
I’m a little torn, because I want my package to have the best performance possible, but I also want my package to be reliable without the maintenance burden that comes with using private internals.
Is it possible to get the performance of the second method using only the public interface?
I’m working on a package for the ziggurat algorithm, Ziggurats.jl. It’s the same algorithm used by randn and randexp, but Ziggurats.jl allows user specified probability density functions. Without going into too many algorithmic details, to make one sample you generate a UInt64, then do some table look ups, math, etc and 99% of the time you get a result otherwise you regenerate the UInt64 and retry. When producing values in bulk, rather than generating one UInt at a time, it can be faster to just fill the array with random bits and use those as the first UInt tried for each sample. So, in Random.jl randn and randexp have lines something like this:
GC.@preserve A rand!(rng, UnsafeView{UInt64}(pointer(A), length(A)))
for i in eachindex(A)
@inbounds A[i] = _ziggurat_rand(rng, reinterpret(UInt64, A[i]))
end
Where 1% of the time _ziggurat_rand will call itself recursively with a new UInt _ziggurat_rand(rng, rand(UInt64))
I could do the same thing, but I don’t want Random.jl to change the interface out from under me.