PhiloxRNG.jl is a package for generating random numbers on both the CPU and GPU. It will generate similar numbers on all devices (results are not exactly the same when sampling floating point distributions due to fast math differences).
The underlying algorithm is a 10 round Philox4x32 combined with a fast boxmuller transformation for sampling the normal distribution.
Performance is currently significantly better than randn! Float32, but slower in other cases.
Benchmarks
Julia 1.12.5, CUDA 5.11.0, AMD Ryzen 7 9800X3D, NVIDIA GeForce RTX 3080.
CPU (ns/value, N = 100,000,000)
| Function | PhiloxRNG.jl | Random.jl |
|---|---|---|
rand F32 |
0.791 | 0.522 |
rand F64 |
1.997 | 1.052 |
randn F32 |
1.009 | 2.114 |
randn F64 |
3.098 | 1.795 |
GPU (ns/value, N = 100,000,000)
| Function | PhiloxRNG.jl | CUDA.jl |
|---|---|---|
rand F32 |
0.006 | 0.006 |
randn F32 |
0.007 | 0.032 |
Random123.jl also implements the philox family of RNGs. The main difference is the API. Random123.jl uses a mutable struct AbstractRNG interface, while PhiloxRNG.jl uses pure functions. This can make PhiloxRNG.jl easier to use in some situations.