Customize a random function to sample 3 out of a list of 4097 real numbers

I need a very fast random sampler to choose `n` numbers from a precomputed 1D array, say `X`. We can assume that `n` is much smaller than `length(X)`. I have tried `rand(X,n)`, `[rand(X) for i=1:n]` , but neither gives satisfactory performance, even if `n` is as small as 3. The profiling shows that the bottleneck is in Random/generation.jl line 421

``````rand(rng::AbstractRNG, sp::SamplerSimple{<:AbstractArray,<:Sampler}) =
@inbounds return sp[][rand(rng, sp.data)]
``````

then eventually line 342

``````function rand(rng::AbstractRNG, sp::SamplerRangeNDL{U,T}) where {U,T}
s = sp.s
x = widen(rand(rng, U))
m = x * s
l = m % U
if l < s
t = mod(-s, s) # as s is unsigned, -s is equal to 2^L - s in the paper
while l < t
x = widen(rand(rng, U))
m = x * s
l = m % U
end
end
(s == 0 ? x : m >> (8*sizeof(U))) % T + sp.a
end
``````

I wish to customize the sampler down to a very particular case of sampling three `Float64` out of a `Vector{Float64}`, which may look like

``````function my_rand_3(rng::AbstractRNG, X::Vector{Float64})::Vec3
...
...
return Vec3(...)
end
``````

Can anyone give me a hint on how to do that?

I am doing a simulator for vibrating atoms in a solid, this sampler is very useful and very frequently called.

IDK what your current benchmarks say, but here is an alternative version that just populates a preallocated array:

``````using Random
using BenchmarkTools

x = rand(Float64, 4079)
xx = zeros(Float64, 3)

@benchmark rand!(\$xx, \$x)
``````

On my laptop this gives

``````julia> @benchmark rand!(\$xx, \$x)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max):  17.124 ns … 46.397 ns  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     17.813 ns              ┊ GC (median):    0.00%
Time  (mean ± σ):   17.937 ns ±  0.669 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

▂▄▆█▆▃▁
▂▂▂▂▃▄▆▇████████▇▆▅▅▅▅▄▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂ ▃
17.1 ns         Histogram: frequency by time        20.4 ns <

Memory estimate: 0 bytes, allocs estimate: 0.
``````
3 Likes

Is the sampling with or without replacement?
Also, the subject-line specifies sampling 3 out of 4097, these are specific numbers and a method can be optimized for them, or is the method required more general?

2 Likes

What a cute exclamation mark! Thank you so much!

The number 3 comes from the demand of generating random three-dimensional vector. The number 4079 is the size of the pre-computed amplitudes, which satisfy some distribution. A more refined pre-computation would yield more numbers.

I think the sampling is just to take and consume numbers, without modification of the pool.