I need a very fast random sampler to choose n numbers from a precomputed 1D array, say X. We can assume that n is much smaller than length(X). I have tried rand(X, n) and [rand(X) for i = 1:n], but neither gives satisfactory performance, even when n is as small as 3. Profiling shows that the bottleneck is in Random/generation.jl, line 421:

function rand(rng::AbstractRNG, sp::SamplerRangeNDL{U,T}) where {U,T}
    s = sp.s
    x = widen(rand(rng, U))
    m = x * s
    l = m % U
    if l < s
        t = mod(-s, s) # as s is unsigned, -s is equal to 2^L - s in the paper
        while l < t
            x = widen(rand(rng, U))
            m = x * s
            l = m % U
        end
    end
    (s == 0 ? x : m >> (8*sizeof(U))) % T + sp.a
end

I would like to specialize the sampler for the very particular case of sampling three Float64 values from a Vector{Float64}, which might look like

function my_rand_3(rng::AbstractRNG, X::Vector{Float64})::Vec3
    ...
    ...
    return Vec3(...)
end
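One way such a function could be filled in, as a rough sketch only: draw each index by scaling a uniform Float64 instead of going through the range sampler. Here Vec3 is a hypothetical stand-in defined as a plain tuple (the real Vec3 type is not shown in the post), and rand_index is a helper name made up for this example. The scaling trick is very slightly non-uniform, which is negligible when length(X) is a few thousand, but it is not the exact method Base uses.

```julia
using Random

# Hypothetical stand-in for the poster's Vec3 type (assumption).
const Vec3 = NTuple{3,Float64}

# Hypothetical helper: map a uniform Float64 in [0, 1) to an index in 1:n.
# Slightly biased compared to rand(rng, 1:n), but avoids the rejection loop.
@inline function rand_index(rng::AbstractRNG, n::Int)
    return unsafe_trunc(Int, rand(rng) * n) + 1
end

# Draw three entries of X independently (with replacement).
# Bounds checks are left on; X must be non-empty.
function my_rand_3(rng::AbstractRNG, X::Vector{Float64})::Vec3
    n = length(X)
    return (X[rand_index(rng, n)], X[rand_index(rng, n)], X[rand_index(rng, n)])
end
```

Calling it would look like my_rand_3(Random.default_rng(), X). Whether this actually beats rand(X, 3) on a given machine is something to check with a benchmark rather than assume.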

Is the sampling with or without replacement?
Also, the subject line specifies sampling 3 out of 4097. These are specific numbers, and a method can be optimized for them. Or does the method need to be more general?

The number 3 comes from the need to generate a random three-dimensional vector. The number 4079 is the size of the precomputed array of amplitudes, which follow some distribution. A more refined precomputation would yield more numbers.

I think the sampling just takes numbers and consumes them, without modifying the pool.
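For comparison, if three distinct entries were wanted instead (sampling without replacement), a minimal sketch could redraw an index on collision, which is cheap when the pool is much larger than 3. The name rand_3_distinct is made up for this example, and it returns a plain tuple rather than the Vec3 type from the post.

```julia
using Random

# Hypothetical helper: pick three entries of X at pairwise distinct positions.
# The pool X itself is never modified; only the indices are kept distinct.
function rand_3_distinct(rng::AbstractRNG, X::Vector{Float64})
    n = length(X)
    n >= 3 || throw(ArgumentError("X must have at least 3 elements"))
    i = rand(rng, 1:n)
    j = rand(rng, 1:n)
    while j == i                 # redraw until j differs from i
        j = rand(rng, 1:n)
    end
    k = rand(rng, 1:n)
    while k == i || k == j       # redraw until k differs from both
        k = rand(rng, 1:n)
    end
    return (X[i], X[j], X[k])
end
```

With a pool of a few thousand entries the redraw loops almost never run, so the cost is close to three plain index draws.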