Sampling without replacement

rand(1:20, 3)

gives three random integers from 1:20, sampled with replacement. I see there are some threads on GitHub about this issue, but I am not sure what the recommended way is to sample without replacement in Julia 0.5.

See the StatsBase package: http://juliastats.github.io/StatsBase.jl/stable/sampling.html#Sampling-API-1

Without using external packages you can use randperm and take the first three elements:

randperm(20)[1:3]

But this may not be very efficient, since it first has to create an array of 20 elements.
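
For readers on Julia 0.7 or later, here is a minimal sketch of the same approach with the import it now needs. The `items` vector is a hypothetical example collection, not something from the original question; the second half just shows that the trick carries over to arbitrary vectors by permuting their indices.

```julia
using Random  # randperm lives in the Random standard library on Julia 0.7 and later

# Shuffle the indices 1:20 and keep the first three; the values are distinct by construction.
picks = randperm(20)[1:3]

# The same idea for an arbitrary vector: index it with a truncated permutation.
# `items` is a hypothetical example collection.
items = ["a", "b", "c", "d", "e"]
chosen = items[randperm(length(items))[1:2]]
```

The cost is still proportional to the full collection size, which is the inefficiency mentioned above.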

You can use splice! to pick random elements and remove them from the collection:

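function sample_wo_repl!(A, n)
    # Julia 1.x constructor; on Julia 0.5 this was Array{eltype(A)}(n)
    sample = Vector{eltype(A)}(undef, n)
    for i in 1:n
        # splice! removes the chosen element from A, so it cannot be drawn again
        sample[i] = splice!(A, rand(eachindex(A)))
    end
    return sample
end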
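
Note that the function above modifies its argument and therefore needs a mutable vector rather than a range such as 1:20. If that is not wanted, a non-mutating variant could look roughly like the sketch below. It swaps in a different technique, a partial Fisher-Yates shuffle on a working copy, and the name `sample_wo_repl` (without the `!`) is a hypothetical helper, not something from Base or StatsBase.

```julia
# Hypothetical helper: draw n distinct elements without modifying the input.
function sample_wo_repl(A, n)
    n <= length(A) || throw(ArgumentError("cannot draw $n items from $(length(A))"))
    pool = collect(A)              # working copy; also turns ranges into a Vector
    for i in 1:n
        j = rand(i:length(pool))   # random position among the not-yet-chosen items
        pool[i], pool[j] = pool[j], pool[i]
    end
    return pool[1:n]               # the first n slots hold the draw
end

sample_wo_repl(1:20, 3)
```

Only the first n positions are shuffled, so the loop does n swaps, although the copy itself still costs time proportional to the collection size.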

There is also randsubseq in Base. This efficiently samples an array without replacement, but with a given probability (per element) rather than a given number of elements. For example, randsubseq(1:20, 0.15) produces 3 elements on average.
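
A small sketch of what that looks like in practice, assuming Julia 0.7 or later (where the function is loaded from the Random standard library). The variable names are just for illustration.

```julia
using Random  # randsubseq lives in the Random standard library on Julia 0.7 and later

# Each element of 1:20 is kept independently with probability 0.15,
# so the result length varies from call to call (3 on average).
subset = randsubseq(1:20, 0.15)

# Repeating the draw shows the varying sizes.
sizes = [length(randsubseq(1:20, 0.15)) for _ in 1:5]
```

The selected elements keep their original order, and the result can be empty, so this is not a drop-in replacement when exactly three elements are required.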

Update: randperm and randsubseq have been moved from Base to the Random standard library, so they now need `using Random`.

It’s still probably better to use StatsBase to solve this problem.

(Commenting here because this thread is the first result when I google “Julia sample without replacement”)

Update: the StatsBase URL has changed (again): Sampling from Population · StatsBase.jl

I can’t edit the above post anymore (when I click edit it only shows the history) - seems there is an editing limit?

This is pretty handy, really. Thanks!

Didn’t see the following in this thread:

julia> using StatsBase

julia> sample(1:20, 3; replace=false)
3-element Vector{Int64}:
  5
 20
 18

and it feels like this is the “canonical” solution.
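
The same call also works on arbitrary vectors and accepts an explicit random number generator as its first argument. The sketch below assumes StatsBase is installed; the `people` vector and the seed are made-up example values.

```julia
using Random, StatsBase   # StatsBase is an external package and must be installed first

rng = MersenneTwister(42)                     # seeded generator so the draw is reproducible
people = ["ada", "bob", "cy", "dee", "eli"]   # hypothetical example data

# Three distinct elements in random order
picked = sample(rng, people, 3; replace=false)

# ordered=true returns the draw in the order the elements appear in `people`
picked_in_order = sample(rng, people, 3; replace=false, ordered=true)
```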
