Efficient way of creating a vector of length L with unique elements

Hi,

I am looking to fill a vector of a desired length with randomly chosen, unique elements. Currently I have the code below, which has the desired effect; however, I suspect I can speed it up. If anyone has any ideas, it would be appreciated.

function randomly_filling_vector(R1, R2, L)
    
    out = [Vector{Int64}(undef,3) for _ in 1:L]
    condit = false
    while condit == false  
        for i ∈ eachindex(out)
            out[i] = [rand(DiscreteUniform(1, R1)), rand(DiscreteUniform(2, R2-1)), rand(DiscreteUniform(1, 6))]
        end
        unique = unique!(out)
        if length(unique) == L
            condit = true
            return out
        end
    end
end

This is called “random sampling without replacement” and is implemented e.g. by StatsBase.sample.

See also Sampling from a list of integers without repetition and Sampling without replacement.
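
As a minimal sketch of what sampling without replacement looks like with StatsBase.sample (the range 1:20 and the count 5 are arbitrary values chosen for illustration, and StatsBase is assumed to be installed):

```julia
using StatsBase  # provides sample

# Draw 5 distinct integers from 1:20; replace=false forbids repeats.
picks = sample(1:20, 5; replace=false)

# Every draw is distinct and comes from the requested range.
@assert length(picks) == 5
@assert allunique(picks)
@assert all(in(1:20), picks)
```

Note that sample throws an error if more elements are requested than the collection holds, so the count has to stay within the size of the range.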


There is a problem with the algorithm you implemented. The line

        unique = unique!(out)

uses unique!, which mutates the out array. When out contains repeated elements the array gets shortened, and it is never lengthened again, because eachindex(out) in the for loop then iterates over the shortened array. The length check can therefore never succeed again, and the function enters an infinite loop.
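
To see the difference between the mutating and non-mutating versions in isolation, here is a small sketch using only Base functions (the sample data is made up for illustration):

```julia
# Three entries, one of them a duplicate.
out = [[1, 2], [1, 2], [3, 4]]
unique!(out)                     # removes the duplicate in place
@assert length(out) == 2         # the array itself is now shorter
@assert collect(eachindex(out)) == [1, 2]  # a refill loop would only touch two slots

# The non-mutating unique returns a new array and leaves its input alone.
kept = [[1, 2], [1, 2], [3, 4]]
uniq = unique(kept)
@assert length(kept) == 3
@assert length(uniq) == 2
```

Once the array has shrunk, a loop over eachindex(out) can never bring it back to the original length, which is why the length check never passes again.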

So, first, fixed OP code:

using Distributions  # provides DiscreteUniform

function randomly_filling_vector(R1, R2, L)
    out = [Vector{Int64}(undef,3) for _ in 1:L]
    while true
        # Redraw all L triples on every attempt.
        for i ∈ eachindex(out)
            out[i] = [rand(DiscreteUniform(1, R1)), rand(DiscreteUniform(2, R2-1)), rand(DiscreteUniform(1, 6))]
        end
        # unique (without !) returns a new array and leaves out at length L.
        uniq = unique(out)
        if length(uniq) == L
            return out
        end
    end
end

But, as stevengj mentioned, this is a job for sample from the StatsBase package. The following is an efficient way to implement it:

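using StatsBase  # provides sample

function randomly_filling_vector2(R1, R2, L)
    S = R1*(R2-2)*6  # number of distinct triples
    # Sample zero-based codes 0:S-1 so that the decoding below
    # covers every triple exactly once and stays within range.
    map(sample(0:S-1, L; replace=false)) do s
        d, m = divrem(s, 6)
        c3 = m+1  # in 1:6
        d, m = divrem(d, R2-2)
        c2 = m+2  # in 2:R2-1
        c1 = d+1  # in 1:R1
        [c1, c2, c3]
    end
end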
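
The same idea (number every possible triple, sample the numbers without replacement, decode each number back into a triple) can also be sketched with CartesianIndices doing the decoding instead of hand-written divrem arithmetic. The name randomly_filling_vector3 is hypothetical and not from the thread, and the ordering of the triples differs from the divrem version, which does not matter for uniform sampling:

```julia
using StatsBase  # provides sample

# Hypothetical variant: let CartesianIndices decode a linear index into a triple.
function randomly_filling_vector3(R1, R2, L)
    # One axis per component: 1:R1, 2:R2-1 and 1:6.
    grid = CartesianIndices((1:R1, 2:R2-1, 1:6))
    # Sample L distinct linear indices into the grid, then decode each one.
    idx = sample(1:length(grid), L; replace=false)
    return [collect(Tuple(grid[i])) for i in idx]
end

v = randomly_filling_vector3(4, 5, 10)

# All triples are distinct and each component lies in its intended range.
@assert length(v) == 10
@assert allunique(v)
@assert all(t -> 1 <= t[1] <= 4 && 2 <= t[2] <= 4 && 1 <= t[3] <= 6, v)
```

As with any sampling without replacement, L must not exceed the number of possible triples, R1*(R2-2)*6, otherwise sample throws an error.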

Amazing. Thank you for your help.
