How do I create an array with random unique numbers in a specific range?

Hi everyone!

I have been googling and spending time on Stack Exchange, but to no avail. I hope someone on here can help me out.
What I need is an array with 30 UNIQUE random integers between 1 and 40. Somehow this seems to be very complicated and I cannot do it for the life of me XD.
Does anyone have any suggestions on how to go about that?


randperm is what you want.

julia> using Random

julia> randperm(40)[1:30]
30-element Vector{Int64}:
 10
 25
 28
 23
  5
 36
 21
 16
 24
 39
  1
 19
 27
 40
  8
 15
 11
 17
 35
 34
 22
 38
  9
  4
  3
 12
  6
 29
  7
 14

Or shuffle

julia> shuffle(1:40)[1:30]
30-element Vector{Int64}:
  4
 17
  9
 25
 16
 31
  8
 29
 30
 33
 18
 35
 38
 26
 22
  1
 36
  6
 12
 14
 37
 10
 40
 27
 39
 21
  5
 20
 32
 19

Edit: I completely edited this post XD. Thanks so much, very helpful!

In order to draw 10 samples from 41:70, use shuffle(41:70)[1:10] or (41:70)[randperm(length(41:70))][1:10].

A short note on why this works in case it isn’t obvious:

shuffle takes the values it’s given (1 through 40) and puts them in a random order. If you now take the first 30 ([1:30]) you make sure that you get

  • 30 values (duh),

  • which are between 1 and 40 (because shuffle is given only those values),

  • and are unique because every value only exists once in the first place.
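
The three points above can be checked directly. This is a minimal sketch that assumes only the Random standard library; the variable name picked is just illustrative:

```julia
using Random

picked = shuffle(1:40)[1:30]

# The three properties from the list above, checked one by one
@show length(picked) == 30      # 30 values
@show all(in(1:40), picked)     # all of them lie between 1 and 40
@show allunique(picked)         # no value appears twice
```

All three checks should hold on every run, whatever the random order happens to be.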


Worth knowing in this context is how to make the random choice reproducible by choosing a seed. In Julia this can be done by creating an RNG object and passing it to the random functions. The code below demonstrates it:

using Random

# Initialize the RNG with a seed
seed = 12
rng = MersenneTwister(seed) # rng -> Random Number Generator

# Generate some random numbers
random_numbers = rand(rng, 5)      # Generate 5 random floats in range 0 to 1.0
random_permutation = randperm(rng, 9)  # Generate a random permutation of 1:9

println("Random numbers: ", random_numbers)
println("Random permutation: ", random_permutation)

The resulting output of this code is always the same because of the seed:

Random numbers: [0.25850961158904906, 0.9692536801431555, 0.4741774214537995, 0.4345063919322494, 0.965789950130105]
Random permutation: [3, 7, 4, 1, 8, 2, 9, 6, 5]

If it matters, it might be more efficient to do:

using StatsBase
sample(1:40, 30; replace=false)

How do I set the seed value when using StatsBase?

Nothing wrong with making your own RNG object but if you’re looking for convenience you can use a global generator.

using Random
Random.seed!(13)
shuffle(1:40)[1:30]

There is no guarantee that this will give the same result between different Julia minor versions though. If that’s important for your use case you should look into the StableRNGs package.
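
A quick way to see the effect of seeding the global generator is to reseed and draw again. This is only a sketch using the Random standard library; the names first_draw and second_draw are made up for illustration:

```julia
using Random

Random.seed!(13)
first_draw = shuffle(1:40)[1:30]

Random.seed!(13)                     # reset the global generator to the same state
second_draw = shuffle(1:40)[1:30]

# Both draws start from the same generator state, so they match
println(first_draw == second_draw)
```

Within one Julia installation this prints true; as noted above, the actual numbers may still differ between Julia versions.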


I think Random.seed!() can be used as follows:

using StatsBase, Random

# using default rng:
Random.seed!(123)
sample(1:40, 30; replace=false)

# or using specific rng:
rng = MersenneTwister(17)
Random.seed!(rng, 123)
sample(rng, 1:40, 30; replace=false)

In other words, StatsBase is built on Random and can’t be seeded independently of the Random package?

StatsBase has a lot of dependencies, including Random:
module StatsBase

import Base: length, size, isempty, values, sum, show, maximum, minimum, extrema
import Base.Cartesian: @nloops, @nref, @nextract
using Base: @irrational, @propagate_inbounds
using DataAPI
import DataAPI: describe
import DataStructures: heapify!, heappop!, percolate_down!
using SortingAlgorithms
using Missings
using LogExpFunctions: xlogx, xlogy

using Statistics
using LinearAlgebra
using Random
using Printf
using SparseArrays
import Random: rand, rand!
import LinearAlgebra: BlasReal, BlasFloat
import Statistics: mean, mean!, var, varm, varm!, std, stdm, cov, covm,
                   cor, corm, cov2cor!, unscaled_covzm, quantile, sqrt!,
                   median, middle
using StatsAPI: StatisticalModel, RegressionModel
import StatsAPI: pairwise, pairwise!, params, params!,
                 fitted, response, responsename, meanresponse, modelmatrix,
                 crossmodelmatrix, leverage, cooksdistance, residuals, predict,
                 predict!, dof_residual, coef, coefnames, coeftable, confint,
                 deviance, islinear, nulldeviance, loglikelihood, nullloglikelihood,
                 loglikelihood, loglikelihood, score, nobs, dof, mss, rss,
                 informationmatrix, stderror, vcov, weights, isfitted, fit, fit!,
                 aic, aicc, bic, r2, r², adjr2, adjr²
...

Since 2023, googling and spending time on Stack Overflow, other Stack Exchange sites, and similar Q&A programming platforms is, in many simple cases like this one, much more time-consuming and less fruitful than using LLMs.
Below is the chat session I had with ChatGPT (model GPT-4o) on this subject, as evidence of the above statement (a well-formatted web page with the content listed below is available at the following link: ChatGPT).
Please note that the first code proposal provided by ChatGPT avoids generating more random numbers than necessary. For 30 out of 40 this is not worth mentioning, but for 300 out of 40000000 it is the only sensible way to do it. In other words, the answers given so far on this platform failed to give the most generally valid way of approaching what you are after. Remember:
Answers on a Q&A platform may be misleading or mistaken. Check important info and code yourself before using it.

ME
What I need is script code written in Julia programming language creating an array with 30 UNIQUE random integers between 1 and 40. 


ChatGPT
Sure, here's a Julia script to create an array with 30 unique random integers between 1 and 40:

# Import the Random module
using Random

# Function to generate an array with 30 unique random integers between 1 and 40
function generate_unique_random_integers()
    # Generate a set of unique random integers
    unique_integers = Set{Int}()
    
    # Keep adding random integers until we have 30 unique values
    while length(unique_integers) < 30
        new_int = rand(1:40)
        push!(unique_integers, new_int)
    end
    
    # Convert the set to an array
    unique_array = collect(unique_integers)
    
    return unique_array
end

# Call the function and print the result
unique_random_integers = generate_unique_random_integers()
println(unique_random_integers)

Explanation:

  • Import the Random module: This module provides functions for generating random numbers.

  • Function Definition: generate_unique_random_integers() is defined to generate the array.

  • Set for Unique Integers: A set is used to ensure all integers are unique.

  • While Loop: The loop runs until the set contains 30 unique integers.

  • Random Integer Generation: rand(1:40) generates a random integer between 1 and 40.

  • Convert Set to Array: The set is converted to an array to meet the requirement.

  • Function Call and Output: The function is called and the result is printed.

This script ensures that the array contains 30 unique random integers between 1 and 40.

If you are not satisfied with the provided solution you may request another one:

ME
What would the code look like if the Random module performed a random permutation of the sequence 1:40 and only the first 30 items were taken from it? Would the result be well distributed, or would such an approach have some bias?


ChatGPT
Using the Random module to perform a random permutation of a sequence and then taking the first 30 items is a good approach. This method will be well-distributed without bias, as the permutation ensures each element has an equal chance of being in any position. Here is how you can do it in Julia:

# Import the Random module
using Random

# Function to generate an array with 30 unique random integers between 1 and 40 using permutation
function generate_unique_random_integers_via_permutation()
    # Create a sequence from 1 to 40
    sequence = 1:40
    
    # Perform a random permutation of the sequence
    # randperm expects a length, not a range; since the sequence is 1:40,
    # a permutation of its indices is also a permutation of its values
    permuted_sequence = randperm(length(sequence))
    
    # Take the first 30 items from the permuted sequence
    unique_array = permuted_sequence[1:30]
    
    return unique_array
end

# Call the function and print the result
unique_random_integers = generate_unique_random_integers_via_permutation()
println(unique_random_integers)

Explanation:

  • Import the Random module: Necessary for random number generation.

  • Function Definition: generate_unique_random_integers_via_permutation() generates the array.

  • Create Sequence: A sequence from 1 to 40 is created.

  • Random Permutation: randperm(length(sequence)) generates a random permutation of 1:40, which here coincides with the values of the sequence.

  • Select First 30 Items: The first 30 items from the permuted sequence are selected.

  • Function Call and Output: The function is called and the result is printed.

Distribution:
This approach uses the randperm function, which produces a random permutation of 1:40. As each element has an equal probability of appearing in any position, the result is well-distributed without bias. This method ensures that every possible combination of 30 unique integers from the set of 40 is equally likely.

Please note that the quality of the question text is very important, as the response can only be as good as the question. Don’t forget to provide all necessary details and the entire context when asking a question. Remember: the higher the quality of the question, the higher the quality of the LLM response.

And YES … the LLM response may be code which does not run at all, is misleading or wrong, or harms your system when run.
But … the same, even if less likely, is also true for responses you get on a Q&A platform or through Google.

This is the Julia Discourse and not Stack Overflow, where you are expected to show evidence of your own efforts towards an answer in order to avoid down-voting (there is no down-voting here) or closing. Feel free to just ask a question without mentioning what you have tried yourself, unless that is needed to make clear what the question is asking.
Please also be aware that since 2023 LLMs have been capable of providing code in any programming language, with voluminous and detailed explanations you can’t expect from a human respondent.

No, they didn’t miss it. The most efficient way was given by @rafael.guerra using StatsBase: sample(1:40, 30; replace=false).

The first answer given by ChatGPT is extremely inefficient because it generates the random numbers one by one and repeatedly pushes them into a set. The second answer is the same given by the helpful human people on this forum, just more verbose.
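
To get a feel for how many redraws the Set-based loop needs, here is a rough sketch that counts the rand calls. The helper draw_with_rejection and its arguments are hypothetical names introduced only for this illustration; it is not the code from the chat, just the same idea with a counter added:

```julia
using Random

# Hypothetical helper: keep drawing into a Set until it holds k values,
# and report how many rand calls that took.
function draw_with_rejection(rng::AbstractRNG, range::AbstractUnitRange, k::Integer)
    k <= length(range) ||
        throw(ArgumentError("cannot draw $k unique values from $(length(range))"))
    seen = Set{eltype(range)}()
    draws = 0
    while length(seen) < k
        push!(seen, rand(rng, range))   # duplicates are silently ignored by the Set
        draws += 1
    end
    return collect(seen), draws
end

rng = MersenneTwister(1)
picked, draws = draw_with_rejection(rng, 1:40, 30)
println("unique values: ", length(picked), ", rand calls: ", draws)
```

For 30 out of 40 the loop needs roughly 54 calls on average, because late draws mostly hit values that are already in the set. When only a few values are drawn from a very large range, collisions become rare and the number of calls stays close to k.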

Let’s also remember that LLMs are notoriously worse on less popular programming languages like Julia. And sometimes, people just want an answer, not a voluminous and detailed explanation.

I think most people who come here are aware of the existence of ChatGPT. And perhaps if they come here, it’s because they want something they cannot get out of ChatGPT: the advice of actual people, sometimes world-class experts, with specific Julia programming experience.
However this focused thread is not the place to debate the merits of LLMs, so I’ll stop here, and I invite you to do the same.


Yes, exactly that. I don’t really want to use LLMs, especially when I am using a programming language that I don’t know so well yet. It’s hard for me to judge if the LLM’s answer would be the most efficient or even correct given how little experience I have using Julia.
In general, I am not a big fan of ChatGPT, but that’s another topic that doesn’t belong here.


I believe the first answer also fails to ensure the entries are unique, other than wishfully naming the function. EDIT: Missed use of Set pointed out by @stevengj so I was wrong!

Note that wishful naming is unique to humans among primates and other animals. LLMs may indeed be approaching human intelligence!

No, the Set data structure guarantees that its elements are unique.

julia> @time sample(1:100000, 100, replace = false);
  0.000013 seconds (8 allocations: 3.602 KiB)

julia> @time randperm(100000)[1:100];
  0.000768 seconds (3 allocations: 782.172 KiB)

julia> @time shuffle(1:100000)[1:100];
  0.000813 seconds (3 allocations: 782.172 KiB)
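
The allocation figures for randperm and shuffle above are mostly the full permutation itself. A small sketch of the arithmetic, assuming a 64-bit Julia where the elements are Int64 (as in the Vector{Int64} outputs earlier in the thread):

```julia
using Random

n = 100_000
perm = randperm(n)

# Each Int64 takes 8 bytes, so the whole permutation occupies n * 8 bytes
println(sizeof(perm))           # bytes used by the element data
println(sizeof(perm) / 1024)    # the same figure in KiB
```

That is about 781 KiB for the permutation alone, which accounts for nearly all of the 782 KiB reported, while the sample call allocates only a few KiB.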