Hello.
I’ve created a DataFrame by reading a very long CSV file.
Now I would like to create a smaller DataFrame with the same columns but only a few rows, sampled randomly (without replacement) from the original.
Say we have this toy example and want a new DataFrame with 3 rows. I’ve tried several alternatives:
using DataFrames
myDF = DataFrame(A = 1:10, B = 21:30)
sample(myDF, 3) # doesn't work
rand(myDF, 3) # doesn't work
randsubseq(myDF, 3) # doesn't work
myDF[sample(1:10,3),:] # doesn't work
myDF[rand(1:(size(myDF,1)),3),:] # it works, but rand can pick the same row more than once
newDF = myDF[rand(1:(size(myDF,1)),3),:]
3×2 DataFrame
│ Row │ A     │ B     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 2     │ 22    │
│ 2   │ 4     │ 24    │
│ 3   │ 10    │ 30    │
Is it the best way to do it?
How do you do it?
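
For reference, here is a minimal sketch of row sampling without replacement. It assumes the StatsBase.jl package is installed (an assumption, since the post only shows DataFrames), and the names `idx`, `idx2`, and `newDF2` are made up for illustration. Note that `rand(1:n, 3)` draws with replacement, so it can return the same row more than once.

```julia
using DataFrames, StatsBase, Random

myDF = DataFrame(A = 1:10, B = 21:30)

# Option 1 (assumes StatsBase.jl): draw 3 distinct row indices.
# replace = false means no index can be drawn twice.
idx = sample(1:nrow(myDF), 3; replace = false)
newDF = myDF[idx, :]

# Option 2 (Random standard library only): shuffle all row indices
# and keep the first 3, which are distinct by construction.
idx2 = randperm(nrow(myDF))[1:3]
newDF2 = myDF[idx2, :]
```

Passing `ordered = true` to `sample` should keep the sampled rows in their original order. For a very long table, `sample` is probably the lighter option, since `randperm` builds a permutation of every row index before three of them are kept.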