# Best way to shuffle elements in a vector of vectors

I have some code that needs to shuffle up the elements of a vector of vectors the way I have this coded feels terrible and I was hoping some advice could come on how to make this better. The key is that the data is stored as a vector of “rows” all of equal size, but I want to shuffle the “columns” of this but return back the original structure of vector of vectors.

``````data = [1:5 for i in 1:20]
stacked = hcat(data...)
hcat([shuffle(stacked[:, i]) for i in 1:size(stacked, 2)]...)
``````

Thanks so much for any tips!

Just one idea based on RecursiveArrayTools.jl:

``````using RecursiveArrayTools

function myshuffle!(data)
v = VectorOfArray(data)
for k in 1:length(v)
shuffle!(v[:,k])
end
nothing
end
``````

which would give

``````julia> data = [collect(1:5) for i in 1:20];

julia> myshuffle!(data);

julia> data
20-element Array{Array{Int64,1},1}:
[1, 5, 4, 3, 2]
[3, 4, 2, 1, 5]
[3, 2, 5, 4, 1]
[4, 3, 5, 1, 2]
[5, 1, 4, 3, 2]
[3, 2, 5, 4, 1]
[1, 4, 3, 2, 5]
[1, 2, 4, 5, 3]
[1, 3, 2, 4, 5]
[4, 3, 1, 5, 2]
[5, 2, 4, 1, 3]
[4, 2, 1, 5, 3]
[3, 5, 4, 2, 1]
[5, 2, 1, 3, 4]
[3, 4, 1, 2, 5]
[4, 3, 2, 5, 1]
[2, 1, 5, 4, 3]
[4, 2, 3, 1, 5]
[4, 3, 5, 2, 1]
[2, 3, 4, 1, 5]
``````

This should all be lazy/in-place.

UPDATE:

``````julia> @btime myshuffle!(\$data);
1.536 μs (0 allocations: 0 bytes)
``````
1 Like

If vector-of-vectors is the structure you want then you can do this:

``````julia> using Random

julia> data = [collect(1:5) for i in 1:10];

julia> foreach(shuffle!, data)

julia> data
10-element Array{Array{Int64,1},1}:
[3, 2, 5, 1, 4]
[1, 2, 4, 5, 3]
[2, 3, 1, 5, 4]
[2, 4, 5, 3, 1]
[2, 3, 4, 5, 1]
[3, 1, 4, 5, 2]
[1, 4, 5, 3, 2]
[5, 3, 1, 4, 2]
[4, 5, 3, 2, 1]
[3, 2, 4, 5, 1]
``````

If you’d prefer to have a matrix with shuffled rows or columns, you can use views:

``````julia> data = [j for i = 1:10, j = 1:5];

julia> for i = 1:size(data, 1)
shuffle!(@view data[i,:])
end

julia> data
10×5 Array{Int64,2}:
2  5  3  4  1
4  2  1  5  3
4  5  3  2  1
1  4  2  3  5
5  2  1  3  4
1  4  3  2  5
4  3  5  2  1
2  4  3  5  1
3  1  2  5  4
3  1  4  2  5
``````
3 Likes

I just realize that using `VectorOfArrays` is completely unnecessary… In my defense, it’s Friday evening for me

1 Like

Is this not just shuffling each of the subvectors (i.e. the so called rows)? The issue with what I am doing is that I need to shuffle along the “columns” of the stacked data so I need to have a shuffling on `data[i][j] for each j` hence all the horrible hcat(data…) stuff.

1 Like

So you want this?

``````julia> using Random

julia> data = [collect(i:i+4) for i in 1:5:50];

julia> data
10-element Array{Array{Int64,1},1}:
[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
[11, 12, 13, 14, 15]
[16, 17, 18, 19, 20]
[21, 22, 23, 24, 25]
[26, 27, 28, 29, 30]
[31, 32, 33, 34, 35]
[36, 37, 38, 39, 40]
[41, 42, 43, 44, 45]
[46, 47, 48, 49, 50]

julia> front_=[data[i][1] for i=1:length(data)];

julia> shuffle!(front_);

julia> for i=1:length(data) data[i][1] = front_[i] end;

julia> data
10-element Array{Array{Int64,1},1}:
[6, 2, 3, 4, 5]
[46, 7, 8, 9, 10]
[36, 12, 13, 14, 15]
[11, 17, 18, 19, 20]
[41, 22, 23, 24, 25]
[31, 27, 28, 29, 30]
[16, 32, 33, 34, 35]
[26, 37, 38, 39, 40]
[21, 42, 43, 44, 45]
[1, 47, 48, 49, 50]

``````
1 Like

I see my test code had a bug needed to be with `shuffle(stacked[:, i])`, will fix up the question, and so far the recursive arrays are nice (should have known … as I implemented some of that for just this reason …)

yes this kind of thing but for all of the “columns” Much better example … as my [1:5 for i = 1:20] is really dumb in this case as I it would be unchanged for what I want as each column is equal. Always a trick to properly test your minimum example from a much larger code base!

Okay with everyone’s comments here is the new best solutions I can do (that is actually correct unlike my first example …).

``````using RecursiveArrayTools
using BenchmarkTools

function shuffle_col!(data)
vdata = VectorOfArray(data)
for i = 1:size(vdata, 1)
shuffle!(@view vdata[i, :])
end
return
end

data = [fill(i, 5) for i in 1:10]
@btime shuffle_col!(data)
``````

Anyone see anything I could do to make it faster / cleaner? Thanks so much!

Also is it the correct style to return nothing from an inplace operation?

I prefer the explict: `return nothing`. Nothing is an appropriate return from an inplace operation, sometimes there is reason to return the changed thing explicitly, too.

1 Like

Instead of VectorOfArray(data) cannot you just create a matrix?

[data[i][j] for i=1:length(data),j=1:length(data[1])]

After shuffling, you can convert it back to vector of vectors:

[vdata[i,:] for i=1:size(vdata)[1]]

To make it for columns instead of rows in Stefan’s second example, just use `[:,i]`. Also, use `\$` before variables with `@btime` to get proper timings.

2 Likes