Hi guys,
I am trying to permute/shuffle a Vector of Vectors according to a given index vector. Some vectors can occur several times in the index, so I use a preallocated buffer container to avoid allocations. MWE:
vals = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ]
buffer = [ [0,0,0], [0,0,0], [0,0,0] ]
index = [2, 1, 1]
@inline function _shuffle!(vals, buffer, index)
    # First write the new vector order into buffer
    @inbounds @simd for iter in eachindex(vals)
        buffer[iter] .= vals[index[iter]]
    end
    # Then copy buffer back into vals
    @inbounds @simd for iter in eachindex(vals)
        vals[iter] .= buffer[iter]
    end
    return nothing
end
_shuffle!(vals, buffer, index)
vals #now has form [[4, 5, 6], [1, 2, 3], [1, 2, 3]]
Now, I need the .= copy syntax, as otherwise I get issues with referencing (the inner vectors would be shared rather than copied), and I figured I could just write out the loop completely to get rid of that. MWE:
@inline function _shuffle2!(vals, buffer, index)
    # First write the new vector order into buffer, element by element
    MaxIter = length(vals[1])
    @inbounds @simd for iter in eachindex(vals)
        index_current = index[iter]
        @inbounds @simd for idx in Base.OneTo(MaxIter)
            buffer[iter][idx] = vals[index_current][idx]
        end
    end
    # Then copy buffer back into vals, element by element
    @inbounds @simd for iter in eachindex(vals)
        @inbounds @simd for idx in Base.OneTo(MaxIter)
            vals[iter][idx] = buffer[iter][idx]
        end
    end
    return nothing
end
vals = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ]
buffer = [ [0,0,0], [0,0,0], [0,0,0] ]
index = [2, 1, 1]
_shuffle2!(vals, buffer, index)
vals #now has form [[4, 5, 6], [1, 2, 3], [1, 2, 3]]
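To make the referencing issue mentioned above concrete, here is a minimal sketch (the names a and b are made up for illustration). Plain assignment into a slot of the outer vector stores a reference to the same inner vector, while .= copies the elements into the inner vector that is already there:

```julia
# Plain assignment: the slot is rebound to the same inner vector (no copy).
a = [[1, 2, 3], [4, 5, 6]]
a[2] = a[1]      # a[2] now aliases a[1]
a[1][1] = 99
a[2][1]          # 99, because both slots refer to one array

# Broadcast assignment: elements are copied into the existing inner vector.
b = [[1, 2, 3], [4, 5, 6]]
b[2] .= b[1]     # element-wise copy, b[2] remains a separate array
b[1][1] = 99
b[2][1]          # 1, the copy is unaffected
```

This is why an index with repeated entries such as [2, 1, 1] needs a real copy: with plain assignment, the second and third slots would end up sharing one array.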
Now, I assumed that _shuffle2! would be much faster in this case, but it's considerably slower:
using BenchmarkTools
Ncols = 1000
Nrows = 1000
vals = [ rand(1:10, Ncols) for _ in Base.OneTo(Nrows)]
buffer = [ rand(1:10, Ncols) for _ in Base.OneTo(Nrows)]
index = rand(1:Nrows, Nrows) # entries index into vals, which has Nrows elements
@btime _shuffle!($vals, $buffer, $index) # 693.400 μs (0 allocations: 0 bytes)
@btime _shuffle2!($vals, $buffer, $index) # 1.218 ms (0 allocations: 0 bytes)
Now my question is: why? Is there a way to make the written-out loop faster?
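For comparison, here is a sketch of one variant that might be worth benchmarking alongside the two above. The name _shuffle3! is hypothetical, and no timings are claimed for it. The idea is that _shuffle2! performs a double lookup such as buffer[iter][idx] for every element, whereas binding each inner vector to a local variable first leaves a plain single-array loop in the inner body. It also applies @simd only to the innermost loop:

```julia
# Hypothetical variant: look up each inner vector once, then copy elements.
function _shuffle3!(vals, buffer, index)
    # First write the new vector order into buffer
    @inbounds for iter in eachindex(vals)
        src = vals[index[iter]]   # inner vector to copy from
        dst = buffer[iter]        # inner vector to copy into
        @simd for idx in eachindex(dst, src)
            dst[idx] = src[idx]
        end
    end
    # Then copy buffer back into vals
    @inbounds for iter in eachindex(vals)
        src = buffer[iter]
        dst = vals[iter]
        @simd for idx in eachindex(dst, src)
            dst[idx] = src[idx]
        end
    end
    return nothing
end

vals = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ]
buffer = [ [0,0,0], [0,0,0], [0,0,0] ]
index = [2, 1, 1]
_shuffle3!(vals, buffer, index)
vals #now has form [[4, 5, 6], [1, 2, 3], [1, 2, 3]]
```

It can be timed the same way as the others, with @btime _shuffle3!($vals, $buffer, $index).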