I have a function that loops over a 2D array. For the second dimension only some of the indices are used, and they are passed in as a unit range. It seems to make a big difference to performance if the range is part of a struct. Does anyone know what causes the difference?
using BenchmarkTools
struct MyStruct
    inds :: UnitRange
end
A = rand(25000,4)
inds = 1:3
str = MyStruct(inds)
function f1(A, inds)
    for i in inds
        for k = axes(A,1)
            A[k,i] = A[k,i] ^ 2
        end
    end
end
function f2(A, str)
    inds = str.inds
    for i in inds
        for k = axes(A,1)
            A[k,i] = A[k,i] ^ 2
        end
    end
end
@btime f1($A,$inds)
@btime f2($A,$str)
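The problem is that the type of the inds field in MyStruct is not fully specified. UnitRange without its type parameter is not a concrete type, so the compiler cannot infer the type of str.inds inside f2 and falls back to dynamic dispatch in the loop. If you use ::UnitRange{Int64} or a type parameter for the field, you get the same performance as f1.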
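As a rough sketch of the two fixes mentioned above: the struct names MyStructConcrete and MyStructParam are made up for this illustration and are not from the original post, and f2 is repeated only so the block runs on its own.

```julia
# UnitRange without a parameter is not concrete; UnitRange{Int64} is.
isconcretetype(UnitRange)          # false
isconcretetype(UnitRange{Int64})   # true

# Option 1 (hypothetical name): spell out the concrete field type.
struct MyStructConcrete
    inds::UnitRange{Int64}
end

# Option 2 (hypothetical name): make the struct generic over the range type,
# so the field type is part of the struct's type and known to the compiler.
struct MyStructParam{R<:AbstractUnitRange}
    inds::R
end

# Same loop as f2 in the question; it works unchanged with either struct.
function f2(A, str)
    inds = str.inds
    for i in inds
        for k in axes(A, 1)
            A[k, i] = A[k, i]^2
        end
    end
end

A = rand(25000, 4)
f2(A, MyStructConcrete(1:3))
f2(A, MyStructParam(1:3))
```

Running @code_warntype f2(A, str) with the original MyStruct should highlight inds as the abstract type UnitRange, which is a quick way to spot this kind of problem.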
Presumably the view helps because it moves some bounds checks out of the loop:
function f1ib(A, inds)
    for i in inds
        for k = axes(A,1)
            @inbounds A[k,i] = A[k,i] ^ 2
        end
    end
end
@btime f1ib($A, $inds) # 10.000 μs (0 allocations: 0 bytes)
@btime f3($A, $inds) # 10.100 μs (0 allocations: 0 bytes)
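The definition of f3 does not appear in this part of the thread. Purely as an assumption about what a view-based variant could look like (the name f_view! is made up here, and this is not necessarily the code that produced the timing above):

```julia
# Hypothetical view-based variant: take a view of the selected columns once,
# then loop over the view's own axes.
function f_view!(A, inds)
    B = view(A, :, inds)       # inds is checked against A's bounds here, once
    for i in axes(B, 2)
        for k in axes(B, 1)
            B[k, i] = B[k, i]^2   # writes through to A
        end
    end
    return A
end

A = rand(25000, 4)
f_view!(A, 1:3)
```

Since the view is constructed once and the loops run over axes(B, ...), the compiler has an easier time proving the accesses are in bounds, which would be consistent with the timings matching the @inbounds version.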