I need to access a matrix millions of times, and I noticed that most of the allocations in my code come from exactly this. My code is similar to the following example:

const A = rand(10,10)
function get_number(idx)
    @inbounds getindex(A, idx...)
end

Say I need to access index (4,6) 100000 times:

idx = [4,6]
@time begin
    for i in 1:100000
        get_number(idx)
    end
end
0.011408 seconds (200.78 k allocations: 3.099 MiB, 16.23% compilation time)

This, as you can see, allocates about 3 MiB. With roughly 200 k allocations over 100000 calls, each call makes two allocations of about 16 bytes each.

Splatting seems to be the culprit here. The following version passes the two indices as separate arguments instead of splatting:

using BenchmarkTools
const A = rand(10,10)
function get_number(idx1, idx2)
    @inbounds getindex(A, idx1, idx2)
end
function test()
    for i in 1:100000
        idx = (rand(1:10),rand(1:10))
        get_number(idx[1], idx[2])
    end
end
@btime test()

Never mind, indexing with a vector instead of a tuple is the problem. The following works, too:

using BenchmarkTools
const A = rand(10,10)
function get_number(idx...)
    @inbounds getindex(A, idx...)
end
function test()
    for i in 1:100000
        idx = (rand(1:10),rand(1:10))
        get_number(idx...)
    end
end
@btime test()
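To make the vector-versus-tuple difference concrete, here is a minimal sketch that measures the bytes allocated by the same splatting function for both container types. The names `M`, `lookup` and `bytes_for` are hypothetical and only serve this illustration; the exact byte counts will depend on the Julia version.

```julia
# Hypothetical names: M, lookup and bytes_for are illustrative only.
const M = rand(10, 10)

# Same shape as get_number in the post: splat whatever container is passed in.
lookup(idx) = @inbounds getindex(M, idx...)

# Bytes allocated by `reps` calls, measured after one warm-up call so that
# compilation of `lookup` is not counted.
function bytes_for(idx, reps)
    lookup(idx)
    return @allocated for _ in 1:reps
        lookup(idx)
    end
end

bytes_vec = bytes_for([4, 6], 100_000)   # Vector{Int}: length is not part of the type
bytes_tup = bytes_for((4, 6), 100_000)   # Tuple{Int,Int}: length is part of the type

println("vector: ", bytes_vec, " bytes, tuple: ", bytes_tup, " bytes")
```

The likely explanation is that the length of a `Vector` is only known at run time, so the compiler cannot tell how many arguments the splat produces and falls back to a dynamic call, whereas a tuple's length is encoded in its type.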

In my case, n (the length of idx) is a const. It can be a number between 2 and, say, 5 at most, but it is fixed at the beginning and never changes during the experiment. I tried to do the following:

const A = rand(10,10)
function get_number(idx)
    @inbounds getindex(A, (idx[1],idx[2])...)
end

And this largely solves the problem: allocations drop from 14.5 GB to 500 MB and the code is more than 2x faster. However, this does not work as soon as idx is of size 3 or more.

It does work as intended: as @goerch pointed out, the problem was that I was indexing with a vector rather than a tuple. Thus, my problem has shifted: I need a non-allocating way to convert my idx array to a tuple. For the moment, this workaround seems to do just fine (but it is ugly):

# n is the global const holding the number of indices (2 to 5)
function get_tuple(idx)
    if n == 2
        tup = (idx[1], idx[2])
    elseif n == 3
        tup = (idx[1], idx[2], idx[3])
    elseif n == 4
        tup = (idx[1], idx[2], idx[3], idx[4])
    elseif n == 5
        tup = (idx[1], idx[2], idx[3], idx[4], idx[5])
    end
    return tup
end
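A possibly tidier alternative to the if/elseif chain is `ntuple` with a `Val` length, which is part of Julia's Base. The sketch below assumes the length is a global const, as described above; `N_DIMS`, `B`, `to_tuple` and `fetch_entry` are hypothetical names standing in for `n`, the real matrix and the real accessor functions.

```julia
# Hypothetical stand-ins: N_DIMS plays the role of the const `n` in the post,
# and B is an array with that many dimensions.
const N_DIMS = 3
const B = rand(10, 10, 10)

# Val(N_DIMS) carries the length in the type domain, so the compiler can
# infer the result as an NTuple{3,Int} when idx is a Vector{Int}.
to_tuple(idx) = ntuple(i -> idx[i], Val(N_DIMS))

# Splatting the tuple has a statically known number of arguments.
fetch_entry(idx) = @inbounds B[to_tuple(idx)...]

idx = [4, 6, 2]
println(to_tuple(idx))
println(fetch_entry(idx) == B[4, 6, 2])
```

Because `N_DIMS` is a const, changing the dimensionality only requires changing that one definition, and no branch per length is needed. Whether this is fully allocation-free in a given code base is worth checking with `@btime` or `@allocated`.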