I have a tuple of heterogeneous structures:
struct Foo{T}
    a :: T
    b :: T
end
foos = (Foo(1,2), Foo(3.0,4.0))
and a 3D grid divided into regions, where each region should use one specific structure from the tuple. To select it, I create a 3D array of integers that stores, for every grid point, the index of the structure in the tuple:
I = zeros(Int, (100,200,300))
@. I[1:50,:,:] = 1
@. I[51:100,:,:] = 2
so that at the grid point i,j,k I can access the desired structure as foos[I[i,j,k]]. The MWE code is the following:
function bar1(foos, I)
    @inbounds for i in eachindex(I)
        foo = foos[I[i]]
    end
    return nothing
end
function bar2(foos, I)
    @inbounds for i in eachindex(I)
        foo = I[i] == 1 ? foos[1] : foos[2]
    end
    return nothing
end
However, the bar1 function is much slower than bar2:
@btime bar1($foos, $I) # 145.988 ms (12000000 allocations: 457.76 MiB)
@btime bar2($foos, $I) # 1.497 ns (0 allocations: 0 bytes)
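To make the setup concrete, here is a minimal sketch of the lookup at two grid points. The indices (10,1,1) and (60,1,1) are arbitrary choices made only for this illustration. It shows that the concrete type of foos[I[i,j,k]] depends on the region the grid point belongs to, which may be relevant to the difference between bar1 and bar2:

```julia
struct Foo{T}
    a :: T
    b :: T
end

foos = (Foo(1, 2), Foo(3.0, 4.0))

# Index array: region 1 covers the first half of the first axis, region 2 the rest.
I = zeros(Int, (100, 200, 300))
I[1:50, :, :] .= 1
I[51:100, :, :] .= 2

# Arbitrary sample points, one in each region (chosen for illustration only).
foo1 = foos[I[10, 1, 1]]   # taken from region 1
foo2 = foos[I[60, 1, 1]]   # taken from region 2

# The two lookups return values of different concrete types.
println(typeof(foo1))
println(typeof(foo2))
```

In bar2 each branch of the ternary indexes the tuple with a literal constant, whereas in bar1 the index is only known at run time.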
How can I fix the first function so that the code is both fast and generic?
The full code
using BenchmarkTools
struct Foo{T}
    a :: T
    b :: T
end
foos = (Foo(1,2), Foo(3.0,4.0))
I = zeros(Int, (100,200,300))
@. I[1:50,:,:] = 1
@. I[51:100,:,:] = 2
function bar1(foos, I)
    @inbounds for i in eachindex(I)
        foo = foos[I[i]]
    end
    return nothing
end
function bar2(foos, I)
    @inbounds for i in eachindex(I)
        foo = I[i] == 1 ? foos[1] : foos[2]
    end
    return nothing
end
@btime bar1($foos, $I)
@btime bar2($foos, $I)