I have a compute-heavy kernel function that I am trying to call from inside another function. The kernel takes several arrays and other data as arguments, and I tried to make the code cleaner by passing a single object as the argument instead, but the performance drops markedly when I do so. Here is a quick demonstration.

```
using BenchmarkTools

# Passing all data as explicit arguments:
function compute1(nx, ny, nz, arr1, arr2)
    for k in 1:nz, j in 1:ny, i in 1:nx
        tmp = arr1[i, j, k] * arr2[i, j, k]
        tmp2 = tmp + arr1[i, j, k]
    end
end

function start_compute1()
    nx = 10
    ny = 10
    nz = 200
    arr1 = zeros(Float64, (nx, ny, nz))
    arr2 = ones(Float64, (nx, ny, nz))
    compute1(nx, ny, nz, arr1, arr2)
end

@benchmark start_compute1()
BenchmarkTools.Trial:
memory estimate: 312.66 KiB
allocs estimate: 4
--------------
minimum time: 39.613 μs (0.00% GC)
median time: 157.446 μs (0.00% GC)
mean time: 192.640 μs (20.39% GC)
maximum time: 6.807 ms (97.69% GC)
--------------
samples: 10000
evals/sample: 1
```

This is the version that performs well. My first attempt at passing an object holding the data used a dictionary:

```
# Passing all data in a single Dict:
function compute2(data)
    nx, ny, nz = data[:nx], data[:ny], data[:nz]
    arr1 = data[:arr1]
    arr2 = data[:arr2]
    for k in 1:nz, j in 1:ny, i in 1:nx
        tmp = arr1[i, j, k] * arr2[i, j, k]
        tmp2 = tmp + arr1[i, j, k]
    end
end

function start_compute2()
    nx = 10
    ny = 10
    nz = 200
    arr1 = zeros(Float64, (nx, ny, nz))
    arr2 = ones(Float64, (nx, ny, nz))
    data = Dict()
    data[:nx], data[:ny], data[:nz] = nx, ny, nz
    data[:arr1] = arr1
    data[:arr2] = arr2
    compute2(data)
end

@benchmark start_compute2()
BenchmarkTools.Trial:
memory estimate: 2.58 MiB
allocs estimate: 124409
--------------
minimum time: 2.178 ms (0.00% GC)
median time: 2.327 ms (0.00% GC)
mean time: 2.593 ms (10.94% GC)
maximum time: 8.116 ms (70.25% GC)
--------------
samples: 1928
evals/sample: 1
```

It seems the compiler cannot optimise for what is stored inside the `Dict`. I have also tried using a `struct` with a well-defined data type instead of the `Dict`, and adding type annotations for `arr1` and `arr2` (e.g. `arr1 = data[:arr1]::Array{Float64, 3}`), but the problem persists.
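For concreteness, here is a minimal sketch of what a `struct` variant along those lines might look like. The type name `ComputeData`, the function names `compute3` and `start_compute3`, and the concrete field types are all hypothetical and not copied from the original code; the kernel also returns an accumulated sum here (an assumption, so that the result can be checked), which the versions above do not.

```julia
# Hypothetical container type; the name and the field types are assumptions.
# Every field has a concrete type, so the types of data.arr1 etc. are known
# from the type of `data` alone.
struct ComputeData
    nx::Int
    ny::Int
    nz::Int
    arr1::Array{Float64, 3}
    arr2::Array{Float64, 3}
end

# Same loop as compute1/compute2, but reading everything from the struct.
function compute3(data::ComputeData)
    nx, ny, nz = data.nx, data.ny, data.nz
    arr1, arr2 = data.arr1, data.arr2
    acc = 0.0  # accumulate so the function has an observable result
    for k in 1:nz, j in 1:ny, i in 1:nx
        tmp = arr1[i, j, k] * arr2[i, j, k]
        acc += tmp + arr1[i, j, k]
    end
    return acc
end

function start_compute3()
    nx, ny, nz = 10, 10, 200
    arr1 = zeros(Float64, (nx, ny, nz))
    arr2 = ones(Float64, (nx, ny, nz))
    data = ComputeData(nx, ny, nz, arr1, arr2)
    return compute3(data)
end
```

This can be timed the same way as the other two versions, with `@benchmark start_compute3()`. Note that a field declared as plain `Array`, or left without an annotation, would not be concretely typed, so a struct written that way would differ from this sketch.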

Is there a way to recover the performance without having to spell out every individual argument to the `compute2` function? Perhaps there is an obvious solution, but I’m not finding it.

Any suggestions would be appreciated.