A question about how arrays work, how memory is allocated, and what happens when chunks of code inside a function are moved into another function

This is a type stability problem. What you want is

struct C3D{P, T<:Number}
  Nx::NTuple{P, Array{T,1}}
  Ny::NTuple{P, Array{T,1}}
  Nz::NTuple{P, Array{T,1}}
  wgt::NTuple{P, T}  
end

Otherwise every access to a field of elem is type unstable. Your second version is faster because it adds a function barrier, so accumulate! is type stable. With this change (and the corresponding change to the constructor), getϕ_a takes 253.749 μs (4 allocations: 6.67 KiB).
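To make the two ideas concrete, here is a minimal sketch, not your actual code. The names `Loose`, `Tight`, `sum_kernel`, `sumweights` and `sumweights_barrier` are made up for illustration, and I am assuming your original struct had fields whose types were not fully specified. It shows a field with a non-concrete type, the parametric fix, and a function barrier as the workaround.

```julia
# Hypothetical types for illustration only; they are not from the original post.

# Non-concrete field type: the compiler cannot know the length or element
# type of `wgt` from the type `Loose` alone, so code reading it is type unstable.
struct Loose
    wgt::Tuple
end

# Parametric version: once P and T are fixed, the field type is concrete.
struct Tight{P, T<:Number}
    wgt::NTuple{P, T}
end

# Generic kernel. Julia compiles a specialized method for each concrete
# argument type it is called with.
function sum_kernel(wgt)
    s = zero(eltype(wgt))
    for w in wgt
        s += w
    end
    return s
end

# Reads the field and loops in the same function. With `Loose`, every
# iteration works on values whose types are only known at run time.
function sumweights(e)
    s = 0.0
    for w in e.wgt
        s += w
    end
    return s
end

# Function barrier: the field is read once, then passed to `sum_kernel`.
# Dynamic dispatch happens once at the call, and the loop inside the
# kernel runs on a concretely typed argument.
sumweights_barrier(e) = sum_kernel(e.wgt)

loose = Loose((0.5, 0.25, 0.25))
tight = Tight((0.5, 0.25, 0.25))   # P = 3 and T = Float64 are inferred
```

In the REPL, `@code_warntype sumweights(loose)` should flag the loop variables as `Any`, while `@code_warntype sumweights(tight)` should come out fully inferred. The barrier version helps even with `Loose`, but parameterizing the struct fixes the problem at the source.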
