I’ve tracked it down. Indeed, I’m seeing a 3x slowdown for this “trivialized” loop function.
using BenchmarkTools

function increment!(p::AbstractArray, a::AbstractArray)
    for I ∈ CartesianIndices(p)
        @inbounds p[I] += a[I]
    end
end

n = 32
@benchmark increment!(a,b) setup=(a=rand(n,n,n);b=rand(n,n,n))
This gives a 3x slowdown from Julia 1.5 to 1.6. The amount of slowdown does depend on n: it is only 1.7x when n=64 and is 4.3x when n=16. Using 2D arrays with the same total memory shows a similar slowdown, but using 1D arrays makes the slowdown vanish.
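For anyone who wants to repeat the dimensionality comparison, here is a rough sketch of how it could be scripted. The helpers `same_memory_shapes` and `min_time` are names I made up for illustration, and the 2D shape `(n^2, n)` is only one possible way to keep the total memory equal, not necessarily the shape used for the numbers above. It uses Base's `@elapsed` so it runs without extra packages; the timings will be noisier than `@benchmark`, so treat them as indicative only.

```julia
# Same kernel as in the post.
function increment!(p::AbstractArray, a::AbstractArray)
    for I ∈ CartesianIndices(p)
        @inbounds p[I] += a[I]
    end
end

# Hypothetical helper: 1D, 2D, and 3D shapes that all hold n^3 elements.
# The 2D choice (n^2, n) is an assumption made for this sketch.
same_memory_shapes(n) = ((n^3,), (n^2, n), (n, n, n))

# Hypothetical helper: minimum wall time in seconds over `reps` calls.
function min_time(shape; reps = 200)
    a, b = rand(shape...), rand(shape...)
    increment!(a, b)  # warm-up call so compilation is not timed
    return minimum(@elapsed(increment!(a, b)) for _ in 1:reps)
end

for n in (16, 32, 64)
    for shape in same_memory_shapes(n)
        println("n = ", n, "  shape = ", shape, "  min time = ", min_time(shape), " s")
    end
end
```

Running the same script under Julia 1.5 and 1.6 and dividing the per-shape times should show whether the ratio depends on the number of dimensions.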