# Performance of Cartesian Indices and Views

Hello all,

I have some code that uses Cartesian indices, and it slows down heavily when views are involved. The slowdown depends on the number of dimensions: as far as I can tell, in some cases things inline/unroll correctly, and in higher dimensions they don't. Any hints on why this happens or how to resolve it would be great!
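One way to check the inlining suspicion is to count the calls that survive as `invoke` statements in the optimized typed code. The following is only a rough sketch: `count_invokes` and the toy functions are hypothetical names made up for illustration, and the count covers only the top-level function body, not anything inside the callees.

```julia
# Count statements left as non-inlined `invoke` calls in the optimized code of `f`.
# `count_invokes` is a hypothetical helper, not part of Base or BenchmarkTools.
function count_invokes(f, argtypes)
    ci = first(only(code_typed(f, argtypes; optimize=true)))
    return count(stmt -> Meta.isexpr(stmt, :invoke), ci.code)
end

# Toy functions (illustration only) that differ just in the inlining hint.
@noinline helper_noinline(x) = x + 1
@inline helper_inline(x) = x + 1
uses_noinline(x) = 2 * helper_noinline(x)
uses_inline(x) = 2 * helper_inline(x)

n_noinline = count_invokes(uses_noinline, Tuple{Int})
n_inline = count_invokes(uses_inline, Tuple{Int})
```

Applied to the MWE, the same helper could be called as `count_invokes(fooCI, Tuple{typeof(vA)})` for the array and the view in 2D and 3D, to see whether the slow case is the one where calls are left un-inlined.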

MWE:

``````
using BenchmarkTools
BenchmarkTools.DEFAULT_PARAMETERS.seconds=1.
function foo2(A)
    n1, n2 = size(A)
    s = 0.
    for i2 in 2:n2-1
        @simd for i1 in 2:n1-1
            @inbounds s += A[i1+1,i2] + A[i1-1,i2] + A[i1,i2-1] + A[i1,i2+1]
        end
    end
    s
end

function foo3(A)
    n1, n2, n3 = size(A)
    s = 0.
    for i3 in 2:n3-1
        for i2 in 2:n2-1
            @simd for i1 in 2:n1-1
                @inbounds s += A[i1+1,i2,i3] + A[i1-1,i2,i3] + A[i1,i2-1,i3] + A[i1,i2+1,i3] +
                               A[i1,i2,i3-1] + A[i1,i2,i3+1]
            end
        end
    end
    s
end

@inline I1dim(dim, ndims) = CartesianIndex(ntuple(i -> ifelse(i == dim, 1, 0), Val{ndims}()))
function fooCI(A::AbstractArray{T,nd}) where {T,nd}
    #nd=ndims(A)
    in_ind = ntuple(i -> 2:size(A,i)-1, Val{nd}())
    Rin = CartesianIndices(in_ind)
    I1s = ntuple(i -> I1dim(i, nd), Val{nd}())

    s = zero(T)
    f(A, I, I1s, i) = (@inbounds A[I+I1s[i]] + A[I-I1s[i]])
    @simd for I in Rin
        @inbounds s += sum(ntuple(i -> f(A, I, I1s, i), Val{nd}()))
    end
    s
end

A = rand(100,100)
vA = view(A,:,:)
@btime fooCI($A)
@btime fooCI($vA)
@btime foo2($A)
@btime foo2($vA)

# Output, in the order of the calls above:
#   1.974 μs (0 allocations: 0 bytes)
#   2.120 μs (0 allocations: 0 bytes)
#   1.991 μs (0 allocations: 0 bytes)
#   2.185 μs (0 allocations: 0 bytes)

A = rand(10,10,10)
vA = view(A,:,:,:)
@btime fooCI($A)
@btime fooCI($vA)
@btime foo3($A)
@btime foo3($vA)

# Output, in the order of the calls above:
#   353.519 ns (0 allocations: 0 bytes)
#   5.079 μs (0 allocations: 0 bytes)
#   321.382 ns (0 allocations: 0 bytes)
#   362.574 ns (0 allocations: 0 bytes)
``````

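As you can see, the slowdown appears in 3D but not in 2D: `fooCI` on the 3D view takes 5.079 μs versus 353.519 ns on the plain array, while the hand-written `foo3` is barely affected by the view. Also, if you force `f` to inline (`@inline f` in `fooCI`), the code starts to allocate and the slowdown gets worse!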
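For comparison, here is a sketch of a variant that moves the helper out of the function body to a top-level `@inline` definition and maps over the tuple of offsets directly. The names `unit_offset`, `neighbour_pair`, and `fooCI_toplevel` are hypothetical, and this is an experiment to benchmark rather than a confirmed fix: I have not verified that it changes the timings for the 3D view.

```julia
# Unit offset along dimension `dim` in an N-dimensional index space.
@inline unit_offset(dim, ::Val{N}) where {N} =
    CartesianIndex(ntuple(i -> ifelse(i == dim, 1, 0), Val(N)))

# Sum of the two neighbours of `I` along the offset `d`.
@inline neighbour_pair(A, I, d) = @inbounds A[I + d] + A[I - d]

function fooCI_toplevel(A::AbstractArray{T,N}) where {T,N}
    # Interior indices only, so every `I ± d` stays in bounds.
    Rin = CartesianIndices(ntuple(i -> 2:size(A, i)-1, Val(N)))
    offsets = ntuple(i -> unit_offset(i, Val(N)), Val(N))
    s = zero(T)
    @simd for I in Rin
        s += sum(map(d -> neighbour_pair(A, I, d), offsets))
    end
    return s
end
```

It can be timed the same way as the MWE, e.g. `@btime fooCI_toplevel($vA)`, and compared against `fooCI` and `foo3` on both the array and the view.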

I appreciate any help,

Cheers!
