If you run @code_warntype test(rand(256, 256, 2)) you’ll see a lot of type-instabilities:
julia> @code_warntype test(rand(256, 256, 2))
Variables
#self#::Core.Compiler.Const(test, false)
a::Array{Float64,3}
stats::Base.GC_Num
elapsedtime::UInt64
val::Nothing
diff::Base.GC_Diff
σ::Any
R::CartesianIndex{_A} where _A
o::CartesianIndex{_A} where _A
@_10::Any
I::Any
Body::Any
Those ultimately stem from size(a)[1:end-1], which takes a tuple and then slices it into a vector. The problem here is that the type of CartesianIndex depends on the number of dimensions, but the length of that slice of size(a) is not known to the compiler (it depends on the run-time value of end-1).
Fixing that removes all the allocations:
julia> function test(a)
σ = 0.
R = CartesianIndex((size(a, 1), size(a, 2))); o = oneunit(R)
@time for I ∈ 2o:R
σ += a[I,1]-a[I-o,2]^2 # actually, a more complex function...
end
return σ
end
julia> test(rand(256,256,2));
0.000243 seconds
The only issue here is that I’ve implicitly assumed that a has 3 dimensions, which your original code did not. We can make a sufficiently general version with a helper function:
function all_but_last(t::NTuple{N, T}) where {N, T}
ntuple(i -> t[i], N - 1)
end
and use it like so:
julia> function test(a)
σ = 0.
R = CartesianIndex(all_but_last(size(a))); o = oneunit(R)
@time for I ∈ 2o:R
σ += a[I,1]-a[I-o,2]^2 # actually, a more complex function...
end
return σ
end
test (generic function with 1 method)
julia> test(rand(256,256,2));
0.000218 seconds
The moral is that the length of a tuple is part of its type, so be wary of operations that might accidentally throw away that information by converting to a Vector.