# Crazy allocations using CartesianIndices

I have been switching a 2D flux computation method over to `CartesianIndices` so it will work for arrays of any dimension, à la *Multidimensional algorithms and iteration*.

Unfortunately, I’ve been running into a bunch of memory-allocation “gotchas” and I don’t understand what I need to do to avoid them. Here’s an example:

```julia
function test(a)
    σ = 0.
    R = CartesianIndex(size(a)[1:end-1]); o = oneunit(R)
    @time for I ∈ 2o:R
        σ += a[I,1] - a[I-o,2]^2 # actually, a more complex function...
    end
    return σ
end
```

But when I run this I get slow code with a huge number of allocations:

```julia
julia> test(rand(256,256,2));
  0.046946 seconds (585.23 k allocations: 13.891 MiB)
```

Please, could someone let me know:

1. What is going on here?
2. How am I supposed to do this elegantly?

If you run `@code_warntype test(rand(256, 256, 2))` you’ll see a lot of type-instabilities:

```julia
julia> @code_warntype test(rand(256, 256, 2))
Variables
  #self#::Core.Compiler.Const(test, false)
  a::Array{Float64,3}
  stats::Base.GC_Num
  elapsedtime::UInt64
  val::Nothing
  diff::Base.GC_Diff
  σ::Any
  R::CartesianIndex{_A} where _A
  o::CartesianIndex{_A} where _A
  @_10::Any
  I::Any

Body::Any
```

Those ultimately stem from `size(a)[1:end-1]`, which slices the size tuple with a range. The result is still a tuple, but its length is not known to the compiler (it depends on the run-time value of `end-1`). Since the type of a `CartesianIndex` includes the number of dimensions, everything downstream of that slice becomes type-unstable.
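To see this concretely, here is a small sketch (my example, not from the post) comparing range-slicing a tuple with `Base.front`, which encodes the result length in the type:

```julia
# Range-slicing a tuple: the value is right, but on Julia < 1.5 the compiler
# cannot infer the length of the result, so its type is abstract.
drop_last_slice(t) = t[1:end-1]

# Base.front drops the last element, and the result length is part of the
# type, so inference succeeds on all Julia 1.x versions.
drop_last_front(t) = Base.front(t)

sz = (256, 256, 2)
@assert drop_last_slice(sz) == (256, 256)
@assert drop_last_front(sz) == (256, 256)

# Compare the inferred return types yourself, e.g. with
#   Base.return_types(drop_last_slice, (typeof(sz),))
#   Base.return_types(drop_last_front, (typeof(sz),))
# or with @code_warntype on each.
```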

Fixing that removes all the allocations:

```julia
julia> function test(a)
           σ = 0.
           R = CartesianIndex((size(a, 1), size(a, 2))); o = oneunit(R)
           @time for I ∈ 2o:R
               σ += a[I,1] - a[I-o,2]^2 # actually, a more complex function...
           end
           return σ
       end
```
```julia
julia> test(rand(256,256,2));
  0.000243 seconds
```

The only issue here is that I’ve implicitly assumed that `a` has 3 dimensions, which your original code did not. We can make a sufficiently general version with a helper function:

```julia
function all_but_last(t::NTuple{N, T}) where {N, T}
    ntuple(i -> t[i], N - 1)
end
```

and use it like so:

```julia
julia> function test(a)
           σ = 0.
           R = CartesianIndex(all_but_last(size(a))); o = oneunit(R)
           @time for I ∈ 2o:R
               σ += a[I,1] - a[I-o,2]^2 # actually, a more complex function...
           end
           return σ
       end
test (generic function with 1 method)

julia> test(rand(256,256,2));
  0.000218 seconds
```
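As an aside (my note, not part of the original answer), `Base.front` already does what `all_but_last` does, and it is not restricted to homogeneous tuples:

```julia
sz = (256, 256, 2)
@assert Base.front(sz) == (256, 256)
@assert Base.front((1, 2.0, "x")) == (1, 2.0)  # works on mixed tuples too
```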

The moral is that the length of a tuple is part of its type, so be wary of operations, such as slicing with a range, that can throw that information away by producing a result whose length the compiler cannot see.

19 Likes

As you say, I need the code to work regardless of dimension, and your helper function works great. However, I’m a little confused, since this kind of slicing was used in the blog post:

```julia
function expfiltdim(x, dim::Integer, α)
    s = similar(x)
    Rpre = CartesianIndices(size(x)[1:dim-1])
    Rpost = CartesianIndices(size(x)[dim+1:end])
    _expfilt!(s, x, α, Rpre, size(x, dim), Rpost)
end
```

Would helpers like this avoid the issues even though `dim` is a variable?

```julia
function before_dim(dim, t::NTuple{N, T}) where {N, T}
    ntuple(i -> t[i], dim - 1)
end
function after_dim(dim, t::NTuple{N, T}) where {N, T}
    ntuple(i -> t[dim+i], N - dim)
end
```
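They do at least produce the right splits on a sample size tuple (a quick check, added for illustration; whether they are inferable is a separate question):

```julia
before_dim(dim, t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i], dim - 1)
after_dim(dim, t::NTuple{N, T}) where {N, T} = ntuple(i -> t[dim+i], N - dim)

sz = (5, 10, 15, 20)
@assert before_dim(2, sz) == (5,)
@assert after_dim(2, sz) == (15, 20)
@assert before_dim(1, sz) == ()   # edge case: nothing before dim 1
@assert after_dim(4, sz) == ()    # edge case: nothing after the last dim
```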

The example there has the same type-instability, but it doesn’t cause a problem because `_expfilt!` is creating a “function barrier”. Even though the type of `Rpre` in `expfiltdim` cannot be inferred, the `_expfilt!` function still receives a value with a concrete type, so within the body of `_expfilt!` there is no type-instability. Check out the section of the manual I linked for more details. You could get the same benefit by moving your loop into its own dedicated function, although I personally think fixing the instability altogether is good practice when possible.
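A minimal sketch of the function-barrier idea (hypothetical names, not the code from the blog post): the outer function has an un-inferable local, but the inner kernel is compiled for the concrete type it actually receives:

```julia
function outer(a)
    R = CartesianIndices(size(a)[1:end-1])  # may be un-inferable (pre-1.5)
    return kernel(a, R)                     # barrier: kernel specializes on typeof(R)
end

function kernel(a, R)  # inside here, R has a concrete type
    σ = 0.0
    for I in R
        σ += a[I, 1]
    end
    return σ
end

outer(ones(4, 4, 2))  # sums the 16 ones in a[:, :, 1]
```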

It’s hard to say; it depends on how clever the compiler is with its constant propagation. If you’re dealing with cases like this, the function-barrier approach seems like a good idea.

6 Likes

By the way, I don’t see any instabilities with Julia 1.5.0-beta1.0. On what version did you test your code?

```julia
julia> @code_warntype test(rand(256, 256, 2))
Variables
  #self#::Core.Compiler.Const(test, false)
  a::Array{Float64,3}
  stats::Base.GC_Num
  elapsedtime::UInt64
  val::Nothing
  diff::Base.GC_Diff
  σ::Float64
  R::CartesianIndex{2}
  o::CartesianIndex{2}
  @_10::Union{Nothing, Tuple{CartesianIndex{2},CartesianIndex{2}}}
  I::CartesianIndex{2}

Body::Float64
. . .
```
1 Like

It’s been improved in 1.5; all prior versions fail to infer it. It’s worth noting that there are still some tuple-indexing patterns that aren’t inferred correctly, but I think it’s fair to say that the most common ones now are.

5 Likes

That sounds like it could save me a bunch of memory headaches. Is that release stable?

Since I’m trying to make this a package other people can use with other versions, I am also a bit nervous solving the problem this way.

If you want to support multiple Julia versions, you’ll have to handle it the “hard way.” Aside from `ntuple`, there are `Base.tail`, `Base.front`, and `@inline`d splatting. There’s a blog post on “lispy tuple programming” that I can’t dig up from memory; anyone have suggestions?
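For illustration (my example, not from the post above), the “lispy” style recurses on `Base.tail`, so every intermediate tuple has a concrete type and no range-slicing is needed:

```julia
# Recursively sum a tuple: each recursive call sees a shorter tuple whose
# length is part of its type, so everything stays inferable on all 1.x versions.
tup_sum(t::Tuple{}) = 0
tup_sum(t::Tuple) = first(t) + tup_sum(Base.tail(t))

@assert tup_sum((1, 2, 3, 4)) == 10
@assert tup_sum(()) == 0
```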

1 Like

IMHO Julia is improving fast, and for a brand-new package like yours I do not see why you should pay the price of compatibility with old versions… Of course, it is totally up to you.

I have spent some time reading your code, which is extremely elegant and concise. Coming from C++, I am still amazed that such a dimension-agnostic coding style can be obtained at zero cost. If that is indeed the case with Julia 1.5, it should be kept like this.

I also wonder about the future adaptation to GPU. It could be interesting to see whether constructions like @vchuravy’s https://github.com/JuliaGPU/KernelAbstractions.jl could GPUify your code efficiently (although I think that package is not yet compatible with Julia 1.5 or 1.4).

3 Likes

The tricks at the end of

cover a lot of cases.

Doing the exercises is recommended, and automatically confers a green belt to the user.

2 Likes

I used to lead the MIT TKD club, so this is full circle.

2 Likes

Ok, I’m going to update to 1.5 (downloaded already), since I don’t enjoy tracking down all these memory things.

However, I have one more chestnut to offer up to the hive mind. I still can’t get the function-barrier approach to work for applying boundary conditions:

```julia
function _apply!(a,dim::Integer,left,right,A)
    for i ∈ left, j ∈ right
        a[i,size(a,dim),j,dim] = A
    end
end
function test(a::Array{T,N},A) where {T,N}
    for dim ∈ 1:N-1
        left = CartesianIndices(size(a)[1:dim-1])
        right = CartesianIndices(size(a)[dim+1:N-1])
        @time _apply!(a,dim,left,right,A)
    end
    return a
end
```
```julia
julia> test(ones(20,20,20,3),0.);
  0.000299 seconds (2.80 k allocations: 93.766 KiB)
  0.000003 seconds (1 allocation: 16 bytes)
  0.000002 seconds (1 allocation: 16 bytes)
```

If I try the `Val{dim}` trick from that ipynb tutorial, I get

```julia
function _apply!(a,::Val{dim},left,right,A) where dim
    for i ∈ left, j ∈ right
        a[i,size(a,dim),j,dim] = A
    end
end
_apply!(a,dim::Integer,left,right,A) = _apply!(a,Val{dim}(),left,right,A)
```
```julia
julia> test(ones(20,20,20,3),0.);
  0.000026 seconds (4 allocations: 112 bytes)
  0.000048 seconds (5 allocations: 112 bytes)
  0.000092 seconds (4 allocations: 112 bytes)
```

The `@code_typed` output is also a mess for a fairly simple idea: `a[:,size(a,2),:,2] .= A`, etc. Maybe it would be better to do this with code generation?

Generic splits are not handled well yet. There is `Base.IteratorsMD.split`, in case it’s useful:

```julia
julia> I = CartesianIndices((5,10,15,20)).indices
(Base.OneTo(5), Base.OneTo(10), Base.OneTo(15), Base.OneTo(20))

julia> pre, post = Base.IteratorsMD.split(I, Val(2));

julia> pre
(Base.OneTo(5), Base.OneTo(10))

julia> post
(Base.OneTo(15), Base.OneTo(20))
```
1 Like

I couldn’t get this to work satisfactorily for the BCs, but the rest of the solver is nicely dimension-independent. Here’s a slice through a 3D Taylor-Green vortex.

7 Likes

I know this was long ago and far away, but I looked at it again this weekend and came up with a good solution:

```julia
function slice(N::NTuple{n,Int},s::Int,dims::Int)::CartesianIndices{n} where n
    CartesianIndices(ntuple(i -> i==dims ? (s:s) : (1:N[i]), n))
end
```
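A quick illustration of what `slice` returns (my check, with arbitrary sizes): it is the set of Cartesian indices of one face of the array, namely the face where dimension `dims` is pinned to `s`:

```julia
slice(N::NTuple{n,Int}, s::Int, dims::Int) where n =
    CartesianIndices(ntuple(i -> i == dims ? (s:s) : (1:N[i]), n))

N = (4, 5, 3)
@assert size(slice(N, 1, 2)) == (4, 1, 3)          # the s = 1 face in dimension 2
@assert slice(N, 3, 3)[1] == CartesianIndex(1, 1, 3)
```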

This lets me write general multi-dimensional code applied to the boundaries of an array, like setting boundary conditions:

```julia
function BC!(a::Array{T,n}) where {T,n}
    N = size(a)
    for j ∈ 1:n
        @simd for I ∈ slice(N,1,j)
            a[I] = a[I+δ(j,n)] # Neumann
        end
        @simd for I ∈ slice(N,N[j],j)
            a[I] = a[I-δ(j,n)] # Neumann
        end
    end
end
@inline δ(a,d::Int) = CartesianIndex(ntuple(i -> i==a ? 1 : 0, d))
```
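A small self-contained check of the idea (my test, not from the post; `@simd` dropped for brevity): after `BC!`, each boundary face equals its interior neighbor, i.e. a zero-gradient (Neumann) condition:

```julia
slice(N::NTuple{n,Int}, s::Int, dims::Int) where n =
    CartesianIndices(ntuple(i -> i == dims ? (s:s) : (1:N[i]), n))
δ(a, d::Int) = CartesianIndex(ntuple(i -> i == a ? 1 : 0, d))

function BC!(a::Array{T,n}) where {T,n}
    N = size(a)
    for j ∈ 1:n
        for I ∈ slice(N, 1, j)
            a[I] = a[I+δ(j, n)]    # copy the inward neighbor onto the low face
        end
        for I ∈ slice(N, N[j], j)
            a[I] = a[I-δ(j, n)]    # and onto the high face
        end
    end
end

a = rand(4, 4)
BC!(a)
@assert a[1, 2] == a[2, 2] && a[end, 2] == a[end-1, 2]  # zero gradient in dim 1
@assert a[2, 1] == a[2, 2] && a[2, end] == a[2, end-1]  # zero gradient in dim 2
```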
4 Likes