Why does indexing into CartesianIndices lead to excessive runtime dispatch?

mkitti · February 20, 2024, 7:46am

The problem is the inability to infer the type of the output given the type of the input.

Right now the input type is Int but the output type is unclear since it depends on the value of the argument rather than its type.

Rather than passing an Int you may want to pass a Val where the dimensionality is encoded in the type.
Rather than using fill consider using ntuple with a Val argument.

I will edit this later with concrete examples.

First let’s look at type inferrence.

julia> @code_warntype neighbors(3)
MethodInstance for neighbors(::Int64)
  from neighbors(d::Int64) @ Main REPL[20]:1
Arguments
  #self#::Core.Const(neighbors)
  d::Int64                                                        Locals
  n::Any
  n_::CartesianIndices{N, R} where {N, R<:Tuple{Vararg{UnitRange{Int64}, N}}}
Body::Any
1 ─ %1  = (-1:1)::Core.Const(-1:1)
│   %2  = Main.fill(%1, d)::Vector{UnitRange{Int64}}
│   %3  = Core._apply_iterate(Base.iterate, Core.tuple, %2)::Tuple{Vararg{UnitRange{Int64}}}                                        │         (n_ = Main.CartesianIndices(%3))
│   %5  = n_::CartesianIndices{N, R} where {N, R<:Tuple{Vararg{UnitRange{Int64}, N}}}
│   %6  = Base.firstindex(n_)::Any
│   %7  = Base.lastindex(n_)::Integer
│   %8  = (%7 // 2)::Rational
│   %9  = (%6:%8)::Any
│   %10 = Base.getindex(%5, %9)::Any                              │         (n = %10)
└──       return %10

If you run this yourself you will see a lot of red text indicating type instability. Ultimately, Julia gave up figuring out the return type and just went with Any.

If we implement my suggestions, the situation will be greatly improved.

julia> function neighbors(v::Val{d}) where d
           n_ = CartesianIndices(ntuple(_->-1:1,v))
           return n_[begin:end÷2]
       end
neighbors (generic function with 2 methods)

julia> @code_warntype neighbors(Val(3))
MethodInstance for neighbors(::Val{3})
  from neighbors(v::Val{d}) where d @ Main REPL[28]:1
Static Parameters
  d = 3
Arguments
  #self#::Core.Const(neighbors)
  v::Core.Const(Val{3}())
Locals
  #5::var"#5#6"
  n_::CartesianIndices{3, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}}
Body::Vector{CartesianIndex{3}}
1 ─       (#5 = %new(Main.:(var"#5#6")))
│   %2  = #5::Core.Const(var"#5#6"())
│   %3  = Main.ntuple(%2, v)::Core.Const((-1:1, -1:1, -1:1))
│         (n_ = Main.CartesianIndices(%3))
│   %5  = n_::Core.Const(CartesianIndices((-1:1, -1:1, -1:1)))
│   %6  = Base.firstindex(n_::Core.Const(CartesianIndices((-1:1, -1:1, -1:1))))::Core.Const(1)
│   %7  = Base.lastindex(n_::Core.Const(CartesianIndices((-1:1, -1:1, -1:1))))::Core.Const(27)
│   %8  = (%7 ÷ 2)::Core.Const(13)
│   %9  = (%6:%8)::Core.Const(1:13)
│   %10 = Base.getindex(%5, %9)::Vector{CartesianIndex{3}}
└──       return %10

Now we see that Julia can figure out that if the input is of type Val{3} then the output will be of type Vector{CartesianIndex{3}}.

That’s better, but I am noticing that we have allocated a whole Vector.

julia> @time neighbors(Val(3))
  0.000008 seconds (1 allocation: 368 bytes)
13-element Vector{CartesianIndex{3}}:
 CartesianIndex(-1, -1, -1)
 CartesianIndex(0, -1, -1)
 CartesianIndex(1, -1, -1)
 CartesianIndex(-1, 0, -1)
 CartesianIndex(0, 0, -1)
 CartesianIndex(1, 0, -1)
 CartesianIndex(-1, 1, -1)
 CartesianIndex(0, 1, -1)
 CartesianIndex(1, 1, -1)
 CartesianIndex(-1, -1, 0)
 CartesianIndex(0, -1, 0)
 CartesianIndex(1, -1, 0)
 CartesianIndex(-1, 0, 0)

Instead we could just create a view of the CartesianIndices.

julia> function neighbors(v::Val{d}) where d
           n_ = CartesianIndices(ntuple(_->-1:1,v))
           return @view n_[begin:end÷2]
       end
neighbors (generic function with 2 methods)

julia> @time neighbors(Val(3))
  0.000003 seconds
13-element view(reshape(::CartesianIndices{3, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}}, 27), 1:13) with eltype CartesianIndex{3}:
 CartesianIndex(-1, -1, -1)
 CartesianIndex(0, -1, -1)
 CartesianIndex(1, -1, -1)
 CartesianIndex(-1, 0, -1)
 CartesianIndex(0, 0, -1)
 CartesianIndex(1, 0, -1)
 CartesianIndex(-1, 1, -1)
 CartesianIndex(0, 1, -1)
 CartesianIndex(1, 1, -1)
 CartesianIndex(-1, -1, 0)
 CartesianIndex(0, -1, 0)
 CartesianIndex(1, -1, 0)
 CartesianIndex(-1, 0, 0)

What the function returns acts the same as the Vector, but no memory is allocated!

julia> cis = neighbors(Val(3));

julia> cis[5]
CartesianIndex(0, 0, -1)

To maintain your original function signature, you can add the following method.

julia> neighbors(d::Int) = neighbors(Val(d))
neighbors (generic function with 2 methods)

julia> @code_warntype neighbors(3)
MethodInstance for neighbors(::Int64)
  from neighbors(d::Int64) @ Main REPL[41]:1
Arguments
  #self#::Core.Const(neighbors)
  d::Int64
Body::Any
1 ─ %1 = Main.Val(d)::Val
│   %2 = Main.neighbors(%1)::Any                                  └──      return %2

We see that type inferrence is still problematic. However something interesting occurs if we create a simple function wirh a known dimension.

julia> f() = neighbors(3)
f (generic function with 2 methods)

julia> @code_warntype f()
MethodInstance for f()
  from f() @ Main REPL[43]:1
Arguments
  #self#::Core.Const(f)
Body::SubArray{CartesianIndex{3}, 1, Base.ReshapedArray{CartesianIndex{3}, 1, CartesianIndices{3, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{UnitRange{Int64}}, false}
1 ─ %1 = Main.neighbors(3)::Core.Const(CartesianIndex{3}[CartesianIndex(-1, -1, -1), CartesianIndex(0, -1, -1), CartesianIndex(1, -1, -1), CartesianIndex(-1, 0, -1), CartesianIndex(0, 0, -1), CartesianIndex(1, 0, -1), CartesianIndex(-1, 1, -1), CartesianIndex(0, 1, -1), CartesianIndex(1, 1, -1), CartesianIndex(-1, -1, 0), CartesianIndex(0, -1, 0), CartesianIndex(1, -1, 0), CartesianIndex(-1, 0, 0)])
└──      return %1

Here we have taken advantage of constant propagation. Julia figured out that 3 is a constant and was able to do type inferrence.

Topic		Replies	Views
Runtime dispatch when broadcasting a view indexed with CartesianIndex array Performance performance , broadcast , array , views	1	493	August 5, 2021
Performance issues with CartesianIndices Performance question	5	589	February 24, 2022
Performance of using CartesianIndices with MArray Performance staticarrays , cartesianindices	2	487	March 2, 2022
Avoid runtime dispatch when overloading `isless()` Performance	2	340	March 24, 2022
Performance of Cartesian Indices and Views General Usage	0	384	February 18, 2020

Why does indexing into CartesianIndices lead to excessive runtime dispatch?

Related topics