Runtime dispatch when broadcasting a view indexed with CartesianIndex array

While playing with JETTest.jl, I noticed that it reports a runtime dispatch when view, broadcasting, and an array of CartesianIndex (not CartesianIndices) are used together. However, I haven’t observed any suspicious performance difference with BenchmarkTools.

using JETTest, BenchmarkTools

function foo(X, inds, Y)
    view(X, inds) .+= Y
end
X = collect(1:9)
inds = collect(LinearIndices(X)) # Matrix{Int}
Rinds = collect(CartesianIndices(X)) # Matrix{CartesianIndex{1}}
Y = collect(inds)

@report_dispatch foo(X, inds, Y) # no errors
@report_dispatch foo(X, Rinds, Y) # one runtime dispatch

@btime foo($X, $inds, $Y);
# 1.7.0-beta3: 23.129 ns (0 allocations: 0 bytes)
# 1.6.2: 25.331 ns (0 allocations: 0 bytes)

# why are there 0 allocations here if a runtime dispatch exists?
@btime foo($X, $Rinds, $Y);
# 1.7.0-beta3: 23.865 ns (0 allocations: 0 bytes)
# 1.6.2: 25.538 ns (0 allocations: 0 bytes)

I expected a runtime dispatch to show up as allocations in the benchmark results, but it doesn’t here.
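
As a cross-check that does not depend on BenchmarkTools, here is a minimal sketch that measures allocated bytes with Base's @allocated after a warm-up call. The helper name count_bytes is made up for illustration, and foo is restated so the snippet runs on its own. I am not claiming a particular number here; the @btime results above suggest both values should be 0.

```julia
# Restated from the post so the snippet runs on its own.
function foo(X, inds, Y)
    view(X, inds) .+= Y
end

# Hypothetical helper: bytes allocated by a single call to `foo`.
# Measuring inside a function keeps global-scope overhead out of the result.
count_bytes(X, inds, Y) = @allocated foo(X, inds, Y)

X = collect(1:9)
inds = collect(LinearIndices(X))
Rinds = collect(CartesianIndices(X))
Y = collect(inds)

# Warm-up: the first call per argument type includes compilation.
count_bytes(X, inds, Y)
count_bytes(X, Rinds, Y)

# Only these second measurements are meaningful.
bytes_linear = count_bytes(X, inds, Y)
bytes_cartesian = count_bytes(X, Rinds, Y)
println((bytes_linear, bytes_cartesian))
```

Note that every call mutates X, so the values in X grow with each measurement; that does not affect the allocation count.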

For comparison, I re-ran the test with LinearIndices and CartesianIndices directly (not collected), and was surprised by the performance gap:

X = collect(1:9)
inds = LinearIndices(X)
Rinds = CartesianIndices(X)
Y = collect(inds)

@report_dispatch foo(X, inds, Y) # no errors
@report_dispatch foo(X, Rinds, Y) # no errors

@btime foo($X, $inds, $Y);
# 1.7.0-beta3: 27.935 ns (0 allocations: 0 bytes)
# 1.6.2:       29.897 ns (0 allocations: 0 bytes)
@btime foo($X, $Rinds, $Y);
# 1.7.0-beta3: 14.933 ns (0 allocations: 0 bytes)
# 1.6.2:       14.716 ns (0 allocations: 0 bytes)

Any ideas about this result, or where I should start investigating?


The runtime dispatch error that JETTest found is:

julia> @report_dispatch foo(X, Rinds, Y) # one runtime dispatch
═════ 1 possible error found ═════
┌ @ REPL[34]:2 Base.materialize!(Main.view(X, inds), Base.broadcasted(Main.+, Main.view(X, inds), Y))
│┌ @ broadcast.jl:894 Base.Broadcast.copyto!(dest, Base.Broadcast.instantiate(Core.apply_type(Base.Broadcast.Broadcasted, _)(Base.getproperty(bc, :f), Base.getproperty(bc, :args), Base.Broadcast.axes(dest))))
││┌ @ broadcast.jl:980 Base.Broadcast.preprocess(dest, bc)
│││┌ @ broadcast.jl:966 Base.Broadcast.preprocess(dest, Base.getindex(args, 1))
││││┌ @ broadcast.jl:957 Base.Broadcast.unalias(dest, src)
│││││┌ @ subarray.jl:111 Base.copyto!(dest, V)
││││││┌ @ abstractarray.jl:1349 Base.unaliascopy(A)
│││││││┌ @ subarray.jl:112 Base.map(Base._trimmedindex, Base.getproperty(V, :indices))
││││││││┌ @ tuple.jl:213 f(Base.getindex(t, 1))
│││││││││┌ @ subarray.jl:117 Base.oftype(i, Base.reshape(Base.eachindex(Base.IndexLinear(), i), Base.axes(i)))
││││││││││┌ @ essentials.jl:375 Base.convert(Base.typeof(x), y)
│││││││││││┌ @ array.jl:532 _(a)
││││││││││││┌ @ array.jl:540 Base.copyto_axcheck!(Core.apply_type(Base.Array, _, _)(Base.undef, Base.size(x)), x)
│││││││││││││┌ @ abstractarray.jl:1056 Base.copyto!(dest, src)
││││││││││││││┌ @ abstractarray.jl:950 Base.copyto_unaliased!(Base.IndexStyle(dest), dest, Base.IndexStyle(src′), src′)
│││││││││││││││┌ @ abstractarray.jl:970 Base.setindex!(dest, Base.getindex(src, i), Base.+(i, Δi))
││││││││││││││││┌ @ array.jl:839 Base.convert(_, x)
│││││││││││││││││ runtime dispatch detected: Base.convert(_::Type{CartesianIndex{1}}, x::Int64)
││││││││││││││││└────────────────
(Core.PartialStruct(SubArray{Int64, 2, Vector{Int64}, Tuple{Matrix{CartesianIndex{1}}}, false}, Any[Vector{Int64}, Tuple{Matrix{CartesianIndex{1}}}, Core.Const(0), Core.Const(0)]), 1)

This code is not actually being executed, is it?

julia> convert(CartesianIndex{1}, 1)
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type CartesianIndex{1}
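
A minimal sketch to probe this without JETTest. It rests on an assumption that comes from my reading of the trace and of broadcast.jl, not from anything confirmed here: the unalias step is skipped when the destination and the source argument are identical (===). The two views built by .+= wrap the same parent and the same index array, so the sketch checks whether they compare identical, and whether the broadcast completes even though the flagged convert call would throw.

```julia
# Restated from the post so the snippet runs on its own.
function foo(X, inds, Y)
    view(X, inds) .+= Y
end

X = collect(1:9)
Rinds = collect(CartesianIndices(X))
Y = collect(1:9)

# `view(X, Rinds) .+= Y` builds the view twice: once as the destination
# and once as the first broadcast argument (see the first frame of the trace).
dest = view(X, Rinds)
src = view(X, Rinds)

# SubArray is an immutable struct, so two views holding the same parent
# and the same index array are expected to be identical under `===`.
same_view = dest === src

# The call flagged by JETTest: record whether it throws a MethodError.
convert_throws = try
    convert(CartesianIndex{1}, 1)
    false
catch err
    err isa MethodError
end

# If the flagged path were reached at run time, this call would throw too.
foo(X, Rinds, Y)

println((same_view, convert_throws))
```

If same_view is true and foo runs without error, that is consistent with the flagged branch being analyzed by inference but never executed for these arguments.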