Broadcasting `setindex!` over a tuple of arrays with splatted indices is slow

wsshin · March 11, 2018, 6:11am

To demonstrate the problem, create a tuple of two matrices, one with Float64 entries and the other with Int64 entries:

julia> VERSION
v"0.6.3-pre.0"

julia> mats = (rand(3,3), rand(Int64,3,3));

Also create a tuple of a Float64 and an Int64, which we will set as the (1,1) entries of the two matrices created above:

julia> vals = (0.0, 0);

Now, let’s set the entries by broadcasting setindex! over the tuples. The performance is pretty good with only one allocation:

julia> using BenchmarkTools

julia> @btime setindex!.($mats, $vals, 1, 1);
  10.381 ns (1 allocation: 32 bytes)

However, if I pass the indices as a splatted tuple, the performance suddenly degrades significantly, with 10 allocations:

julia> @btime setindex!.($mats, $vals, (1,1)...);
  394.860 ns (10 allocations: 288 bytes)

Why is this happening?

Note 1. This performance degradation does not happen if mats is a tuple of matrices of the same eltype:

julia> mats = (rand(3,3), rand(3,3));

julia> vals = (0.0, 0.0);

julia> @btime setindex!.($mats, $vals, (1,1)...);
  10.921 ns (1 allocation: 32 bytes)

Because the performance degradation happens when mats is a tuple of inhomogeneous types, I guess this problem has the same origin as this issue. However, then I’m not sure why the splat matters here.

Note 2. The situation is not much different in Julia 0.7.

mbauman · March 11, 2018, 7:02pm

This is because the base broadcast implementation for combinations of heterogeneous tuples and scalars is type-unstable:

julia> @code_warntype broadcast(+, (1.,1), 1)
…
  end::Tuple{Union{Float64, Int64},Union{Float64, Int64}}

The implementation with two tuples of the same length is easier — that’s just map which has a carefully constructed implementation to remain type-stable:

julia> @code_warntype broadcast(+, (1.,1), (1,1))
…
  end::Tuple{Float64,Int64}
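
To make the "carefully constructed" part concrete, here is a minimal sketch of the recursive style that lets inference follow a heterogeneous tuple element by element. The name `tuplemap` is hypothetical and this is only in the spirit of Base's tuple `map`, not a copy of it:

```julia
# Hypothetical helper, written only to illustrate the recursion pattern.
# Each call peels off the first element of both tuples, so the compiler
# sees a concrete element type at every step instead of a Union.
tuplemap(f, ::Tuple{}, ::Tuple{}) = ()
tuplemap(f, a::Tuple, b::Tuple) =
    (f(a[1], b[1]), tuplemap(f, Base.tail(a), Base.tail(b))...)

result = tuplemap(+, (1.0, 1), (1, 1))
```

The result keeps a Float64 in the first slot and an Int in the second, matching the `Tuple{Float64,Int64}` shown above.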

It’s currently hard to iteratively construct tuples of heterogeneous types in a way that inference can follow.
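
For the original problem, one possible way around this is to call `map` directly and capture the indices in a closure, so that only the two tuples are mapped over. This is a sketch under the assumption that the two-tuple `map` path stays type-stable as described above; the name `setall!` is made up for the example, and it has not been benchmarked here:

```julia
# Hypothetical helper: write vals[i] into mats[i] at the given indices.
# The indices live inside the closure, so map only ever sees the two tuples.
setall!(mats, vals, idx...) = map((m, v) -> setindex!(m, v, idx...), mats, vals)

mats = (rand(3, 3), rand(Int64, 3, 3))
vals = (0.0, 0)

# Splatted indices are fine here because they never reach broadcast.
setall!(mats, vals, (1, 1)...)
```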

Here’s how I got here: when debugging these sorts of things, I often find it helpful to use little function wrappers. Sometimes BenchmarkTools interacts with global scope in a way that I don’t expect. That’s not the case here, but the wrappers still help show why the two calls differ:

julia> f(mats, vals) = setindex!.(mats, vals, 1, 1)
       g(mats, vals) = setindex!.(mats, vals, (1,1)...)
g (generic function with 1 method)

julia> @btime f($mats, $vals);
  7.185 ns (1 allocation: 32 bytes)

julia> @btime g($mats, $vals);
  293.312 ns (6 allocations: 176 bytes)

So now you can also do @code_warntype on these guys to see that g(mats,vals)::Tuple{Union{…},Union{…}} while f is a type-stable Tuple{Array{Float64,2},Array{Int64,2}}.

The splatting is a red herring: the inference is different not because of the splatting, but because setindex!.(a, b, 1, 1) actually lowers to broadcast((a,b)->setindex!(a, b, 1, 1), a, b) — the numeric literals become a part of the function! Try:

julia> h(mats, vals, x, y) = setindex!.(mats, vals, x, y)
h (generic function with 1 method)

julia> @btime h($mats, $vals, 1, 1);
  298.037 ns (6 allocations: 176 bytes)

So now the difference isn’t in splatting, but rather it’s which arguments effectively get passed to broadcast.
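
The two cases can also be written out by hand as explicit `broadcast` calls. This is only a sketch of the distinction described above, with the variable names `mats_a`, `mats_b`, `x` and `y` chosen for illustration; no timings are claimed for it:

```julia
vals = (0.0, 0)

# Case 1: the literal indices are baked into the function being broadcast,
# so broadcast receives only two tuples of the same length.
mats_a = (rand(3, 3), rand(Int64, 3, 3))
broadcast((a, b) -> setindex!(a, b, 1, 1), mats_a, vals)

# Case 2: the indices are passed as extra scalar arguments, so broadcast
# receives a mix of heterogeneous tuples and scalars.
mats_b = (rand(3, 3), rand(Int64, 3, 3))
x, y = 1, 1
broadcast(setindex!, mats_b, vals, x, y)
```

Both calls write the same values; they differ only in which arguments broadcast itself has to handle.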
