Concerns about splatting

I’m trying to write some code which involves arrays of arbitrary rank and as it’s extremely difficult to avoid splatting (especially with the way many functions in Base are currently written) I’m getting concerned about splatting.

Roughly speaking, each time my program runs, I expect to use arrays of a consistent rank so that the splatted functions will only be called in a certain way. I’m still concerned however about the compilers ability to determine the input of splatted functions ahead of time.

My question is whether I need to worry about this sort of thing. As a simple example, if I have

getindex(A, idx...)

is this going to be inefficient even if A is of a consistent rank for a single run of the program? My hope is that if idx is a tuple it actually is possible for the compiler to determine which getindex it would have to call and I don’t have to worry.

Some of Base makes this sort of thing really difficult to avoid. I thought it would be nice to deal only with rank-1 indices using Base.ind2sub and Base.sub2ind but the latter actually takes splatted arguments. It seems to me that this can be easily avoided, and one possibility is to simply rewrite them.

1 Like

I’m not sure what’s your concern exactly, but it should be fine in general. Splatting can have a performance impact in some specific cases, but in general it should work as if a tuple was passed, so the number of arguments and their types can be inferred.

Let’s put it this way: here’s one of the simplest examples of things which I’d like to do. I’m not necessarily claiming this is the best way of doing this, but it should serve as a prototype for some other things.

Suppose I want to pass a tuple of integers, do some manipulations on them, and they use that to create an argument for getindex. Let’s say that there are N such integers. If I’m doing some manipulation on them, I’m probably going to get them back as a Vector, let’s call it idx. Is there any inefficiency I have to worry about when doing getindex(A, idx...)? If the length of idx just so happens to always be N, well this be as efficient as if I had done getindex(A, a, b, c,)?

AFAIK that will be equivalent to passing the elements one by one manually. But note that if you extract them from a vector, the compiler won’t know how many elements are passed, which means dynamic dispatch will be used. Better keep them as a tuple or a StaticArray if their number is fixed.

3 Likes

I did a quick experiment, and splatting tuples is indeed just as fast as accessing them directly:

julia> const z = rand(2, 2);

julia> using BenchmarkTools, StaticArrays
f
julia> f(x) = getindex(z, x...)
f (generic function with 1 method)

julia> g(x) = getindex(z, x[1], x[2])
g (generic function with 1 method)

julia> @btime f($((1, 2)))
  1.961 ns (0 allocations: 0 bytes)
0.9306418967436776

julia> @btime g($((1, 2)))
  1.963 ns (0 allocations: 0 bytes)
0.9306418967436776

whereas splatting a vector is substantially slower (also as expected);

julia> @btime f($([1, 2]))
  55.753 ns (1 allocation: 16 bytes)
0.9306418967436776

julia> @btime g($([1, 2]))
  2.249 ns (0 allocations: 0 bytes)
0.9306418967436776

To my surprise, splatting an SVector is extremely slow, and I have no idea why:

julia> @btime f($(SVector(1, 2)))
  458.490 ns (10 allocations: 432 bytes)
0.9306418967436776

julia> @btime g($(SVector(1, 2)))
  1.998 ns (0 allocations: 0 bytes)
0.9306418967436776

Your’e not trying hard enough:

julia> f(x) = getindex(z, x...)
f (generic function with 1 method)

julia> g(x) = getindex(z, x[1], x[2])
g (generic function with 1 method)

julia> h(x) = getindex(z,Tuple(x)...)
h (generic function with 1 method)

julia> @btime f($(SVector(1, 2)))
  752.758 ns (10 allocations: 432 bytes)
0.37259253572165574

julia> @btime g($(SVector(1, 2)))
  2.514 ns (0 allocations: 0 bytes)
0.37259253572165574

julia> @btime h($(SVector(1,2)))
  2.793 ns (0 allocations: 0 bytes)
0.37259253572165574

julia> @code_warntype f(SVector(1, 2))
Variables:
  #self#::#f
  x::SVector{2,Int64}

Body:
  begin 
      return (Core._apply)(Main.getindex, (Core.tuple)(Main.z)::Tuple{Array{Float64,2}}, 
                x::SVector{2,Int64})::Any
  end::Any

julia> @code_warntype h(SVector(1, 2))
Variables:
  #self#::#h
  x::SVector{2,Int64}

Body:
  begin 
      return (Base.arrayref)(Main.z,
               (Core.getfield)((Core.getfield)(x::SVector{2,Int64}, :data)::Tuple{Int64,Int64}, 1)::Int64, 
               (Core.getfield)((Core.getfield)(x::SVector{2,Int64}, :data)::Tuple{Int64,Int64}, 2)::Int64)::Float64
  end::Float64

Pardon the duplication, it gets all the times from one system. Obviously the compiler can optimize the last bit.

Would wrapping idx in Tuple(SVector(idx)) work for @ExpandingMan’s application?

Perhaps @nalimilan can say whether the above means that something is missing from the implementation of StaticArrays. Or maybe Splattability should be a trait, and traits should be built into the language?

2 Likes

Sorry, I’m not familiar with StaticArrays at all.

Lots of useful advice, thanks all. I think the upshot is that, as I originally believed, yes, the compiler has to be told how many elements are in the object being splatted somehow. The most straightforward way to achieve this is with tuples (though this is still difficult because often to create the needed tuple one needs to use splatting somehow) but alternatively one can convert static arrays into tuples.

I think my two best options right now are to either use static arrays or to write some functions using linear (one-index) indices. By the way, it would be really nice if there were a version of Base.sub2ind that two tuples as arguments rather than involving splatting.

There should be a tuple embedded in the type of a static array which should be able to be splatted.

I’ve registered TupleTools.jl a few days ago to gather some tools to manipulate tuples (specifically NTuple{N,Int}) as if they were vectors, but in a way that preserves inferrability of the result (and in particular the length). These were just small routines that I kept on redefining in every package I was creating. The goal is indeed manipulating tuples that correspond to array dimensionalities, strides or indices.

Any useful additions are welcome.

3 Likes