Why does defining a vararg method define a zero argument method?

The beauty of f(x, xs...) is in not needing special syntax for this.

When it comes to syntax, less is more.
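
For illustration, a minimal sketch of the pattern (the names here are just examples):

f(x, xs...) = (x, xs...)   # matches one or more arguments

f(1)         # (1,)
f(1, 2, 3)   # (1, 2, 3)
f()          # MethodError: no zero-argument method is defined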

11 Likes

It's probably more of an issue when you have a substantial type signature, like:

f(x::SomeType{A}, xs::SomeType{A}...)

Then maybe this is better:

f(x::T, xs::T...) where {T<:SomeType{A}}
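
For concreteness, a sketch with a hypothetical SomeType; the where clauses are filled in here so the signatures are complete:

struct SomeType{A}
    val::A
end

# spells the element type out twice
f(x::SomeType{A}, xs::SomeType{A}...) where {A} = (x, xs...)

# names the argument type once
g(x::T, xs::T...) where {A, T<:SomeType{A}} = (x, xs...)

f(SomeType(1), SomeType(2), SomeType(3))   # all arguments share A == Int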
2 Likes

Hi, I have two questions.
In this case:

  1. What is the best way to pass a collection?
  2. What is the best way to recover it inside the function?

Here's an example:

using BenchmarkTools

fun1(x, xs...) = (x, xs...)
fun2(xs...) = xs

xs = rand(Int(1e3))
x0, x1 = xs[1], xs[2:end]
@btime fun1($x0, $x1...) 
# 70.376 μs (2008 allocations: 78.72 KiB)
@btime fun2($xs...);
# 35.817 μs (1003 allocations: 39.38 KiB)
1 Like

I would also add that I cannot think of a case where I have to write this and then put the first element back into a tuple with the rest. You almost always either recurse, in which case having the first element separated out is useful, or you should be defining the single-element method explicitly.

3 Likes

As an argument, plain and simple. Splatting it is not idiomatic and incurs a significant compilation and performance penalty.
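
As a minimal sketch of that advice, reusing the benchmark setup above (fun3 is a name invented here):

using BenchmarkTools

fun3(xs) = xs       # take the collection itself as a single argument
xs = rand(1000)

@btime fun3($xs);   # no splatting, so no 1000-element tuple is ever built
# compare with @btime fun2($xs...) above, which costs roughly one allocation per element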

2 Likes

For

fn(x, xs...) = (x, xs...)

you don't need to split a collection into the first element and the rest; Julia does that automatically.
In your example, calling fun1(xs...) would work just fine (except for the extra allocations).
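
A small illustration, using fun1 from above (the vector v is just for this example):

fun1(x, xs...) = (x, xs...)

v = [10, 20, 30]
fun1(v...)   # x is bound to 10 and xs to (20, 30); no manual v[1], v[2:end] needed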

Got it!

Nice, I didn't know

Thanks

Exactly this situation came up for me when making small inert polynomials by summing terms. The intention is that Terms get collected now, to be iterated later, and the consumer can choose in which order to deal with them.

struct Term
    coefficient::Int64
    degree::Int64
end
struct PolyTerm{N}
    pt::NTuple{N,Term}
end
import Base: +
+(s::Term, t::Term...) = PolyTerm((s,t...))

I really want to write +(pt::Term...)=PolyTerm(pt), to avoid splitting and repacking my Tuple, but then I've accidentally gone and pirated +() to create a PolyTerm{0} instead of the MethodError people may have been counting on.
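
To spell out the piracy concern, a sketch continuing the definitions above:

+(pt::Term...) = PolyTerm(pt)   # also matches the zero-argument call +()

+()   # now returns PolyTerm{0}(()) instead of throwing the MethodError callers may rely on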

4 Likes

Why? I did not check, but ideally it should compile to the same code.

They don't, at least not at the @code_llvm level (is there another optimization step after that? Either way, there's a difference in either the run time of the compiled code or the compile time between the two versions). Edited to add an MWE:

struct Goose i::Int64 ; j::Int64 end
struct Flock{N} g::NTuple{N,Goose} end

Waddle(g::Goose...) = Flock(g)
Swim(g::Goose,moregeese::Goose...) = Flock((g,moregeese...))

@code_llvm Waddle(Goose(1,2), Goose(2,3), Goose(3,4))
@code_llvm Swim(Goose(1,2), Goose(2,3), Goose(3,4))

gives (sorry, not sure how to nicely format output)

;  @ REPL[3]:1 within `Waddle'
define void @julia_Waddle_170([1 x [3 x [2 x i64]]]* noalias nocapture sret, [2 x i64]* nocapture nonnull readonly dereferenceable(16), [2 x i64]* nocapture nonnull readonly dereferenceable(16), [2 x i64]* nocapture nonnull readonly dereferenceable(16)) {
top:
  %4 = alloca [3 x [2 x i64]], align 8
  %5 = bitcast [3 x [2 x i64]]* %4 to i8*
  %6 = bitcast [2 x i64]* %1 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %5, i8* nonnull align 1 %6, i64 16, i1 false)
  %7 = getelementptr inbounds [3 x [2 x i64]], [3 x [2 x i64]]* %4, i64 0, i64 1
  %8 = bitcast [2 x i64]* %7 to i8*
  %9 = bitcast [2 x i64]* %2 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %8, i8* nonnull align 1 %9, i64 16, i1 false)
  %10 = getelementptr inbounds [3 x [2 x i64]], [3 x [2 x i64]]* %4, i64 0, i64 2
  %11 = bitcast [2 x i64]* %10 to i8*
  %12 = bitcast [2 x i64]* %3 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %11, i8* nonnull align 1 %12, i64 16, i1 false)
  %13 = bitcast [1 x [3 x [2 x i64]]]* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %13, i8* nonnull align 8 %5, i64 48, i1 false)
  ret void
}

vs

;  @ REPL[4]:1 within `Swim'
define void @julia_Swim_171([1 x [3 x [2 x i64]]]* noalias nocapture sret, [2 x i64]* nocapture nonnull readonly dereferenceable(16), [2 x i64]* nocapture nonnull readonly dereferenceable(16), [2 x i64]* nocapture nonnull readonly dereferenceable(16)) {
top:
  %4 = alloca [2 x [2 x i64]], align 8
  %5 = alloca [3 x [2 x i64]], align 8
  %moregeese = alloca [2 x [2 x i64]], align 8
  %6 = bitcast [2 x [2 x i64]]* %4 to i8*
  %7 = bitcast [2 x i64]* %2 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %6, i8* nonnull align 1 %7, i64 16, i1 false)
  %8 = getelementptr inbounds [2 x [2 x i64]], [2 x [2 x i64]]* %4, i64 0, i64 1
  %9 = bitcast [2 x i64]* %8 to i8*
  %10 = bitcast [2 x i64]* %3 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %9, i8* nonnull align 1 %10, i64 16, i1 false)
  %11 = bitcast [2 x [2 x i64]]* %moregeese to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %11, i8* nonnull align 8 %6, i64 32, i1 false)
  %12 = getelementptr inbounds [2 x [2 x i64]], [2 x [2 x i64]]* %moregeese, i64 0, i64 1
  %13 = bitcast [3 x [2 x i64]]* %5 to i8*
  %14 = bitcast [2 x i64]* %1 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %13, i8* nonnull align 1 %14, i64 16, i1 false)
  %15 = getelementptr inbounds [3 x [2 x i64]], [3 x [2 x i64]]* %5, i64 0, i64 1
  %16 = bitcast [2 x i64]* %15 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %16, i8* nonnull align 8 %6, i64 16, i1 false)
  %17 = getelementptr inbounds [3 x [2 x i64]], [3 x [2 x i64]]* %5, i64 0, i64 2
  %18 = bitcast [2 x i64]* %17 to i8*
  %19 = bitcast [2 x i64]* %12 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %18, i8* nonnull align 8 %19, i64 16, i1 false)
  %20 = bitcast [1 x [3 x [2 x i64]]]* %0 to i8*
; ┌ @ REPL[2]:1 within `Flock' @ REPL[2]:1
   call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %20, i8* nonnull align 8 %13, i64 48, i1 false)
; └
  ret void
}

The distinction disappears if Goose has a single field.

Thanks, this is interesting. @btime reports no performance difference, so I am not sure whether this is an issue that needs to be (or can be) addressed.

My point was that in theory, I don't see a compelling reason why the two forms should be effectively different.

1 Like

That's a good example. As you observe, it only needs the "at least one" property in order to avoid piracy, and it happens to want to create a tuple internally, so the "at least one splat" would be really convenient here. If you were defining a function that you "owned", then the zero-argument method would be fine, e.g.:

PolyTerm(args::Term...) = PolyTerm{length(args)}(args)

julia> PolyTerm()
PolyTerm{0}(())

julia> PolyTerm(Term(2,3))
PolyTerm{1}((Term(2, 3),))
2 Likes

Seems like a good spot to mention an issue I was looking into.

In some situations, a splatted/slurped tuple leads to extra allocations over a tuple passed directly, when the tuple is used within an unoptimized throw block.

Extra allocations associated with unused ArgumentError · Issue #37639 · JuliaLang/julia (github.com)

It's a bit maddening because I can't see any obvious place in the LLVM where the additional allocation is occurring. And yet, all I need to do to eliminate the extra allocation is either 1) pass a complete tuple directly to the function or 2) ensure the slurped tuple is not used within the unoptimized block.
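
A hypothetical sketch of that pattern (not the MWE from the issue): the slurped tuple is referenced inside a throw branch, even though the branch is never taken on the happy path.

function first_positive(xs::Int...)
    for x in xs
        x > 0 && return x
    end
    throw(ArgumentError("no positive entry in $xs"))   # slurped tuple used in the throw block
end

first_positive(-1, 2, 3)   # the error path is dead here, yet extra allocations can show up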

So maybe it's rare, but sometimes they do compile differently.