I want to write a function `g(w, x...)`

with the following properties: If none of the arguments following `w`

is a vector, then the tuple `x`

is appended to `w`

(a vector). For each of the other vector arguments, the function should loop over it, replacing the vector in turn by each of its elements. (This is a toy example.) For instance,

```
julia> w = Vector{NTuple{3,Int}}()
julia> g(w, [1,2],3,[4,5])
4-element Vector{Tuple{Int64, Int64, Int64}}:
(1, 3, 4)
(1, 3, 5)
(2, 3, 4)
(2, 3, 5)
```

Instead of writing a `@generated`

function that produces the right number of loops for all possible inputs, I’m trying to have the compiler generate efficient code via multiple dispatch:

```
f(w, t) = push!(w, t)
f(w, t, x, y...) = f(w, (t..., x), y...)
f(w, t, v::Vector, y...) = for x in v ; f(w, t, x, y...) end
g(w, y...) = begin f(w, (), y...); w end
```

However, this leads to more allocations than a hard-coded version. For computing

```
v = [1,2]
w = Vector{NTuple{3,Int}}()
g(w, v,v,v) # as above
w = Vector{NTuple{3,Int}}()
g0(w, v,v,v) # hard-coded, see below
```

here is the result of using `--track-allocation=user`

:

```
- f(w, t) = push!(w, t)
48 f(w, t, x, y...) = f(w, (t..., x), y...)
336 f(w, t, v::Vector, y...) = for x in v ; f(w, t, x, y...) end
48 g(w, y...) = begin f(w, (), y...); w end
- function g0(w, v1::Vector, v2::Vector, v3::Vector)
0 for x1 in v1, x2 in v2, x3 in v3
208 push!(w, (x1,x2,x3))
0 end
0 w
- end
```

Where do the additional allocations come from and how can they be avoided?