Allocations in mapreduce/sum etc

I’m confused by why the following allocates (this is a reduced example that arose from trying to implement a function from #32739 via higher order functions instead of explicit loops):

g(v) = mapreduce(i->v[1]*i, +, eachindex(v))
@btime g($v)

Results in ... (1 allocation: 16 bytes) on v1.1. How can I get rid of that allocation?

A similar thing happens with e.g. sum(f, itr) as soon as the passed function f contains reference to any element of a captured array:

_h(i,v) = i*v[i]
h(v) = sum(i->_h(i,v), eachindex(v))

also produces ... (1 allocation: 16 bytes).

I guess that suggests that the allocations are not really due to mapreduce, foldl, sum, or similar, but to the anonymous function itself? Still, I don’t really get why this happens.

[This was tested on julia-v1.1.1, btw.]

For anyone who might stumble upon this in the future:

From some digging, this seems to basically be a consequence of two elements:

  1. #18207 (interplay of function arguments with recursive functions) and the recursive elements of mapreduce, and
  2. an allocating throw(ArgumentError(...)) call that handles empty-iterables inputs for non-specialized function input (oddly, this allocation can be removed by adding @inline to mymapreduce_empty(f,op,T), contrary to usual @noinline-strategy for avoiding spurious throw allocations).

Eliminating both of these elements removes the allocation noted above; but getting rid of recursion essentially butchers the method.

[It seems that recursion and zero allocations can co-exist, at least in principle, if specialization on the function input is forced via mapreduce_impl(f::F, ...) where F<:Function as suggested in #18207(comment); that’s probably not very viable for such a generic function thoughScratch that; doesn’t actually seem to work for me.]

I actually think that the root cause here is the fact that a closure over a non-isbits value (v) is created and subsequently passed to a non-inlined function (mapreduce or one of the functions it calls with the closure as an argument, recursively). The compiler essentially treats the function as a black box (or at least more so than if the function were inlined), and can’t figure out that the closure is short-lived; for all it knows, mapreduce could store a reference to the closure somewhere in a global array of objects. This means that the closure needs to be GC-rooted, which requires allocation.