Can `findmin` use a counter if `keys` is not defined for an iterator?

I want the index for which an iterator itr is minimized. I had expected

_, ind = findmin(f, itr)

to return the index, but this doesn’t work if keys(itr) is not defined. What does work is:

ind, _ = argmin(((ind, v),) -> f(v), enumerate(itr))

To take an example:

julia> z = zip([2,4], [6,1]);

julia> findmin(z) do (a,b)
           a + b
       end
ERROR: MethodError: no method matching keys(::Base.Iterators.Zip{Tuple{Vector{Int64}, Vector{Int64}}})

julia> argmin(enumerate(z)) do (ind, (a,b))
           a + b
       end
(2, (4, 1))

I understand that findmin returns the index of a value in the domain, whereas argmin returns the value in the domain for which the argument is minimized. However, this sounds like splitting hairs over what the definition of an index is, and whether an index means anything unless you may use it to index into the domain. What I want here is not really an index, but it may be referred to as a count, as I just want to run a counter and return the corresponding value for which the function is minimized. I’m assuming that even though the domain may not be indexed into, one may iterate over it. The argmin variant achieves exactly this. My question is, can findmin be made to behave analogously, or is it reserved for iterators where the domain supports indexing?

Related: findmin with filtered iterators fail · Issue #47124 · JuliaLang/julia · GitHub

That’s the conceptual difference between findmin and argmin: do you want an index in the domain, or a value of the domain? I don’t think that’s splitting hairs. In your case the domain is not indexable so I think it’s neat that you can still do what you want by using argmin with a purpose-built domain that includes the “count”.

Maybe what’s missing from Base is a simple way to wrap an iterator in a type that defines pairs/keys, for example

julia> Base.pairs(it::Iterators.Enumerate) = Iterators.map(Base.splat(=>), it);

julia> z = zip([2,4], [6,1]);

julia> findmin(enumerate(z)) do (a,b)
           a + b
       end
(5, 2)
1 Like

The following seems to produce the same result faster:

findmin(x -> x[1] + x[2], collect(z))

Does it make any sense?

collect works, but I was hoping that I won’t need to materialize the iterator

Alright, but where does the materialization occur in @sijo’s code, as it is allocating more?

That’s the sort of thing that I had in mind, but I wonder if pairs allows this? The docstring says

Return an iterator over key => value pairs for any collection that maps a set of keys to a set of values.

enumerate doesn’t exactly map keys to values, it allows iterating over both parallelly. In a loose sense, though, there does exist a mapping.

I think the splat is slow, this seems faster than collect:

z = zip([2,4], [6,1])

Base.pairs(it::Iterators.Enumerate) = Iterators.map(it) do (i, el)
   i => el
end

findmin(x -> x[1] + x[2], enumerate(z))
1 Like

I agree it’s not clear cut, but for me the output of enumerate is indeed a mapping (I think the doc here is using plain English, not a formal definition of “mapping” that would require indexing operation to work).

Also it seems hard to argue that Generator can have keys but Enumerate should not…

Brilliant, it is faster and with 0-allocations now :slight_smile:

1 Like

For reference, PR submitted here.

2 Likes

I confess that I have entered an area that I know little about.
But what do you think of the following idea?
I did some tests and it seems to give the expected result in times comparable to the other proposals.
But I’m not sure if that’s a generally valid solution, beyond the example tested.
Any comments are welcome in order to better understand how things are going.

Base.keys(it::Iterators.Zip) = Iterators.map( identity, 1:length(it))

# Base.keys(it::Iterators.Zip) = Iterators.map( identity, size(it))
1 Like

I guess keys don’t make sense for a Zip, as there isn’t a direct key-value relationship. It makes more sense for an Enumerate, although even there the key is a counter, and not something that you may use to index into the parent iterator

2 Likes

Yes I think keys make more sense for Enumerate as the enumeration assigns a kind of ID to each value. And it makes sense that you can wrap any iterator with enumerate to gain this functionality. Otherwise you would have to define methods for all iterators. Or to also cover user iterators you would need to define a generic fallback Base.keys(it), which was turned down here for good reasons I think.

Also for the problem in this thread, defining only keys wouldn’t work for iterators that have no length defined, while enumerating the pairs always works.