"Simpler" findfirst/firstall methods?

I wonder if there is a reason (computational, to avoid name collisions…) why there aren’t “simpler” findfirst/findall methods in Base:

a = ["aa","cc","cc","bb","cc","aa"]
findfirst(x -> x == "cc",a) # ok, but user needs to know/understand anonymous functions
findall(x -> x == "cc",a)   # ok, but user needs to know/understand anonymous functions
findfirst("cc",a)           # Seems "natural" but raises MethodError
findall("cc",a)             # Seems "natural" but raises MethodError
# But for UInt8 only we have it even for searching patterns:
findfirst([0x52, 0x62], [0x40, 0x52, 0x62, 0x63]) # 2:3
findfirst([0x52], [0x40, 0x52, 0x62, 0x63])       # 2:2

Adding the methods for the above calls is a one-line task:

import Base.findfirst, Base.findall
findfirst(el::T,cont::Array{T}) where {T} = findfirst(x -> x == el,cont)
findall(el::T,cont::Array{T}) where {T} = findall(x -> x == el,cont)
findfirst("cc",a) # 2
findall("cc",a)   # [2, 3, 5]

You already have the curried versions of == etc., e.g.,
findfirst(==("cc"),a)
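For readers who have not met the curried form before, here is a minimal sketch of what it does. It assumes Julia 1.2 or later, where the one-argument forms of ==, != and > are available (isequal(x) works the same way); the name is_cc is only for illustration.

```julia
a = ["aa", "cc", "cc", "bb", "cc", "aa"]

# ==("cc") builds a one-argument predicate, equivalent to x -> x == "cc"
is_cc = ==("cc")
is_cc("cc")                   # true
is_cc("aa")                   # false

findfirst(is_cc, a)           # 2
findall(==("cc"), a)          # [2, 3, 5]
findall(!=("cc"), a)          # [1, 4, 6]
findall(>(2), [1, 5, 2, 7])   # [2, 4]
```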

If you allow passing a value of the eltype as the first argument, there can be ambiguities in cases like

struct Foo <: Function
    a
end

Base.:(==)(a::Foo, b::Foo) = a.a == b.a
(a::Foo)(x) = 2a.a == x.a

vec = [Foo(1), Foo(2)]
findfirst(==(Foo(1)), vec)
findfirst(Foo(1), vec)

should you then

  1. Call Foo(1), in which case the answer is 2
  2. Compare with Foo(1), in which case the answer is 1

julia> findfirst(==(Foo(1)), vec)
1

julia> findfirst(Foo(1), vec)
2

findfirst(==(Foo(1)), vec) makes this clear.

I think this is a reasonable requirement since they are part of Julia and used pervasively.

I am not aware of a simpler general idiom than anonymous functions / closures to specify a predicate, which findfirst etc. need.

I understand that implementing specific cases turns out to make the language somewhat ugly, but probably the two most common uses would be solved with methods like:

findall(el::T,cont::Array{T}) where {T<:Number} = findall(isequal(el),cont)
findall(el::T,cont::Array{T}) where {T<:AbstractString} = findall(isequal(el),cont)

I wonder whether defining these would be worthwhile, to keep new users from getting frustrated. It is a very common pattern (many people, myself included, have tried it).

Actually, baggepinnen’s example is quite a corner case, and the compiler would actually tell you about the ambiguity… so one could handle a specific version for T<:Function and leave the rest for the “normal” “look for this element” case…

I understand that in Julia anonymous functions are everywhere, but for most of the people in my circle they would be quite an advanced topic… still, they would benefit from using Julia :)

EDIT: CartesianIndex is also not the simplest concept to grasp… one could reserve it for those who need it and otherwise just return the indices as plain tuples:

# Return plain tuples instead of CartesianIndex for multidimensional arrays; nothing (no match) passes through
findfirst(el::T, cont::Array{T}; returnTuple=true) where {T} =
    (i = findfirst(isequal(el), cont); returnTuple && i isa CartesianIndex ? Tuple(i) : i)
findall(el::T, cont::Array{T}; returnTuple=true) where {T} =
    ndims(cont) > 1 && returnTuple ? Tuple.(findall(isequal(el), cont)) : findall(isequal(el), cont)

There are complications in generalizing this. What if the array has element type Any? (This is an important corner case: one might argue that newcomers who don’t know what anonymous functions are may well fall into it by chance.) Should the first argument then be treated as a function to apply or as an element to search for?
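A small sketch of that corner case, using only existing Base methods plus one helper. The helper name findfirst_equal is hypothetical (it is not in Base); giving it its own name is just one way to sidestep the question of how to interpret the first argument.

```julia
# Hypothetical helper, not part of Base: always treats `el` as an element to look for
findfirst_equal(el, cont) = findfirst(isequal(el), cont)

# A Vector{Any} that happens to contain a function
mixed = Any[3, 4, iseven]

# Base's method treats the first argument as a predicate to apply:
findfirst(iseven, mixed)         # 2, because iseven(4) is true

# The helper treats it as a value to compare against:
findfirst_equal(iseven, mixed)   # 3, because the third element is iseven itself
```

With a single findfirst(x, A) entry point, both readings would be plausible for this call, which is the ambiguity described above.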

I see, so the problem is that there is already a method whose first parameter is a function, so this could lead to ambiguities; the only possibility would then be to implement the “simple” versions of findfirst/findall with strict, specific signatures… thank you

Funny, I was thinking about the search functions today, but my annoyance with them is a different one:

  1. I am not aware of any “unsafe” versions. Sometimes I just want to get a value matching a predicate from a collection where I am 100% sure such a value exists (and if it does not, I am completely OK with an exception being thrown). I do not want to get an index and then re-index the collection.
  2. Of course, these “simpler” variants could clash with the current behaviour of returning nothing when a value is not found, as nothing can be the value we are searching for. But even today, when using findfirst on a Dict, the key of the matching value may itself be nothing (and, therefore, the result is ambiguous).
  3. I am also not aware of versions that return both key and value, which can be relevant when indexing again is costly (e.g., Dicts and other more complex structures) and you are in a tight loop.

The plain English versions are user friendly:

a = ["aa","cc","cc","bb","cc","aa"]
findfirst(isequal("cc"), a)
findall(isequal("cc"), a)

Actually, I think I remember my feelings when I saw this for the first time. First, it was: findfirst('a',"abc") returns an error? Ouch!

Second it was: findfirst(isequal('a'),"abc")? That is horribly redundant.

Third was: oh, wait, I can pass any function with the notation x -> f(x)? That is awesome! And the whole power of anonymous functions in Julia was unleashed.

So, I think what we should do, when someone asks why that does not work, is to say:

You can find by any function using this notation: findall(x->f(x),v). This is very powerful, you see?

And that power comes with the slight inconvenience of having to write findall(isequal(x),v) if you just want to find the elements of v equal to x.
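To make that explanation concrete, here is a short sketch using only Base functions; the vector v is made up for illustration.

```julia
v = [3, 8, 1, 8, 10]

findall(x -> x > 5, v)        # [2, 4, 5]  any anonymous function works as a predicate
findall(isodd, v)             # [1, 3]     so does any named one-argument function
findall(isequal(8), v)        # [2, 4]     "equal to 8" is just one more predicate
findfirst(x -> x^2 > 50, v)   # 2
```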

Talking about small annoyances, does anyone know why this happens?

julia> findfirst(isequal("a"),"abca") == nothing
true

I mean, is there any deep reason not to return the index of the first appearance of the substring?

And yet for strings the simpler syntax works fine:

julia> findfirst("a","abca")
1:1
julia> findall("a","abca")
2-element Vector{UnitRange{Int64}}:
 1:1
 4:4

That’s because iterating over a string yields characters, not strings, and "a" is not equal to 'a'.
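A quick way to check this at the REPL, using only Base functions:

```julia
collect("abca")                   # ['a', 'b', 'c', 'a']
eltype("abca")                    # Char
"a" == 'a'                        # false: a String is never equal to a Char

findfirst(isequal("a"), "abca")   # nothing, no Char is equal to the String "a"
findfirst(isequal('a'), "abca")   # 1
findall(isequal('a'), "abca")     # [1, 4]
```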

See issue #29565 in JuliaLang/julia on GitHub (“`findfirst` on dictionaries returns ambiguous result”). That’s unfortunate, but returning nothing is more convenient in most situations (notably when the collection is an array or a string). We just need a variant that wraps the returned key in Some.
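As a rough sketch of what such a variant could look like: the function name findfirst_some is hypothetical and does not exist in Base; only Some, pairs and findfirst are real.

```julia
# Hypothetical helper, not part of Base: wraps a found key in Some,
# so that a key equal to nothing can be told apart from "not found"
function findfirst_some(pred, d)
    for (k, v) in pairs(d)
        pred(v) && return Some(k)
    end
    return nothing
end

d = Dict(nothing => 1, :b => 2)

findfirst(isequal(1), d)        # nothing (this is the key of the value 1)
findfirst(isequal(3), d)        # nothing (no match at all)

findfirst_some(isequal(1), d)   # Some(nothing)
findfirst_some(isequal(3), d)   # nothing
```

The caller can unwrap a Some result with something(result) or result.value.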

Well, that might be an explanation, but the behavior does not seem consistent to me:

julia> findfirst("a","abc")
1:1

julia> findall("a","abc")
1-element Array{UnitRange{Int64},1}:
 1:1

julia> findall(isequal("a"),"abc")
Int64[]

julia> findfirst(isequal("a"),"abc")
#nothing 

It just seems that someone decided that findfirst and findall should, for some reason, support the alternatives the OP asks for, where the first argument is not a function. Probably because the second argument is not a collection, which makes things easier. Still, it is a specialized implementation for one particular type of search.

first(Iterators.filter(f, c))?

I think doing that manually with pairs is not too bad

first(g(k,v) for (k,v) in pairs(d) if f(k,v))

(if it’s an AbstractDict you don’t even need the pairs call)
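A small worked version of these two one-liners, with made-up data. The predicate and the (k, v) tuple returned in the second case are just example choices for f and g.

```julia
c = [4, 9, 16, 25]

# Value of the first match, without going through an index
first(Iterators.filter(x -> x > 10, c))   # 16

# If nothing matches, first throws an ArgumentError instead of returning nothing:
# first(Iterators.filter(x -> x > 100, c))

# Key and value together, found in a single pass over the dictionary
d = Dict("a" => 1, "b" => 20)
first((k, v) for (k, v) in d if v > 10)   # ("b", 20)
```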

These are good alternatives. I would prefer that they were part of the find* API instead of one-line workarounds, but they suffice for most cases.

Yes, these special methods for strings were added recently, as they are not ambiguous (strings are not supposed to be functions) and it’s a common need. But there’s no inconsistency as nothing implies that findfirst(x, y) should be equivalent to findfirst(isequal(x), y).

That goes in favour of the OP’s suggestion, then, to create convenience functions for the common needs. But I think there is no ambiguity not because strings are not functions, but because the second argument is not an array; that is what makes them different.

Well, nothing is a little bit too strong there :)
