The new output of findfirst is dangerous

joa-quim · February 17, 2018, 5:03pm

Because findfirst can return a boolean, a nothing, or an iterator and, in this latter case, a 0:-1 when it finds nothing.
See this example:

julia> first(findfirst(fieldnames(GMT.GMTgrid), :z))
┌ Warning: `findfirst(A, v)` is deprecated, use `findfirst(equalto(v), A)` instead.
│   caller = top-level scope
└ @ Core :0
13

julia> first(findfirst(equalto(:z), fieldnames(GMT.GMTgrid)))
13

julia> first(findfirst(equalto(:u), fieldnames(GMT.GMTgrid)))
ERROR: MethodError: no method matching start(::Nothing)
Closest candidates are:
  start(::SimpleVector) at essentials.jl:550
  start(::Base.MethodList) at reflection.jl:659
  start(::ExponentialBackOff) at error.jl:171
  ...
Stacktrace:
 [1] first(::Nothing) at .\abstractarray.jl:198
 [2] top-level scope

yurivish · February 17, 2018, 5:40pm

I admit to being somewhat confused by this API as well. It seems to me that using nothing as a null value without explicit unwrapping makes it a bit too easy to write code that fails in surprising ways when the element is not found.

Explicit unwrapping with Some(...) would ensure that situation is always handled, and future syntax sugar could reduce the verbosity of cases where you know that the element you’re searching for is definitely there.

The impression I got when I asked about this on the related GitHub issue was that this was an intentional design choice, and that the alternative (requiring explicit unwrapping) was considered too verbose in the common cases.
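To make the explicit-unwrapping idea concrete, here is a minimal sketch, assuming Julia 1.x (where the thread's `equalto` is spelled `isequal`). The helper `findfirst_wrapped` is hypothetical and not part of Base; it only illustrates what a `Some`-returning search could look like.

```julia
# Hypothetical helper (not part of Base): wrap a successful search in `Some`,
# so the caller has to unwrap before using the index.
function findfirst_wrapped(pred, collection)
    idx = findfirst(pred, collection)
    return idx === nothing ? nothing : Some(idx)
end

v = [:x, :y, :z]

hit = findfirst_wrapped(isequal(:z), v)
miss = findfirst_wrapped(isequal(:u), v)

# `something` unwraps a `Some` and skips over `nothing`, so the fallback
# is spelled out at the call site.
hit_index = something(hit, 0)
miss_index = something(miss, 0)
```

The cost is the extra unwrapping step on every call, which is the verbosity the GitHub discussion reportedly wanted to avoid.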

Nosferican · February 17, 2018, 6:28pm

I was used to having it return 0 and then handling that case. I'm a bit taken aback by the decision to change this in 0.7.

yuyichao · February 17, 2018, 6:31pm

Why would you ever want to call first on a known scalar? And no, it’s not dangerous at all. You’ve demonstrated how useful it is for catching user error.

joa-quim · February 17, 2018, 6:43pm

No, what I have demonstrated is how EASY it is to fall into a non-user error.
And I called first() so that I can still test whether output != 0, which is now also more or less useless because of the nothing.

kristoffer.carlsson · February 17, 2018, 6:44pm

It is completely valid for an AbstractArray to have 0 as a valid index. Then what do you do?

yuyichao · February 17, 2018, 6:49pm

That is the user error. Why do you want to do such a useless test?

Nosferican · February 17, 2018, 6:56pm

I wasn’t aware that AbstractArray supported indices outside $\mathbb{N}$. I believed it was only possible to index arrays from 1:size(obj, dim). If that is the case, I don’t think returning 0 was good behavior. I wouldn’t suggest a negative index either, since that might represent (end - index) in future versions. In that case, nothing seems appropriate: just dispatch on either an Integer or nothing.
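A small sketch of dispatching on either an Integer or nothing, assuming Julia 1.x syntax. The function `describe` is a hypothetical name used only for illustration.

```julia
# Hypothetical handlers, selected by the type of the search result.
describe(idx::Integer) = "found at index $idx"
describe(::Nothing) = "not found"

v = [10, 20, 30]

found = describe(findfirst(isequal(20), v))
absent = describe(findfirst(isequal(99), v))
```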

joa-quim · February 17, 2018, 6:56pm

julia> findfirst("abcd","ef") == 0
false

julia> first(findfirst("abcd","ef")) == 0
true

Though I see that I can use isempty() for this case. But not for

isempty(nothing)
ERROR: MethodError: no method matching start(::Nothing)
Closest candidates are:
...

yurivish · February 17, 2018, 7:00pm

For a bit of additional texture on the use cases enabled by more flexible indexing I recommend having a look through the JuliaArrays org: https://github.com/JuliaArrays.

Two particularly good examples of nontraditional but highly useful indexing are AxisArrays.jl and OffsetArrays.jl.

yuyichao · February 17, 2018, 7:02pm

That’s not even the same method. It’s not searching for an element in a collection at all so why do you expect it to have exactly the same behavior as searching an array?

piever · February 17, 2018, 7:49pm

I don’t know much about the findfirst use case for strings, as I very rarely do string processing, but concerning the element-in-a-collection method, I find the new behavior much better. Returning 0 is not very principled and could lead to issues (for example, if I’m using OffsetArrays, did I find 0 or did I not find anything at all?).

Returning nothing has several advantages:

- There can be no confusion as to whether a match was found or not.
- One can dispatch on the returned type to decide what to do next.
- If desired, it’s easy to replace it with some other value with coalesce:

coalesce(findfirst(equalto(el), v), 0)

To simplify checking for nothingness, I wonder whether it’d be helpful to have isnothing (analogous to ismissing), but it’s also true that x isa Nothing isn’t much longer.
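For readers on a released Julia 1.x: there, `coalesce` only skips `missing`, and the `nothing`-aware counterpart is `something`. A minimal sketch under that assumption, with `isequal` standing in for the thread's `equalto`:

```julia
v = [:a, :b, :c]

# `something` returns its first argument that is not `nothing`,
# so a failed search falls back to 0 here.
idx_found = something(findfirst(isequal(:b), v), 0)
idx_absent = something(findfirst(isequal(:q), v), 0)
```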

kristoffer.carlsson · February 17, 2018, 7:51pm

I would also like an isnothing(x) = x === nothing in base. It looks silly but I think it would be useful.
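A function with this name did later land in Base; the sketch below assumes Julia 1.1 or newer, where `isnothing` is available, and compares it with the identity check.

```julia
v = ["x", "y"]
idx = findfirst(isequal("z"), v)

# Both checks give the same answer; `isnothing` needs Julia 1.1 or later.
by_identity = idx === nothing
by_function = isnothing(idx)
```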

mbauman · February 17, 2018, 9:46pm

findfirst will always just return the index of the first item it finds. If it doesn’t find anything, it returns nothing. It just so happens that vectors use integers as their indices, and numbers are iterable pseudo-collections of themselves.

Personally, I find this a lot less dangerous than the previous behavior. Note that returning 0 instead of a valid index doesn’t require any explicit error checking, either, but it might actually work in your computation and lead to silently wrong answers.
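Both points can be sketched briefly, assuming Julia 1.x. The function `old_style_findfirst` is hypothetical and only imitates a 0 sentinel for "not found"; the data is made up for illustration.

```julia
# A scalar behaves like a one-element collection of itself,
# which is why `first` on an integer index returns that integer.
scalar_first = first(13)

# Hypothetical old-style search (not Base) that signals "not found" with 0.
function old_style_findfirst(x, collection)
    idx = findfirst(isequal(x), collection)
    return idx === nothing ? 0 : idx
end

names = [:apple, :pear, :plum]
prices = [5.0, 7.5, 9.0]

# The sentinel flows into arithmetic without any error:
# "one past the match" silently becomes index 1.
next_idx = old_style_findfirst(:kiwi, names) + 1
silently_wrong = prices[next_idx]

# With `nothing`, the same arithmetic fails loudly instead.
failed_loudly = try
    findfirst(isequal(:kiwi), names) + 1
    false
catch err
    err isa MethodError
end
```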

nalimilan · February 18, 2018, 11:15am

2 posts were split to a new topic: Findfirst() with Dict for which nothing is a valid key

nalimilan · February 18, 2018, 11:26am

There appears to be some confusion here about how the findfirst methods behave. The standard findfirst method operates over elements in a collection and returns the index/key of the first matching element, or nothing if no element matches.

Then, there’s the special case of looking for a substring in a string, e.g. findfirst("ab", "abc") or findfirst(r"ab", "abc"). This method is very different from the general one, since it looks for a sequence of elements (here, characters) in a collection (here, a string). So it returns the range corresponding to the first matching sequence rather than a single index (indeed, with a regular expression you don’t know the length of the matching substring in advance, so the range carries additional information). If there’s no match, an empty range is returned. We could imagine returning nothing instead for consistency.
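The difference between the methods can be sketched as follows. This assumes a released Julia 1.x, where the no-match case for substring search returns `nothing` rather than the empty range described for the development version in this thread.

```julia
# Element search: returns a single index (or `nothing`).
elem_idx = findfirst(isequal('b'), "abc")

# Substring search: returns the range covered by the match.
sub_range = findfirst("ab", "abc")

# Regex search: the match length is only known after searching.
re_range = findfirst(r"b+", "abbbc")

# No match: on Julia 1.x this is `nothing`, not an empty range.
no_match = findfirst("ef", "abcd")
```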

At any rate, I don’t understand why the OP wants to call first on the result of findfirst(equalto(:z), fieldnames(x)), given that this function returns a scalar.

joa-quim · February 18, 2018, 2:08pm

Agreed, it makes no sense here, but I’m in the middle of the 0.6 to 0.7 transition, and for the same code to work with strings I have to call first(findfirst("abcd","ef")) to get a scalar that I can compare to 0.

first(findfirst("abcd","ef")) == 0

piever · February 18, 2018, 2:24pm

If you need this pattern a lot (getting 0 if no element or substring is found, and otherwise the element index for arrays or the first index of the substring for strings), you may define your own helper function:

julia> myfirst(n) = first(n)
myfirst (generic function with 1 method)

julia> myfirst(::Nothing) = 0
myfirst (generic function with 2 methods)

julia> map(myfirst, (0:-1, 2:3, 4, nothing))
(0, 2, 4, 0)

In principle it does seem a bit more consistent for findfirst to return nothing rather than the empty range in the substring case, but maybe there are advantages in always returning a range that I’m not seeing.

ScottPJones · February 18, 2018, 3:07pm

What I’m not sure of is how much the compiler would be able to do with this to get back to the performance it used to have, since it involves a dynamic dispatch.

nalimilan · February 18, 2018, 4:00pm

I still don’t get it. On Julia 0.6 findfirst("ef","ef") returns 0 since it’s looking for a character equal to "ef". So that doesn’t sound very useful and it’s completely different from what happens on 0.7 (disregarding the question of nothing).

Anyway, use Compat.findfirst if you want to support both Julia 0.6 and 0.7 using the 0.7 API.
