The correct use of findfirst

strings
#1

DOCS> “You can search for the index of a particular character
using the findfirst and findlast functions.”

julia> findfirst(isequal('('), "André(m, n)")
7

julia> findfirst(isequal('('), "Andre(m, n)")
6

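The index of the same character changes as soon as the preceding text contains a non-ASCII character, which seems likely to lead to mistakes.
What is the recommended way to handle this?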


#2

In this case, findfirst is giving you precisely the index you would use to extract that character:

julia> s = "André(m, n)"
"André(m, n)"

julia> s[7]
'(': ASCII/Unicode U+0028 (category Ps: Punctuation, open)

Is your actual goal something other than this? Do you want to count the number of letters that occur before the ( character?
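
If the goal is a character count rather than an index, a minimal sketch might look like the following. It assumes the string is a regular UTF-8 String, where indices are byte offsets and 'é' occupies indices 5 and 6; the variable names are only illustrative.

```julia
s = "André(m, n)"

# Byte index of '(' (this is what findfirst returns).
i = findfirst(isequal('('), s)

# Indices at which a character starts; 6 is skipped because 'é' is two bytes.
valid = collect(eachindex(s))

# Number of characters strictly before '('.
nbefore = length(s, 1, prevind(s, i))

println(i)        # 7
println(nbefore)  # 5
```

So findfirst answers "where does '(' start?", while length(s, 1, prevind(s, i)) answers "how many characters come before it?".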


#3

Is your actual goal something other than this?

For example consider:

function extractName(s::String)::String
    i = findfirst(isequal('('), s)
    return s[1:(i-1)]
end

Here I expect

extractName("Andre(m, n)") |> println

to print “Andre”, which it does. Similarly, I expect

extractName("André(m, n)") |> println

to print “André”, but I get

ERROR: LoadError: StringIndexError("André(m, n)", 6)

#4

Have you read through https://docs.julialang.org/en/v1/manual/strings/index.html?


#5

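Not all indices into a string are valid. String indices are byte offsets into the UTF-8 encoding, and some characters span multiple bytes, so an index may fall in the middle of a character. Here 'é' occupies indices 5 and 6, so i - 1 = 6 is not the start of a character, which is why s[1:(i-1)] throws a StringIndexError. Use prevind to get the previous valid index instead: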

julia> s = "André(m, n)"
"André(m, n)"

julia> i = findfirst(isequal('('), s)
7

julia> s[1:prevind(s, i)]
"André"
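
Applied to the function from the question, this could look like the sketch below. The name name_before_paren is hypothetical, and returning the whole string when there is no '(' is an assumption about the desired behavior, not something stated in the thread.

```julia
# Hypothetical helper: returns everything before the first '(' in s,
# or s itself if s contains no '('.
function name_before_paren(s::AbstractString)
    i = findfirst(isequal('('), s)
    # findfirst returns nothing when the character is absent.
    i === nothing && return String(s)
    # prevind gives the start of the character preceding '(',
    # which is a valid index even after a multi-byte character.
    return String(s[1:prevind(s, i)])
end

println(name_before_paren("Andre(m, n)"))  # Andre
println(name_before_paren("André(m, n)"))  # André
```

The explicit check for nothing also avoids a MethodError from computing i - 1 or prevind(s, nothing) on input without a parenthesis.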

Although in this case a regular expression might be more appropriate:

julia> m = match(r"(.*)\(", s)
RegexMatch("André(", 1="André")

julia> m.captures[1]
"André"
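
One caveat: `.*` is greedy, so with several parentheses the capture runs up to the last '('. A possible variant that stops at the first one is sketched here; the function name name_via_regex and the choice to return nothing on a failed match are assumptions for illustration.

```julia
# Hypothetical helper: captures the text before the first '(' using a
# negated character class, so the match cannot run past an earlier '('.
function name_via_regex(s::AbstractString)
    m = match(r"^([^(]*)\(", s)
    # match returns nothing when the pattern does not occur.
    m === nothing && return nothing
    return String(m.captures[1])
end

println(name_via_regex("André(m, n)"))  # André
println(name_via_regex("f(g(x))"))      # f

# Without a regex, splitting on '(' gives the same prefix.
println(first(split("André(m, n)", '('; limit=2)))  # André
```

Neither approach does any index arithmetic, so the multi-byte issue does not come up.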