Find the indices of all occurrences of a string


Say I have a string like:

"aaabbbaaabbbaaabbb"

Then I want to find the indices of every occurrence of “aaa”, so I would expect to get something like:

index = [1:3,7:9,13:15]

I know that “findfirst” exists and I could write a helper function around it, but I wondered whether there is an easier approach.
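
For reference, the helper mentioned above could look roughly like the sketch below. The name find_ranges is made up for illustration, and it assumes a non-empty search string and that non-overlapping matches are wanted. It relies only on findfirst, findnext and nextind from Base.

```julia
# Hypothetical helper (not part of Base): collect the index range of every
# non-overlapping occurrence of `needle` in `haystack`.
# Assumes `needle` is non-empty; an empty needle would loop forever.
function find_ranges(needle::AbstractString, haystack::AbstractString)
    ranges = UnitRange{Int}[]
    r = findfirst(needle, haystack)
    while r !== nothing
        push!(ranges, r)
        # Resume the search just after the last character of this match.
        start = nextind(haystack, last(r))
        start > lastindex(haystack) && break
        r = findnext(needle, haystack, start)
    end
    return ranges
end

find_ranges("aaa", "aaabbbaaabbbaaabbb")   # [1:3, 7:9, 13:15]
```

To allow overlapping matches instead, the search could resume from nextind(haystack, first(r)).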

One option would be to use a regular expression. You can provide a third argument to match() instructing the search to start from a particular position in the string, which you can use to find sequential matches like so:

julia> pattern = r"aaa" # the r"" prefix makes this a regular expression
r"aaa"

julia> target = "aaabbbaaabbbaaabbb"
"aaabbbaaabbbaaabbb"

julia> m = match(pattern, target)
RegexMatch("aaa")

julia> m.offset
1

julia> m = match(pattern, target, m.offset + 1)
RegexMatch("aaa")

julia> m.offset
7

julia> m = match(pattern, target, m.offset + 1)
RegexMatch("aaa")

julia> m.offset
13

Thanks, now I can try benchmarking both approaches. It seems I will have to write a loop if my keyword appears multiple times.
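
Such a loop might be sketched as follows. The name match_ranges is hypothetical, and the sketch assumes the pattern never matches the empty string and that overlapping matches are not wanted, so each search restarts after the end of the previous match rather than at m.offset + 1.

```julia
# Hypothetical helper: collect the index range of every regex match,
# restarting each search just after the end of the previous match.
# Assumes the pattern cannot match an empty string.
function match_ranges(pattern::Regex, target::AbstractString)
    ranges = UnitRange{Int}[]
    m = match(pattern, target)
    while m !== nothing
        # Index at which the last matched character starts.
        stop = m.offset + lastindex(m.match) - 1
        push!(ranges, m.offset:stop)
        # Third argument of match: the index to start searching from.
        next = nextind(target, stop)
        next > lastindex(target) && break
        m = match(pattern, target, next)
    end
    return ranges
end

match_ranges(r"aaa", "aaabbbaaabbbaaabbb")   # [1:3, 7:9, 13:15]
```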

In theory findall("aaa", "aaabbbaaabbbaaabbb") should probably do what you request. That would be consistent with findfirst and friends. Feel free to file a feature request.


Will do so tomorrow. I also wondered why it didn’t.

I think it used to, or there was some similar function that did. There was a major refactor of the search and find* functions before the release of 1.0, so you’ll probably find something about this in that issue or related PRs.
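
A note for readers on newer versions: as far as I know, findall has accepted a string or regex pattern as its first argument since Julia 1.3, so on those versions the call suggested above should work directly. A minimal sketch, assuming Julia 1.3 or later:

```julia
# Requires Julia 1.3 or later (assumption: older versions lack this method).
target = "aaabbbaaabbbaaabbb"

plain = findall("aaa", target)      # [1:3, 7:9, 13:15]
regex = findall(r"aaa", target)     # same ranges, using a regular expression

# Matches do not overlap by default; the `overlap` keyword changes that.
separate    = findall("aa", "aaaa")                  # [1:2, 3:4]
overlapping = findall("aa", "aaaa"; overlap=true)    # [1:2, 2:3, 3:4]
```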


How about using eachmatch? (I guess it’s a natural next step from @rdeits’s suggestion.)

julia> s = "aaabbbaaabbbaaabbb"
julia> matchrange(m::RegexMatch) = m.offset .+ (0:length(m.match)-1)  # named to avoid clashing with Base.range
julia> [matchrange(e) for e ∈ eachmatch(r"aaa", s)]  # gives [1:3, 7:9, 13:15]

Edit: if the match may contain characters wider than one code unit, length counts characters rather than indices, so you’d have to adjust matchrange to something like m.offset .+ (0:lastindex(m.match)-1).
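
To illustrate the multi-byte case, here is a small sketch with Greek letters, each of which takes two bytes in UTF-8. The helper name index_range is made up for this example. It uses lastindex on the matched substring so that the range ends at the index where the last matched character starts.

```julia
# Hypothetical helper: range of string indices covered by a match,
# intended to stay valid when characters span more than one code unit.
index_range(m::RegexMatch) = m.offset:(m.offset + lastindex(m.match) - 1)

s = "ααβααβ"   # valid indices are 1, 3, 5, 7, 9, 11
ranges = [index_range(m) for m in eachmatch(r"αα", s)]   # [1:3, 7:9]

# Slicing with each range recovers the matched text.
pieces = [s[r] for r in ranges]   # ["αα", "αα"]
```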