Here f would be given the string "a" in both matches, and I have no indication of which group matched. Splitting replace into multiple runs is not possible, since the context for each match will change.
Currently I am working around this by providing a custom method
Base._replace(io::IO, repl_s::_MyT, str, r, re::Base.RegexAndMatchData) = begin
n = Base.PCRE.substring_length_bynumber(re.match_data, 1)
...
but I am not comfortable patching internal methods.
Is there a better way? For reference, other languages would pass the full match object to f in this case.
There might also be something you could do with eachmatch, but it would take some work to plumb that into a replace-like function that actually makes the substitution based on which group matched.
# The RegexMatches contain more info than they show here, although some is likely internal.
# Try calling `dump` on one.
julia> eachmatch(r"\b(\w)|(\w)(?=c)", "abac") |> collect
2-element Vector{RegexMatch}:
RegexMatch("a", 1="a", 2=nothing)
RegexMatch("a", 1=nothing, 2="a")
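The plumbing could look roughly like the sketch below. `replace_with_match` is a hypothetical helper name, not something in Base; it walks over `eachmatch`, copies the unmatched text between matches, and passes the whole `RegexMatch` to `f`. It assumes a single regex and non-overlapping matches, and it has not been checked against every edge case that `replace` handles.

```julia
# Hypothetical helper: like replace(str, re => f), but f receives the RegexMatch
# instead of only the matched SubString.
function replace_with_match(f, str::AbstractString, re::Regex)
    io = IOBuffer()
    pos = firstindex(str)                  # first index not yet copied to the output
    for m in eachmatch(re, str)
        # copy the unmatched text between the previous match and this one
        print(io, SubString(str, pos, prevind(str, m.offset)))
        # let f decide the replacement, with access to all capture groups
        print(io, f(m))
        pos = m.offset + ncodeunits(m.match)
    end
    print(io, SubString(str, pos))         # copy whatever follows the last match
    return String(take!(io))
end

# Group 1 matched (word boundary) gives "X"; otherwise group 2 matched (before c) gives "_"
pick(m) = m[1] !== nothing ? "X" : "_"
replace_with_match(pick, "abac", r"\b(\w)|(\w)(?=c)")   # "Xb_c"
```

Because `f` sees `m[1]` and `m[2]`, it can tell the two alternatives apart even though both matched the string "a".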
There might be room to add a feature to replace so that the function is passed the entire RegexMatch object rather than only the SubString. That is arguably how it should have worked to begin with.
Iterating over eachmatch seems to be equivalent to what I’m doing if I change the regex a little, to r"\b(\w)|(\w)(?=c)|(.)". I’ll give it a try. Thank you.
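As a rough sketch of that idea: with the catch-all third group, every character ends up in some match, so the output can be built by mapping each RegexMatch to its replacement and joining the pieces. The name `substitute_match` and the replacement strings "X" and "_" are placeholders for illustration. Note that `.` does not match a newline by default, so this assumes single-line input (or the `s` flag on the regex).

```julia
# Hypothetical per-match rule: decide the replacement from which group matched.
function substitute_match(m::RegexMatch)
    m[1] !== nothing && return "X"   # matched after a word boundary
    m[2] !== nothing && return "_"   # matched before a "c"
    return m.match                   # catch-all group: keep the character unchanged
end

result = join(substitute_match(m) for m in eachmatch(r"\b(\w)|(\w)(?=c)|(.)", "abac"))
# result == "Xb_c"
```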
If all you need is a simple lookup table of replacements, you could consider something like
julia> wordboundarydict = Dict("a" => "X", "e" => "Y", "o" => "Z"); # replacements after word boundary
julia> beforecdict = Dict("a" => "_", "e" => ":", "o" => "."); # replacements before c
julia> substitute(dict) = key -> get(dict, key, key); # access a key if possible, else returns the key
julia> replace("abac", r"\b(\w)" => substitute(wordboundarydict), r"(\w)(?=c)" => substitute(beforecdict))
"Xb_c"
julia> replace("ebec", r"\b(\w)" => substitute(wordboundarydict), r"(\w)(?=c)" => substitute(beforecdict))
"Yb:c"
julia> replace("oboc", r"\b(\w)" => substitute(wordboundarydict), r"(\w)(?=c)" => substitute(beforecdict))
"Zb.c"
julia> replace("xbec", r"\b(\w)" => substitute(wordboundarydict), r"(\w)(?=c)" => substitute(beforecdict)) # no replacement for x
"xb:c"
But otherwise it sounds like you have a plan you can use based on eachmatch. Good luck!
EDIT: I finally took a look at the Base._replace function you were modifying. Comments in that file suggest that packages might extend some of those functions. That implies they probably won’t change much, although I don’t see anything that amounts to a guarantee. All in all, it looks like your approach there (with your custom types to avoid piracy) is mostly okay if that ends up being the nicest way.