Cannot find overloaded method

At least for interactive usage it is often convenient to have a replace method for strings that can perform multiple replacements at once. So I tried to define it, but Julia (1.5) doesn’t find the new method even though it is shown in the MethodError:

julia> function Base.replace(str::AbstractString, old_new::Pair...)
           for o_n in old_new
               str = replace(str, o_n)
           end
           return str
       end

julia> replace("abc", "a" => "x", "b" => "y")
ERROR: MethodError: no method matching replace(::String, ::Pair{String,String}, ::Pair{String,String})
Closest candidates are:
...
  replace(::AbstractString, ::Pair...) at REPL[2]:1
...

Even stranger, everything works as expected if I replace AbstractString with String in the method definition.

Any idea what goes wrong? Is it a bug in Julia?

I know this is type piracy, but I see nothing wrong with it for interactive usage.

In the “closest candidates” there is

 replace(::AbstractString, ::Pair, ::Pair) at set.jl:594

So it looks like the replace method taking exactly two pairs and an AbstractString is overriding your vararg definition with AbstractString. Note that your function works when you input 3 or more pairs.

The line in set.jl is here.

Fwiw, I think you’ve hit on something weird. replace("abc", "a" => "x", "b" => "y") seems to work on every collection except for AbstractString. And the fact that it’s not handled is only due to some method ambiguities.

I would maybe file an issue. There could be some deeper reason for this that I don’t know about though.

That doesn’t actually work:

julia> function Base.replace(str::AbstractString, old_new::Vararg{Pair,N}) where N
         println("hello world")
       end

julia> replace("abc", "a" => "x", "b" => "y")
ERROR: MethodError: no method matching replace(::String, ::Pair{String,String}, ::Pair{String,String})
Closest candidates are:
  replace(::AbstractString, ::Pair, ::Pair) at set.jl:592
  replace(::AbstractString, ::Pair...) where N at REPL[1]:1

which isn’t too surprising since the explicit Vararg{Pair, N} where N is only different in that it allows access to N within the method body. Spelling this as Pair... vs Vararg{Pair, N} where N doesn’t seem to affect the weirdness going on here.

julia> function f(str::AbstractString, old_new::Vararg{Pair,N}) where N
           1
       end
f (generic function with 1 method)

julia> f("abc", "a" => "x", "b" => "y")
1

That’s weird: the same signature works here, but it doesn’t work for Base.replace.

As a sanity check, using a different parametric type instead of Pair works just fine:

julia> function Base.replace(str::AbstractString, old_new::AbstractVector...)
         println("hello world")
       end

julia> replace("abc", [1,2,3], [4,5,6])
hello world

Notice that the replace method on line 594 of set.jl explicitly throws a MethodError.

replace(a::AbstractString, b::Pair, c::Pair) = throw(MethodError(replace, (a, b, c)))

:man_facepalming:


Hahahahaha. That’s amazing. Ok, yeah, that’s the problem–there is a more specific method, but it throws a MethodError which makes it look like no method exists.
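The effect can be reproduced with a toy function. This is only a sketch: `g` is a made-up name standing in for Base.replace, and the two methods mirror the layout described above (a vararg method plus a more specific two-pair method that throws).

```julia
# Toy function (hypothetical, not Base.replace) with the same method layout.
g(s::AbstractString, ps::Pair...) = "vararg method"

# More specific than the vararg method for exactly two pairs, so dispatch
# selects it, and it then reports itself as missing.
g(s::AbstractString, a::Pair, b::Pair) = throw(MethodError(g, (s, a, b)))

g("abc", "a" => "x")                          # handled by the vararg method
g("abc", "a" => "x", "b" => "y", "c" => "z")  # handled by the vararg method
# g("abc", "a" => "x", "b" => "y") would throw a MethodError

# `which` shows the method that dispatch actually selects for two pairs.
which(g, Tuple{String, Pair{String,String}, Pair{String,String}})
```

Asking `which` for the selected method is probably the quickest way to spot this kind of situation, since the error message alone suggests that no method applies.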


I came across the exact same thing. A method explicitly throwing an error that it doesn’t exist just seems like a bad idea.


Well, on one hand, this is terribly confusing and frustrating. On the other hand, what you are doing is type piracy, so it’s probably better to make your very own function anyway.
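A minimal sketch of that alternative, assuming the sequential behaviour from the original post is what is wanted. The name `replace_each` is made up for illustration; because it is a new function, nothing in Base is extended.

```julia
# Hypothetical helper (not part of Base): applies each replacement in turn,
# so later pairs see the output of earlier ones.
function replace_each(str::AbstractString, old_new::Pair...)
    for o_n in old_new
        str = replace(str, o_n)  # single-pair replace from Base
    end
    return str
end

replace_each("abc", "a" => "x", "b" => "y")  # "xyc"
```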


As far as I know, the reason this does not exist is that replacements are order dependent. Consider the following:

julia> str = "ab"
"ab"

julia> replace(str, "a" => "b") |> x -> replace(x, "b" => "a")
"aa"

Since the first invocation returns "bb", the second invocation (rightly so) returns "aa".

Now consider your wish - a replace function that “can perform multiple replacements at once”:

replace(str, "a" => "b", "b" => "a")

What should the result be and is there a generic way to express this?

Should it be "ba", applying each rule only once, seemingly simultaneously? Or should it be "aa", applying the first rule until exhaustion, then the second rule until exhaustion, and so on? There is no generic way to decide this on behalf of the user: several different results are the “correct” one to someone, so it’s hard to prioritise one of them, and the order is better left to the user.

An even more catastrophic example is the following:

replace("aabbaa", "abba" => "bb", "aa" => "bb", "bb" => "aa")

Possible results, depending on the order of the Pairs and how many times each is applied, are:

  • "aa" (First pair applied until exhaustion, then second doesn’t match, third applied until exhaustion)
  • "aaaa" (All pairs applied once consecutively)
  • "bbaabb" (Shortest rules are applied first, rules are only applied once)
  • Infinite loop (Pairs are applied one after another in a loop until all pairs are exhausted, which never happens. This happens very easily…)
  • …and many more
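The “loop until exhausted” strategy from the list above could look roughly like the sketch below. `replace_until_exhausted` and the `maxrounds` guard are made-up names for illustration; without such a guard the rules from this example would never terminate.

```julia
# Hypothetical helper: applies all rules, one after another, and repeats
# until none of the patterns occurs in the string any more.
function replace_until_exhausted(str::AbstractString, rules::Pair...; maxrounds::Integer = 100)
    for _ in 1:maxrounds
        # Stop as soon as no rule can fire.
        any(r -> occursin(first(r), str), rules) || return str
        for r in rules
            str = replace(str, r)
        end
    end
    error("rules still match after $maxrounds rounds; they are probably cyclic")
end

replace_until_exhausted("abc", "a" => "x", "b" => "y")  # "xyc"
# replace_until_exhausted("aabbaa", "abba" => "bb", "aa" => "bb", "bb" => "aa")
# hits the guard: "aa" and "bb" keep rewriting each other.
```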

Or even this, with overlapping rules:

replace("aabb", "aabb" => "yes", "ab" => "no", "no" => "")

Which should fire first and is more important? How often is each rule applied?

This is much too implementation dependent to be offered as a generic function in Base, in my opinion.


Yes.

No, there is no such problem. If you want the second way, you can always get it with subsequent calls of replace in whatever order you want. The only possible new feature is the ability to do all replacements simultaneously. This has been brought up before; I already had a discussion with @tbeason about it.

The desired implementation is included in the OP:

It is clear that the OP wants this to work as multiple sequential calls.

It makes sense that this is not included in Base. The real issue here was the confusing error message: a method explicitly throws an error claiming that it itself does not exist, and the message then lists that same method as the closest match.

No, it is not. Implementation is not documentation of intention. The code is just the simplest possible that does what he wants. If no replacement introduces a token that a later call would recognize, both implementations behave the same (as observed by the return value). However, the code offered may just as well have a bug, if it was written without considering that earlier replacements may interfere with later ones.

…and if you want it your way, you can always write that loop and be explicit about it. To me, it’s not obvious which behaviour should be the “correct” and generalized one: one pass for each, exhaustion for all… Neither the behaviour nor the result is obvious, and I think that’s a very good reason not to try to be smart about it and introduce possible bugs. For example, what would happen with overlapping replacements? Throwing an error at runtime in such a case is bad at best, since you’d suddenly have to wrap every call to replace in try/catch to cover that one edge case, and the compiler can’t get rid of the try/catch because the type information itself doesn’t say whether or not an error will be thrown.

I’m well aware, I’ve read that thread :slight_smile: I disagree with your conclusion though, that’s all.


For what it’s worth, this variation manages to navigate between the existing methods.

function Base.replace(str::AbstractString, old_new1::Pair, old_new::Pair{String, String}...)
    str = replace(str, old_new1)
    for o_n in old_new
        str = replace(str, o_n)
    end
    return str
end

It’s quite a brittle solution though.

I agree - perhaps an ArgumentError explaining the rationale would be better in this case?
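As a rough sketch of that suggestion, a method that deliberately refuses several pairs could explain itself instead of pretending not to exist. `single_replace` is a hypothetical stand-in, not the actual code in Base, and the message wording is only an assumption.

```julia
# Hypothetical function standing in for a method that only supports one pair.
single_replace(s::AbstractString, p::Pair) = replace(s, p)

# Two or more pairs: fail with an explanation rather than a MethodError.
function single_replace(s::AbstractString, p::Pair, q::Pair, more::Pair...)
    throw(ArgumentError(
        "multiple replacements are not supported because the result depends " *
        "on the order in which they are applied; call single_replace once per pair"))
end

single_replace("abc", "a" => "x")  # "xbc"
```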

The generalized one is the simultaneous application, because both one pass each and exhaustion for all can be defined in 1~5 lines using repeated application of the simultaneous one. On the other hand, if simultaneous application is needed and only one of the others is implemented, then simultaneous application has to be implemented basically from scratch, because the other two are not useful for defining it.

I said nothing about throwing exceptions.

I have to agree that the behaviour may not be obvious to everyone, because otherwise I would be assuming bad faith from you. But I also do not think anyone should pass multiple replacements without consulting the documentation if they have some doubt about what could happen. I am almost sure that implementations of multiple replacements in other languages work simultaneously; I will need to check this another time.

No, precisely the case when two replacements would apply (e.g. replace("aabb", "aa" => "c", "aabb" => "d")) is ambiguous. There is no “simultaneous application” where both replacements are done at the same time. The order matters, and as such you have to choose. There is no quantum superposition of resulting states.

I was referring to one of your comments in the thread you linked.

I’m certainly not arguing in “bad faith” here, I’ve just come to a different conclusion based on personal preference. No solution is arguably better than any other here, and as such it should be left to the user to decide how they want this to behave on Strings in their particular case. (As an aside, my personal preference would be that people wrap strings and define replace however they want on their own type. This would also help enforce the invariants that are often expected of things that look like strings but often aren’t, such as file paths.)
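A minimal sketch of the wrapper idea, assuming sequential left-to-right semantics are the ones wanted. `SeqString` is a made-up type; since the type is owned by the user, adding a Base.replace method for it is not type piracy.

```julia
# Hypothetical wrapper type around a plain String.
struct SeqString
    value::String
end

# Semantics chosen for this type: pairs are applied left to right,
# each one seeing the result of the previous replacement.
Base.replace(s::SeqString, old_new::Pair...) =
    SeqString(foldl(replace, old_new; init = s.value))

replace(SeqString("ab"), "a" => "b", "b" => "a").value  # "aa"
```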

Please do! I’d be interested as well, especially in their reasoning for choosing one implementation over another or how users responded to their implementation. :slight_smile:


It is not ambiguous. To be ambiguous we need a finished design definition for which it is not clear what would happen in such case. What I said in the other thread (in the post you just linked), was that:

What I would expect is that the first substitution that matches is used, or an error is thrown. Seems intuitive to me.

This is not ambiguity, this is just an unfinished design. If the final design is to throw an error, it prevents the user from being surprised but can hinder actual use cases. If the final design is to use the first substitution that matches (the substitutions are provided in a Vector or Vararg, so they are ordered), then the user may be surprised, but we do not prevent them from getting “cbb”, which is unambiguous under the finished design decision.

Simultaneous was just an unfortunate choice of words that I expected not to have to explain, because you said you had read my old thread. What I meant by simultaneous application is exactly what I explained in the post you just linked: no replacement is applied to the whole string before any of the others. For each character that may start a replacement, all replacements are tested; if two or more match, one is chosen by its order in the list of replacements (or an error is thrown, if that is the finished design decision); and the next character considered is the first one that was not part of the last replacement. I find this to be the most intuitive behaviour because all replacements happen over text that is present in the original string, and no replacement occurs because of a text fragment created by a previous replacement (if that is desired, simply make multiple calls to replace).
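The scan described here can be sketched roughly as follows. `replace_simultaneous` is a made-up name, the first-match-wins tie-break is one of the two options mentioned, and rejecting empty patterns is an assumption of this sketch.

```julia
# Hypothetical helper: single left-to-right pass over the original string.
# At each position the rules are tried in order and the first match wins;
# replaced text is never scanned again.
function replace_simultaneous(str::AbstractString,
                              rules::Pair{<:AbstractString,<:AbstractString}...)
    any(r -> isempty(first(r)), rules) &&
        throw(ArgumentError("empty patterns are not supported in this sketch"))
    out = IOBuffer()
    i = firstindex(str)
    while i <= lastindex(str)
        rest = SubString(str, i)
        matched = false
        for (old, replacement) in rules
            if startswith(rest, old)
                print(out, replacement)
                i = nextind(str, i, length(old))  # skip the matched characters
                matched = true
                break
            end
        end
        if !matched
            print(out, str[i])  # no rule applies: copy one character
            i = nextind(str, i)
        end
    end
    return String(take!(out))
end

replace_simultaneous("ab", "a" => "b", "b" => "a")       # "ba"
replace_simultaneous("aabb", "aa" => "c", "aabb" => "d") # "cbb"
```

Sequential behaviour is still available by calling the helper once per pair, which matches the argument that the other strategies can be built from this one.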