Are _symmetric_ multimethods worth it?

Multimethods are awesome. I think anyone who's interested in Julia will agree with that. But I would like to ask an "out there" question: are symmetric multimethods - vs. non-symmetric ones - worth the trouble?

Reasons I’ve found in the literature to prefer symmetric multimethods:

  1. They’re more “natural”: why prefer one argument to another?
  2. Better error messages: since type checking is now solely dependent on relative specificity, we capture more errors rather than tie-breaking to the function with the most specific left-most argument.

(That’s all I could find. If anyone else can point me to other explanations, or indeed papers that discuss the issue of symmetric vs. non-symmetric MM, that would be kind.)

Now let’s list the negatives:

  1. Type checking non-symmetric multimethods is easy and efficient. Just compare specificities between methods argument-by-argument, left to right. Implementing symmetric multimethods is basically still a research topic - although one handled quite admirably by Julia.
  2. According to some articles - which I’ll admit to not fully grokking - it’s impossible to do separate compilation of modules using symmetric multimethods, but it’s apparently straightforward to do with non-symmetric multimethods.
  3. Arguably non-symmetric multimethods are more predictable for the programmer in the trenches, for exactly the same reason that the type-checking algorithm is easier to implement: you can basically run the type-checking algo in your head.
  4. It's manifestly the case that in the real world, you get several methods that would accept the particular arguments in question. In Julia, my understanding is that in certain ad-hoc cases, specific tie-break rules are inserted, simply to avoid annoying the crap out of users. If you're going to do this, why not just settle on a clear, easy-to-remember rule, regardless of how arbitrary it is?
  5. Moreover, a generation of programmers have been trained to view the left-most argument as special. Instead of telling them to ditch that thinking, why not expand on it and say, yes, that argument is kinda special but now we’re relaxing things so it’s not the only one that’s special?

What I’m hoping for is some slam dunk example where non-symmetric multimethods just yield a surprising and bad result. I honestly can’t think of any obvious ones, but my imagination is sadly limited and there aren’t a lot of (any, actually) statically-typed languages with non-symmetric multimethods, so it’s hard to find examples.

  • arithmetic operations? craziness there? I don't think so. I think non-symmetric MMs can cope with these just fine – after all, they're still using the extra arguments. By the way, if you've ever written 1.0 * x * ... to make sure that the expression is converted to floating point as soon as possible, you're kind of thinking like a non-symmetric MM person! (See the snippet after this list.)
  • in the case of “OO” programming it’s hard to see how non-symmetric MMs suffer. Actually they seem to me more natural. Maybe I’m missing something.
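
To make that arithmetic point concrete, here's a tiny illustration of the "promote early" habit (plain Julia, nothing multimethod-specific):

x = 3        # Int
y = 7        # Int
x * y        # 21   - stays Int all the way through
1.0 * x * y  # 21.0 - the Float64 on the left drives the promotion from the start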

I'll finish off by saying that it appears that a lot of very smart people gravitated quite decisively to symmetric MMs, despite the manifest difficulty for language implementers. Since they're likely a lot smarter than me, I'm probably missing something.

I invite you to set me straight :slight_smile:

[Oh, and sorry for asking a question that’s only perhaps tangentially related to Julia. I hope it doesn’t bother the community.]

3 Likes

I'm not sure what exactly you mean by a symmetric or asymmetric multimethod, but some googling suggests that it's a strategy whereby method ambiguities are resolved in some asymmetric way, usually left-to-right ordering. E.g., if I define

f(x::Int, y) = 1
f(x, y::Int) = 2

in Julia, this would result in an ambiguity:

julia> f(1, 1)
ERROR: MethodError: f(::Int64, ::Int64) is ambiguous. Candidates:
  f(x::Int64, y) in Main at REPL[1]:1
  f(x, y::Int64) in Main at REPL[1]:1
Possible fix, define
  f(::Int64, ::Int64)

but in a system with asymmetric multiple dispatch, it might, say, break ties by going left to right through the arguments. In this case, since f(x::Int, y) = 1 is more specific in the first argument, that is the method that gets called.
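
For concreteness, here's a rough sketch of what such a left-to-right tie-break could look like, operating on signature tuple types (purely hypothetical - this is not what Julia actually does):

# Hypothetical tie-break: scan the arguments left to right and prefer the
# method whose signature is strictly narrower at the first position where
# the two signatures differ. (Assumes both signatures have the same arity.)
function leftright_tiebreak(sig_a::Type{<:Tuple}, sig_b::Type{<:Tuple})
    for (a, b) in zip(fieldtypes(sig_a), fieldtypes(sig_b))
        a === b && continue
        a <: b && return :first    # sig_a is narrower here, so it wins
        b <: a && return :second
    end
    return :ambiguous              # no position decides it
end

leftright_tiebreak(Tuple{Int,Any}, Tuple{Any,Int})  # :first - argument 1 decides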

Is this what you meant?

I think this is an interesting approach to resolving method ambiguities and it makes some sense to me as a general strategy for approaching this. I could definitely imagine it can lead to some undesirable effects though even if I don’t have a concrete example in mind.

I think that currently, due to single inheritance, method ambiguities aren't a gigantic problem in Julia: they're annoying, but not showstoppers. However, many people would like a native trait-based dispatch system, and with that I'm sure we'd almost immediately end up in ambiguity hell without a way to resolve them.

3 Likes

Yes, @Mason, that's exactly what I mean. Non-symmetric multimethods are still fundamentally multimethods, in that dispatch depends on the types of all the arguments passed to a method. Where they differ is that non-symmetric MMs progressively constrain the set of permissible methods from left to right. OTOH, in symmetric MMs the order of the arguments is ignored, and only the relative specificities are considered. (This obviously results in more errors in the symmetric case - which is often proffered as an advantage - but I would argue it's a double-edged sword.)

With non-symmetric MMs, how would you be able to specialize matrix multiplication on the structure of the second argument? (I.e., mul(X::AbstractMatrix, Y::SparseMatrix).) I think you would have to write separate copies of the method for every type of AbstractMatrix.

So IIRC there was an excellent talk by none other than Jeff in which he illustrates a circular inheritance graph involving Float, and also points out that the Julia compiler does the tie-break thing I mentioned. I was hoping someone from the Julia team could either confirm this or tell me it isn't the case.

With asymmetric dispatch, the asymmetry is only for breaking ties. E.g. if there would be a MethodError due to ambiguity, then you look at the order of arguments.

It’s worth pointing out that we used to proactively warn on ambiguous cases. But in many cases you don’t need to worry about ambiguities unless they’re called.


In many cases method extensions are about optimizations… and in such cases there may be a better option as to which of the ambiguous methods should be called. It's not at all obvious which one will be the better choice, but there are times when I wish we just picked one instead of erring, since I often care more about the behavior than the performance. At the same time, I'd want to be able to manually specify which one is the better option, or implement a combined optimization.

As a simple example, I can implement (::AbstractMatrix) * (::AbstractVector) in a very generic way. I can further implement a SparseMatrix and a OneHotVector, and specialize some * methods for them:

(::AbstractMatrix) * (::AbstractVector) = # Fully generic
(::SparseMatrix)   * (::AbstractVector) = # Exploit sparsity of the matrix!
(A::AbstractMatrix) * (v::OneHotVector) = A[:, findfirst(v)] # Exploit one-hot-ness of the vector!

Now we have an ambiguity if we ever call (::SparseMatrix) * (::OneHotVector). Which should win? Well, in this case, it’s probably the OneHotVector method that will do best because its implementation will happily hit the getindex method that SparseMatrix has provided. But the situation could just as easily be reversed.
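
(In current Julia, the way out is to define the intersection method yourself and pick the winner explicitly; a one-line sketch, reusing the types from above:)

# Resolve the ambiguity by defining the most specific method explicitly,
# delegating to whichever behavior we decided should win:
(A::SparseMatrix) * (v::OneHotVector) = A[:, findfirst(v)]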


There is another case where it’s the behavior that matters — and it’s not a simple optimization. A similar construction can be made for (::Number) * (::Unitful) vs. (::Number) * (::Interval). What should happen if you ever multiply a Unitful value by an Interval? Do you get a Unitful{Interval}? Or an Interval{Unitful}? They’re subtly different and behave differently… and will likely need special handling.

10 Likes

Thanks for your informative answer. I agree with everything you say, but it’s not clear to me that your reply constitutes a substantive argument for symmetric MMs.

To answer your questions…

  1. In Julia this will cause an error on invocation, I presume.
  2. In “my language” you’ll call SparseMatrix’s mul.

I would have thought 2 is preferable – it’s not optimal but at least it compiles. Are you arguing otherwise? I want to point out two things in this context.

  1. If you really wanted to call OneHotVector's mul, you could use tag dispatching to artificially force that method to be more specific. (You'd need default arguments in any position to do this.) This is how C++ dispatches - statically, in overload resolution, not dynamically as Julia does - to, e.g., random-access iterators where appropriate, as opposed to, say, forward iterators. C++ puts the tag-dispatching argument in the final position because its overloading is symmetric, like Julia's. (See the sketch after this list.)
  2. The left-primacy in non-symmetric MMs would almost certainly affect the design of libraries. Given that the “most expensive” thing in linear algebra involves matrices and not vectors, non-symmetric MMs would push you towards column vectors; i.e. you want the matrices on the left.

(Ad 2, I just thought of that; very possible that I’m wrong or missing something.)
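
Here, roughly, is how that tag-dispatching idea from point 1 can be expressed in today's Julia via the "Holy traits" pattern; the type names are hypothetical stand-ins for the ones in the * example above:

# Hypothetical stand-ins for the types from the earlier example:
struct MySparseMatrix{T}; data::Matrix{T}; end
struct MyOneHotVector; ind::Int; len::Int; end

# An internal tag argument encodes the explicit choice of winner, so the
# public entry point never hits an ambiguity:
struct UseOneHot end
mulstyle(::MySparseMatrix, ::MyOneHotVector) = UseOneHot()

mymul(A, v) = mymul(mulstyle(A, v), A, v)
mymul(::UseOneHot, A::MySparseMatrix, v::MyOneHotVector) = A.data[:, v.ind]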

Ok, now for your second question. The answer is Unitful{Interval}, at least in “my language”. If you flipped the order in your question, Interval{Unitful}. Again, in Julia, the answer is: error.

I’m not seeing a show-stopper for non-symmetric multimethods here…

I'm not really strongly arguing either way - and indeed Julia could decide to give an ordering to (what are currently) ambiguities without really changing the language, precisely because those cases are errors right now. Erroring is the conservative behavior; an ordering would only change things in cases that are currently errors.

Note that some interfaces have explicitly been designed with this restriction in mind - and I think it drastically simplifies things. For example, getindex(A, I...) first dispatches on the indices (to allow for custom index specializations that return a typical index) and then dispatches on the array itself. This is a case that doesn't naively work in either system.
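
A simplified sketch of that two-stage pattern (the real mechanism goes through to_indices and an internal _getindex; the names below are made up):

struct EveryOther end                        # a hypothetical custom index type
normalize(A, i) = i                          # typical indices pass through
normalize(A, ::EveryOther) = 1:2:length(A)   # lower the custom index to a range

# Stage 1 dispatches on the indices; stage 2 dispatches on the array itself.
myindex(A, I...) = _myindex(A, map(i -> normalize(A, i), I)...)
_myindex(A::AbstractArray, I...) = A[I...]

myindex([10, 20, 30, 40], EveryOther())      # [10, 30]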

4 Likes

An associate of mine who programs in Julia and Python maintains his own multiple dispatch library for Python (I guess because you can't give it up).

He told me a year or so ago about how he was resolving ambiguities:
by having two classes of method.
I am going to call them regular and gold.

Dispatch followed the following rules:

  1. If one method is more specific than the others, the most specific wins (regardless of the class of either method).
  2. If they are equally specific, then gold class wins over regular class.
  3. If all are gold class: the language is free to choose any.
  4. If all are regular class: throw an ambiguity error.

Rule 4 gives regular methods our current behavior, so it's a non-change, and it means that when things are not clear, they stay not clear.

Rule 3 makes gold class methods achieve what you want when you favour behavior. All gold class methods must do the same thing, or the behavior of the program is undefined.
The weakness here is that you can't enforce this, because of the halting problem. Though you can lint for it with decent heuristics.
I guess for testing/debugging purposes you'd want a way to force a choice, plus a testing tool that forces each choice in turn and tests them all.
In a lot of cases the behavior doesn't need to be strictly identical anyway.
E.g. a NamedDimsArray{TrackedArray{Array}} and a TrackedArray{NamedDimsArray{Array}} are different types, but they behave identically: both just insert some independent bookkeeping steps into a bunch of operations before delegating to a method on the wrapped array and then rewrapping.

Anyway: Rule 3 means gold class methods are defined as all having universally correct behavior.

Rule 2 - that gold beats regular - follows from this. Regular methods are not making as strong a claim about how universally correct their behavior is, so favor the gold ones.

And Rule 1 says that none of this matters if the inputs are clearly better anticipated by the author of one of the methods than by the others.
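
A compact sketch of how those four rules might compose, assuming each candidate method carries its signature and a class flag (all names hypothetical; plain subtyping stands in for Julia's full specificity rules):

struct Candidate; sig::Type{<:Tuple}; class::Symbol; end  # class is :regular or :gold

function resolve(cands::Vector{Candidate})
    # Rule 1: a strictly most specific signature wins, regardless of class.
    for c in cands
        all(o -> o === c || c.sig <: o.sig, cands) && return c
    end
    golds = filter(c -> c.class == :gold, cands)
    # Rule 4: all regular and equally specific -> ambiguity error.
    isempty(golds) && error("ambiguous: no gold class method to fall back on")
    # Rules 2 and 3: gold beats regular; among golds, any choice is allowed.
    return first(golds)
end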

6 Likes

This is really interesting, but Rule 2 feels backwards to me. If you have a method that does something different from the canonical implementation, that difference in behavior is probably a matter of correctness.

Rule 2 says nothing about behavior.
None of the rules do, except Rule 3, which says that if behavior differs, the program is undefined.

In Rule 2, both methods are equally specific,
but one is regular class and one is gold class.
Probably, in the case of Rule 2, the regular and gold class methods do do the same thing.
But the author of the gold class method has made that extra promise of consistency and interchangeability (Rule 3) that the author of the regular class method didn't (Rule 4).
So they win, because they said that they know what to do when things are ambiguous, whereas the regular class author said they didn't.

Hitting Rule 2 should be rare in practice, because semantically, when defining a function, you should generally want either Rule 3 or Rule 4 to apply. So all methods should be gold or all should be regular.
The main advantage of Rule 2 is that it gives an upgrade path: an old package that doesn't use gold class methods can still interact with a new package where things are ambiguous, and still get an answer out.

Remember also that we can't dispatch based on whether methods actually have the same behavior or not,
because that can be reduced to the halting problem.
(Define ambiguous methods where one is the query, and the other is return 1.)

Aside:
If the compiler is allowed to just choose which method to apply in cases of ambiguity, then there are all kinds of fun options:

  • choose the oldest: this minimizes invalidations.
  • choose randomly: sensible for testing.
  • choose left/right: good for testing. (Needs a tie-break.)
  • choose whichever is most likely to inline according to the inlining heuristic: useful, since inlining code unlocks the optimizer.
  • a new heuristic that guesses which is fastest.

3 Likes

Good question! This is an interesting topic. I don’t have that much better an answer than “left-to-right dispatch seems arbitrary to me”. The strongest argument might be that it’s better to get an error than to silently call a slow fallback method. Method errors are actually really important: they’re the primary way we have to answer “what methods should I implement?”, so the more errors we give, the more “helpful suggestions” :slight_smile: you get.

I suspect you have to define separate compilation pretty narrowly for this to be true. My guess would be that this holds if you insist on statically finding all possible method errors. Whether there is a method error depends on which compilation units are loaded (since adding methods can introduce ambiguities), so you can’t analyze them separately. Fair enough, but personally I don’t really care. I will still separately compile whatever I want to, and nobody can stop me :joy:

Those rules can't generally be replaced with the left-to-right approach - for example, making Union{Int,String} more specific than Number.
You can also have ties within one argument, e.g. f(::Tuple{Int,Any}) vs. f(::Tuple{Any,Int}). Perhaps that's another reason left-to-right seems arbitrary: it makes the arguments themselves too special, when it would seem that the rule needs to be recursive somehow.
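
To see that within-argument tie concretely:

f(x::Tuple{Int,Any}) = 1
f(x::Tuple{Any,Int}) = 2
f((1, 1))  # MethodError: ambiguous - and with only one argument,
           # a left-to-right rule over the arguments has nothing to order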

Thanks :slight_smile: That is true, but the way we handle it now is to consider specificity after filtering out inapplicable methods. I believe that makes such cycles impossible. It's just that you can't consistently order methods only by their signatures. This issue can also exist within a single argument, so it isn't solved by left-to-right.

17 Likes

Is this open source? I see that there are a few multiple dispatch libraries for Python.

https://github.com/wesselb/plum

NB: I make no promise that what I described above is what that package does,
because I might have got it wrong, or it might have changed since he described it to me.

The Python version of Stheno uses it.

1 Like