In case of method ambiguity, choose method with tighter first argument

proposal
#1

I was wondering whether, in principle, one could mostly get rid of method ambiguities by adding a simple rule like “in case of ambiguity, the method with the most specific first argument wins (if the first arguments are equally specific, the second arguments are compared, and so on)”.

I’ve found the current behavior annoying when (for example) trying to overload reshape for StructArrays. I would like to simply forward the reshape operation to the various field arrays of the StructArray, but (after loading OffsetArrays) there are 9 methods for reshape(v::AbstractArray, args...), some of them with rather tricky signatures, and one would need to overload every signature to avoid ambiguities. If a tighter first argument meant that the method gets chosen, I could simply do:

Base.reshape(v::StructArray, args...) = ...
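To make the ambiguity concrete, here is a minimal sketch that does not depend on StructArrays or OffsetArrays. `Wrapped` and `myreshape` are hypothetical stand-ins for `StructArray` and `reshape`. The point is only that, under the current rules, a catch-all `args...` method with a tighter first argument is ambiguous with any existing method that is tighter in a later argument.

```julia
# Hypothetical stand-in for StructArray: a thin wrapper around an Array.
struct Wrapped{T,N} <: AbstractArray{T,N}
    parent::Array{T,N}
end
Base.size(w::Wrapped) = size(w.parent)
Base.getindex(w::Wrapped, i::Int...) = w.parent[i...]

# Hypothetical stand-in for one of the existing reshape methods.
myreshape(v::AbstractArray, dims::Int...) = reshape(v, dims)

# The "simple" overload: tighter first argument, looser remaining arguments.
myreshape(v::Wrapped, args...) = Wrapped(myreshape(v.parent, args...))

w = Wrapped([1, 2, 3, 4])

# Neither method is more specific than the other, so the call cannot be resolved.
err = try
    myreshape(w, 2, 2)
catch e
    e
end

println(err isa MethodError)
```

Under the proposed rule, the second method would win here because `Wrapped` is tighter than `AbstractArray` in the first slot.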

A side benefit (at least intuitively) would be that selecting the appropriate method to call becomes a much easier problem. I am not sure whether this could help with latency issues or is unrelated.

Is the concern that selecting between two ambiguous methods based on the first argument could actually surprise users, and that they would prefer a clear error message?

#2

In the case of reshape, the only method you need to overload for 1-based reshapes is

Base.reshape(v::StructArray, dims::Tuple{Vararg{Int64,N}}) where {N} = ...

If you also want to support offset axes, ideally you’d just also define

Base.reshape(A::StructArray, inds::Tuple{UnitRange,Vararg{UnitRange}})

and let OffsetArrays do the same sort of consolidating down to that canonical endpoint like the Base methods do. Actually, I think that’s probably something that Base could/should in fact do.
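The consolidation idea can be sketched as follows. This is not how Base or OffsetArrays are actually organized; `myreshape` and `Wrapped` are hypothetical names used only to show the pattern. Every convenience signature forwards to one canonical method, and a custom array type overloads only that one.

```julia
# Hypothetical custom array type.
struct Wrapped{T,N} <: AbstractArray{T,N}
    parent::Array{T,N}
end
Base.size(w::Wrapped) = size(w.parent)
Base.getindex(w::Wrapped, i::Int...) = w.parent[i...]

# Canonical endpoint: dims is always a tuple of Int.
myreshape(A::AbstractArray, dims::Dims) = reshape(A, dims)

# Convenience signatures only normalize their arguments and forward.
myreshape(A::AbstractArray, dims::Int...) = myreshape(A, dims)
myreshape(A::AbstractArray, dims::Tuple{Vararg{Integer}}) = myreshape(A, map(Int, dims))

# The custom type overloads the canonical endpoint and nothing else.
myreshape(w::Wrapped, dims::Dims) = Wrapped(reshape(w.parent, dims))

w = Wrapped([1, 2, 3, 4])
r1 = myreshape(w, 2, 2)              # Int... form, forwarded to the endpoint
r2 = myreshape(w, (Int32(2), 2))     # mixed Integer tuple, converted then forwarded

println(typeof(r1), " ", size(r1))
```

Because the `Wrapped` method is strictly more specific than the generic endpoint, none of the convenience calls is ambiguous.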

Defining your own array that can handle offset axes itself is indeed awkward; OffsetArrays was initially designed thinking it’d be the “only” offset array implementation, but that’s clearly not turned out to be the case. This is something we need to make better.

#3

Yes, handling tuples left-to-right is a possible approach to method specificity. In fact it has been used in some multiple dispatch languages, leading to competing designs of “symmetric” vs. “asymmetric” multiple dispatch. Many of us have long felt that favoring earlier arguments seems arbitrary. It’s hard to make the case that calling f(x::Int, y) over f(x, y::Int) is correct or satisfying in any way. I think the most that can be said is that for certain functions, some argument slots are obviously more important than others — e.g. the first argument of getindex or the second argument of higher-order functions like map(f, A). Making the rules function-specific is really too complicated though. Then there are other aspects of our type system (Unions, type parameters) that create other kinds of ambiguities not clearly resolvable by a left-to-right rule.
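As a small illustration of the symmetric case, the sketch below defines the two methods mentioned above and shows the usual way to resolve the conflict today, which is to define the intersection explicitly. The return strings are placeholders chosen for this example.

```julia
f(x::Int, y) = "Int first"
f(x, y::Int) = "Int second"

println(f(1, "a"))   # only the first method applies
println(f("a", 1))   # only the second method applies

# Both methods apply to (Int, Int) and neither is more specific.
err = try
    f(1, 2)
catch e
    e
end
println(err isa MethodError)

# Current resolution: state explicitly what should happen for the intersection.
f(x::Int, y::Int) = "both Int"
println(f(1, 2))
```

A left-to-right rule would silently pick the first method for `f(1, 2)` instead of asking for the third definition.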

It would be interesting to see how many ambiguities present in some set of packages would be resolved by an alternate specificity rule like this, and in turn how many of those newly-unambiguous methods would be deemed “reasonable” by a human.
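A rough sketch of such an experiment, on a toy module rather than real packages: `Test.detect_ambiguities` lists ambiguous method pairs, and a hypothetical helper `left_to_right_winner` applies the proposed rule to each pair. The helper relies on the internal `Base.unwrap_unionall` and deliberately ignores type parameters and Vararg signatures, so it is only a starting point.

```julia
using Test

module Toy
f(x::Int, y) = 1
f(x, y::Int) = 2
g(x::Int) = 3          # not involved in any ambiguity
end

# Hypothetical helper: the first argument slot where the signatures differ
# decides; returns nothing if that slot is not ordered by subtyping.
function left_to_right_winner(m1::Method, m2::Method)
    p1 = Base.unwrap_unionall(m1.sig).parameters
    p2 = Base.unwrap_unionall(m2.sig).parameters
    for i in 2:min(length(p1), length(p2))   # slot 1 is the function type
        a, b = p1[i], p2[i]
        a == b && continue
        a <: b && return m1
        b <: a && return m2
        return nothing
    end
    return nothing
end

ambs = Test.detect_ambiguities(Toy)
winners = [left_to_right_winner(m1, m2) for (m1, m2) in ambs]
resolved = count(!isnothing, winners)

println("ambiguous pairs: ", length(ambs), ", resolved left-to-right: ", resolved)
```

Judging whether each winner is “reasonable” would still have to be done by hand.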

#4

It isn’t, necessarily, but AFAICT that is not the point of this rule. It just resolves some ambiguities in a manner which can be considered somewhat arbitrary, but is well-defined, and easy to reason about.

That said, AFAIK most (all?) multiple dispatch languages that use the left-to-right rule

  1. have no dispatch on parametric types, and
  2. demand tighter congruence of method signatures than Julia does (e.g. rand),

so it is not trivial to tell a priori if there are any tricky corner cases.

#5

Thanks for the reply. I really appreciate the careful design in Base that ensures there is generally just one “public” signature to overload to get the correct behavior for a custom array type. Indeed, the “clumsiness” of supporting reshape with offset axes is the exception rather than the norm. In principle, I would also think it should be possible for Base to “forward” all signatures to a common method, so that the developer of a custom array package doesn’t need to worry about the various Base.IdentityUnitRange etc…

#6

My proposal is mainly a practical one. I agree that higher-order functions are an interesting case to consider: it is clearly more natural to overload on the iterator type than on the function, so giving priority to the first argument would not always be correct in practice. For example, overloading reduce for specific functions (like for vcat here) will cause an ambiguity with, let’s say, the reduce(f, t::IndexedTables; select) method from IndexedTables, and choosing the typeof(vcat) overload would not be a solution.

From a conceptual point of view I can maybe argue that there are cases where a method is just “forwarding things” and the easiest solution is to give priority to this method.

Let’s say for example that I create a custom array type that holds a color as well (the example is silly, but similar things can be used in practice, for example there was some discussion on having tables with metadata).

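using ColorTypes: RGB  # assumption: RGB is the type exported by ColorTypes.jl

# Bounding A lets ArrayWithColor(parent, color) infer T and N from the parent.
struct ArrayWithColor{T, N, A<:AbstractArray{T, N}} <: AbstractArray{T, N}
    parent::A
    color::RGB
end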

This will have a few overloads of the type:

import OtherPackage: f

# Unwrap, call f on the parent array, then re-wrap the result with the same color.
function f(v::ArrayWithColor, args...)
    parent, color = v.parent, v.color
    res = f(parent, args...)
    ArrayWithColor(res, color)
end

Even though I understand that making the method selection rule depend on the function is too complicated, would it maybe be possible to give “higher priority” to some methods? Let’s say that, when several methods are ambiguous, the one marked “high priority” gets selected (and if more than one is marked, it is again a method error).
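For comparison, here is a sketch of the workaround that is needed today without any priority mechanism: the forwarding logic is written once, and one explicit method is generated per intersecting signature. `Tagged`, `g`, and `forward_g` are hypothetical names, with `g` standing in for a function owned by another package and a `Symbol` tag standing in for the color.

```julia
# Hypothetical wrapper carrying metadata alongside the data.
struct Tagged{T,N} <: AbstractArray{T,N}
    parent::Array{T,N}
    tag::Symbol
end
Base.size(v::Tagged) = size(v.parent)
Base.getindex(v::Tagged, i::Int...) = v.parent[i...]

# Stand-ins for methods owned by another package.
g(v::AbstractArray, k::Int) = v .* k
g(v::AbstractArray, k::Float64) = v ./ k

# The forwarding logic, written once.
forward_g(v::Tagged, args...) = Tagged(g(v.parent, args...), v.tag)

# Catch-all forwarding method; on its own it is ambiguous with both methods above.
g(v::Tagged, args...) = forward_g(v, args...)

# Workaround: one explicit method per intersecting signature.
for K in (Int, Float64)
    @eval g(v::Tagged, k::$K) = forward_g(v, k)
end

res = g(Tagged([1, 2, 3], :red), 2)
println(res.parent, " ", res.tag)
```

The loop has to be kept in sync with whatever signatures the other package defines, which is exactly the maintenance burden a “high priority” marker on the catch-all method would remove.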