The meaning of “meaning”
A common theme that comes up when discussing type-piracy, method merging, etc. is that there “can only be one meaning” for a given function without namespace qualifications (I am being careful there to say function, not method). I just want to get some clarification from the developers on this, because clarity in communication would help for educating users, deciding what macros are reasonable, and what can be discouraged. (Of course, part of this is driven by the fact that most operators need to be in Base to be practical, and that a large and expanding Base and standard library ends up laying claim to a large number of functions.)
First, I would love to better understand what “meaning” means. What I believe you have in mind is that there is an expectation that the given “function” has a set of methods that operate on a set of types, which themselves are expected to have other methods. This is informally defined and unenforced, though it may be documented (e.g. https://docs.julialang.org/en/stable/manual/interfaces/). If so, then I would say that this sense of “meaning” is roughly equivalent to concepts in C++ (unenforced before C++20), typeclasses in Haskell, etc.
Now, in this way of thinking (which is that there is some collection of abstract types and methods, informally enforced) “meaning” is related to both the function and the hierarchy of types which dispatch to methods within the function. For example, looking at the size function in Base:
help?> size
search: size sizeof sizehint! Csize_t resize! filesize Cssize_t displaysize @nospecialize
size(A::AbstractArray, [dim...])
Return a tuple containing the dimensions of A. Optionally you can specify the dimension(s) you want the length of,
and get the length of that dimension, or a tuple of the lengths of dimensions you asked for.
This tells me that in Base the only “meaning” of the function size is something defined on AbstractArray <: Any that has a particular interface associated with it. Of course, if I create MyType <: AbstractArray then it would be a terrible idea for me not to have it follow this interface (unlike concepts in C++, this is not enforced, but that is totally fine).
What exactly is “Type Piracy”?
Now, what would happen if I wanted to define a size method on a completely unrelated set of types? In particular:
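# A type hierarchy with no relation to AbstractArray
abstract type NothingLikeAbstractArray end
struct ConcreteNotAnAbstractArray <: NothingLikeAbstractArray end  # `type` in Julia 0.6 and earlier
# Add a method to Base.size for the unrelated type
Base.size(x::ConcreteNotAnAbstractArray) = 1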
But note that (NothingLikeAbstractArray <: AbstractArray) == false. There is no overlap of types whatsoever with the existing “meaning” of size. It is unambiguous, and there is no way whatsoever that the different “meaning” (i.e., the AbstractArray methods) would ever be involved in dispatch. It can happily live in parallel.
This sort of thing has been called “type-piracy” or perhaps “mild type-piracy” and discouraged in the interest of having only a single “meaning” for size, but I don’t see what is wrong with it?
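The claim that the two sets of methods live in parallel can be checked directly. The following is a minimal sketch that reuses the hypothetical types from above and assumes Julia 1.x syntax (struct rather than type); it only illustrates the dispatch behaviour and is not a statement about what is considered good practice.

```julia
# Hypothetical types, mirroring the example in the post.
abstract type NothingLikeAbstractArray end
struct ConcreteNotAnAbstractArray <: NothingLikeAbstractArray end

# A new method on Base.size for a type outside the AbstractArray hierarchy.
Base.size(x::ConcreteNotAnAbstractArray) = 1

# The two hierarchies do not overlap.
disjoint = !(NothingLikeAbstractArray <: AbstractArray)

# Calls on arrays still reach the array methods; calls on the new type reach the new method.
array_size  = size(zeros(2, 3))
custom_size = size(ConcreteNotAnAbstractArray())

# `which` reports the method that each argument type dispatches to.
array_method  = which(size, Tuple{Matrix{Float64}})
custom_method = which(size, Tuple{ConcreteNotAnAbstractArray})
```

Inspecting array_method and custom_method shows two distinct methods of the same function, selected purely by argument type.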
Another example: in DifferentialEquations.jl there is a solve function with a particular “meaning”. The typical signature is along the lines of solve(prob::DiffEqBase.AbstractODEProblem{uType,tType,isinplace},...).
But what is wrong with me defining:
struct MyModelNotRelatedToODE <: Any end
solve(mod::MyModelNotRelatedToODE, ...)
Again, this can happily live in parallel with the meaning of solve in DifferentialEquations.jl, even without namespace qualifications, since (MyModelNotRelatedToODE <: AbstractODEProblem) == false, etc. If namespaces are keeping these “meanings” separate, then they are getting in the way of simple code in this case.
My Definition of Type Piracy:
Take as given a method f(x::A), then if someone else were to define:
- f(x::B) for B <: A = Pirate!
- f(x::C) for (C >: A) && !(C <: A) = Not a Pirate!
So, given size(x::AbstractArray) and AbstractArray <: Any, then defining
- size(x::DenseVector) = Pirate!
- size(x::MyType) for MyType <: Any as long as MyType >: AbstractArray == false = Not a Pirate!
- size(x::Any) = Not a Pirate! (this is because it is not possible to influence the dispatch for the AbstractArray types)
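The cases in the list above can be sketched with a toy function. Everything here is hypothetical (the module Lib, the function describe, and the type Unrelated are made up for illustration); the sketch only shows which new methods change the result of a call that already worked on an existing type.

```julia
module Lib                      # hypothetical package that owns `describe`
describe(x::AbstractArray) = "array"
end

struct Unrelated end            # hypothetical type outside the AbstractArray hierarchy

before = Lib.describe([1, 2])

# "Not a Pirate!": Unrelated is disjoint from AbstractArray.
Lib.describe(x::Unrelated) = "unrelated"
after_disjoint = Lib.describe([1, 2])

# "Not a Pirate!": Any is a strict supertype, so the AbstractArray method stays more specific.
Lib.describe(x::Any) = "anything"
after_supertype = Lib.describe([1, 2])

# "Pirate!": Vector <: AbstractArray, so this method takes over calls that used to work.
Lib.describe(x::Vector) = "hijacked"
after_pirate = Lib.describe([1, 2])
```

Only the last definition changes what Lib.describe([1, 2]) returns; the first two leave dispatch for existing array types untouched.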
Should Multiple Meanings Be Discouraged?
Why is this important? Because there are a lot of function names which are useful in different contexts, and it would be perfectly fine to have using for them all. Furthermore, I would guess that the vast majority of cases being discouraged as “type piracy” are “Not a Pirate!” according to my definition, where there are no issues at all with having parallel meanings. In my mind, the idea of “type-piracy” is very real, but it is only meaningful if you write methods which change where existing types get dispatched.
In particular, people coming from single-dispatch languages would almost never be doing that, in part because they don’t even know about generic programming. In virtually all those cases, they would write struct MyType <: Any and then dispatch f(x::MyType). Unless f(x::Any) was already created in a package, which would be a lousy package design, this is not type-piracy according to my definition.
Why Not Manually Merge by Declaring Methods in Existing Namespaces?
Let’s say that we no longer discourage the “Not a Pirate!” scenarios above; then why not just have everyone manually merge them? The short answer is the fragility of the ordering of method definitions.
That isn’t to say that automatic method merging is necessarily the best approach, but it suggests there is no reason that multiple reasonable, different “meanings” on disjoint types cannot live in the same namespace. Having a clean way to manually merge, and having it clarified in the core documents, would make this much cleaner. Now you may say “just use import instead of using” but that isn’t possible with operators, it isn’t possible with functions already defined in Base, and (in the examples above) it is inconvenient without introducing any sort of “safety”.
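For reference, a manual merge between two packages might look like the sketch below. The modules ODELike and MyModels and their types are hypothetical stand-ins, not the actual DifferentialEquations.jl API; the point is only that the second module has to import the first module's function in order to add a method to it.

```julia
module ODELike                  # hypothetical stand-in for the package that owns `solve`
export solve
struct ODEProblem end
solve(prob::ODEProblem) = "ODE solution"
end

module MyModels                 # hypothetical, otherwise unrelated package
import ..ODELike: solve         # manual merge: extend the existing function
export solve, MyModel
struct MyModel end
solve(m::MyModel) = "model solution"
end

using .ODELike, .MyModels

# Both exports refer to one function, so the unqualified name is unambiguous.
ode_result    = solve(ODELike.ODEProblem())
model_result  = solve(MyModel())
same_function = ODELike.solve === MyModels.solve
```

The cost of this approach is that MyModels now has to depend on ODELike even though their types have nothing to do with each other.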