Function name conflict: ADL / function merging?

Thank you for allowing a judgment call. I did not see a problem when I wrote the code because I saw nothing in the manual which prevents or warns against such use of multiple dispatch. Indeed, GAP uses multiple dispatch differently (I would say, more broadly) than Julia. You can sometimes find discussions in the GAP forum which warn against too many semantic uses of the same function name or operator. But, in general, the taste of the GAP designers allows for a few (no more than 3 or 4) unrelated uses of the same name. As a parallel I already made, 3 or 4 is usually the maximal number of unrelated meanings of an English word.

Sorry about conjugation, it was a misprint. p^q means inv(q)*p*q, which agrees with the usual mathematical notation and is contravariant with respect to q, as expected. The other conjugation q*p*inv(q) is written in mathematics as {}^q p.
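As a minimal sketch of this convention (the Perm struct below and its field d are a hypothetical stand-in, not the actual type from my package), assuming permutations act on the right so that n^p is the image of n and p*q applies p first:

```julia
# Hypothetical toy type: d[n] is the image of n under the permutation.
struct Perm
    d::Vector{Int}
end

Base.:(==)(p::Perm, q::Perm) = p.d == q.d

# n^p applies p to the point n.
Base.:^(n::Integer, p::Perm) = p.d[n]

# p*q applies p first, then q (right action), so n^(p*q) == (n^p)^q.
Base.:*(p::Perm, q::Perm) = Perm([q.d[i] for i in p.d])

function Base.inv(p::Perm)
    d = similar(p.d)
    for (n, image) in enumerate(p.d)
        d[image] = n
    end
    return Perm(d)
end

# Conjugation, written p^q as in GAP and in group theory textbooks.
Base.:^(p::Perm, q::Perm) = inv(q) * p * q

p = Perm([2, 3, 1])
q = Perm([2, 1, 3])
```

With these definitions, (n^q)^(p^q) == (n^p)^q holds for every point n, which is the compatibility that makes the exponent notation convenient.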

And for the tone, I did not mean disrespect but what I thought was wry humour.

Please tell me how a definition

function ^(p::Perm, q::Perm)

could break another package.

Other code could use the ^ function, expecting it to have the normal meaning of iterated multiplication and giving a wrong answer when that’s not what it does.


How could they get a wrong answer if they use an Int as the second argument? And if they do not use a number as the second argument, why would they expect iterated multiplication?

If the Perm type is not shared then a definition like this is unlikely to be a problem. It’s still possible for generic code to not care about what either type is as long as the types implement some sensible notion of exponentiation. This code could break in confusing ways or give nonsense answers instead of breaking in the relatively clear manner of giving a no method error for ^. There is also a chance that some other code does something based on whether a ^ method exists for a type or not, which could cause a problem, but it’s fairly unlikely.
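To illustrate that last, unlikely case, here is a sketch with hypothetical types (Plain and Conj are made up for this example). Code that inspects the method table with hasmethod cannot tell whether a ^ method means a power or something else:

```julia
struct Plain end          # hypothetical type with no ^ method
struct Conj end           # hypothetical type whose ^ is not a power

# Stand-in for a non-power meaning such as conjugation.
Base.:^(a::Conj, b::Conj) = a

# Generic code that branches on whether a ^ method exists for a type.
supports_pow(::Type{T}) where {T} = hasmethod(^, Tuple{T, T})
```

Here supports_pow reports the same answer for Int and for Conj, even though only one of them implements exponentiation in the usual sense.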

The bigger concern is that one might realize that there is actually a more consistent meaning of Perm^Perm that agrees with Perm*Perm. If you’ve already used the ^ function to mean something different, then you would be unable to use the natural notation for this. This happened in Julia with several behaviors that we previously shared with vectorized languages like Python, R and Matlab. For example, the fact that exp(A) used to mean vectorized exp over an array A prevented it from having its more natural meaning of matrix exponential for square matrices.
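A small sketch of the two meanings as they exist today, using a nilpotent matrix chosen so the matrix exponential is easy to check by hand (the variable names are only for illustration):

```julia
using LinearAlgebra

A = [0.0 1.0; 0.0 0.0]   # nilpotent: A^2 == 0, so the series gives exp(A) == I + A

matrix_exp = exp(A)      # matrix exponential
elementwise = exp.(A)    # explicit broadcast: exp applied to each entry
```

The two results differ: the first is I + A, while the second has exp(0) == 1 in every entry that was zero.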


One more note about ^ for permutations.
You will find the notation p^q for conjugation inv(q)*p*q in any textbook on group theory.
The notation n^p to apply p to n is a bit less frequent but it also appears in textbooks. In textbooks, you will also find p_n and p(n) for that, but never p[n] because that is thinking about the implementation and mathematicians don’t do that :)

So, the GAP designers are just following usual mathematics, which may clash slightly with your viewpoint. Which is why I said:

despite being designed by applied physicists, Julia is well designed and with a few changes could be used by theoretical physicists and even mathematicians. Of course some changes are needed: in Julia floating point is a first-class citizen and Rational numbers are second-class, while in mathematics it is the opposite, etc…

Yes, the invention of the broadcast operator . is a bright feature of Julia, which is why I will never complain if I have to write A .+ B instead of A+B. But I would complain bitterly if I had to write Broadcastpackage.broadcast(A,B) instead, if broadcast were not in Base but defined in Broadcastpackage. And similarly I complain if I have to write p Perms.^ q instead of p^q.

This thread is full of ways that you can have the exact notation you want. If there could only be a single function named ^ as you’ve proposed then this would not be possible and you truly would be stuck with whatever behavior Base defines for that syntax.

Put another way, the way Julia works gives you the freedom to have ^ do anything you want it to mean, including the GAP style behavior, without affecting anyone else’s code. The way you’ve proposed it should work, as I understand it, insists that ^ have only one meaning—and that it be your way, not anyone else’s.
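One of those ways, as a sketch (the Perms and Client modules here are hypothetical, not the actual package): a module can own a separate function named ^ that forwards everything it does not handle to Base, and callers opt in explicitly.

```julia
module Perms
export Perm

struct Perm
    d::Vector{Int}       # d[n] is the image of n
end

# A new function named ^ owned by this module, distinct from Base.:^.
# Anything it does not handle is forwarded to Base.
^(args...) = Base.:^(args...)
^(n::Integer, p::Perm) = p.d[n]
end

module Client
using ..Perms
import ..Perms: ^        # opt in to the Perms meaning of ^

image(n, p) = n^p        # calls Perms.^
power(x, k) = x^k        # also Perms.^, which forwards to Base.:^
end
```

Code that never imports Perms.^ keeps seeing only Base's ^, so no method is added to the function everyone else uses.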

I think I finally understand your perspective on this. Namespaces are a way for Julia to try to enforce some sense of “concepts” (and maybe typeclasses in Haskell, though I never quite understood that stuff) by voluntarily putting the operations on your types in that namespace. Nothing is enforced or formally specified, which is perfectly fine considering the disasters in other languages (e.g. C++'s first attempt at concepts with enforcement).

To you, there can only be one concept which is active at any point in time without namespace qualifications (e.g. if Base uses size then everyone not following the generic conventions on size should require namespace qualifications). You have gone through painstaking efforts to make sure that everything in Base truly conforms to all of the expectations of this. If everyone using a function or operator is disciplined in ensuring that they keep things following the (often unspecified) generic interface then any type with those operations defined can work with any generic code. A noble goal, and I don’t mean that passive-aggressively in any way.

Sounds great… but in practice here is what I think are the issues:

  • It requires a huge amount of discipline and coordination for everyone involved, much more than having some sort of argument-dependent lookup or single-dispatch style scoping.
  • What about a perfectly valid set of generic interface requirements in parallel for a function called length, which nobody would ever intend to put through all of the generic machinery in Base? In particular, what if you don’t intend to use your type for anything generic at all (i.e. the vast majority of code outside of fancy libraries)? Your choices:
    1. You could have it in a namespace on its own, but then you need to qualify the length call with the namespace each time… which seems redundant to the user because for MyType there is only one function that length could possibly mean. A similar argument could be made for multiple dispatch, so this isn’t entirely a “one type per function” story.
    2. You could come up with a variation on the length name which doesn’t clash with Base, so that you can keep the function convenient to use.
    3. You could just implement the function as Base.length for your type, cramming it into the “global” namespace, and not worry about it because you are just working on a small bit of code (and not some fancy generic library).
  • I am willing to bet that a general user, and even many library developers, will choose one of the last two options, and both circumvent the whole point of namespaces. Convenience is important, and Julia is intended to be a convenient high-level language.
  • Even if you think that self-disciplined strictness is more important than convenience (which is a defensible position) it doesn’t mean users will be so disciplined.
    • Once they learn the “just put it in Base” workaround to make functions convenient, they will start to use it. And this abuse backfires, because the whole point of namespaces was intended to prevent this…
  • It is undemocratic.
    • I finally understand why that word was used before in this discussion. I am not using it in any sort of loaded way, and mean the language design rather than the Julia core developers
    • It means that whatever function name is in Base takes precedence. So if the standard library decides to add a new function called foobar then it breaks anyone else’s code in the future that may have called a function foobar.
    • What is the easiest way around it? To cram your function into Base (hence the workarounds proposed here).
    • You can declare reusing the function names in the global namespace verboten all you want, but people writing code won’t listen to it… Especially those used to simple single-dispatch languages.
  • If two libraries want to write the same generic operators on the same types, they are screwed because they both cram the functionality into Base. Why aren’t two versions of * on the same type valid, especially for abstract and generic libraries?
    • Isn’t this what namespaces were intended to eliminate? The point of namespaces is to choose which functions I want to associate with which types.
    • If the more traditional lookup of functions is implemented, this isn’t a problem. If you try to use both libraries at the same time, then it is ambiguous and it errors… at that point you need to make a choice.
    • Keep in mind that C++ partially has ADL to prevent people from shoving things into the std:: namespace, so this is a problem many languages have dealt with before.
  • If there is every incentive to add functions to the global namespace (i.e. Base) for convenience, then it requires a high degree of discipline for people writing functions not to do so. This sort of extreme discipline is not required if functions look into multiple namespaces for their types.
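The three choices listed above can be sketched as follows for a hypothetical Rope type (the module, type, and function names are all made up for illustration):

```julia
module Ropes
export Rope, ropelength

struct Rope
    segments::Vector{Float64}
end

# Option 1: a separate function named length, owned by this module.
# It is not exported, so callers must write Ropes.length(r).
length(r::Rope) = sum(r.segments)

# Option 2: a differently named function that cannot clash with Base.
ropelength(r::Rope) = sum(r.segments)
end

using .Ropes

# Option 3: add a method to Base.length for the type.
Base.length(r::Rope) = sum(r.segments)

r = Rope([1.0, 2.5])
```

Only option 3 lets the user write plain length(r) everywhere, which is the convenience argument made above.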

I think the fact that there is every imaginable incentive for people writing code to put their functions in the global (i.e. Base) namespace (even if you tell them not to) is symptomatic of the underlying problem. Namespaces are insufficiently decentralized compared to other languages.


I agree with your post but I think that the problem has nothing to do with the distinction between single dispatch and multiple dispatch. It has more to do with the idea that the lookup has a semantic basis which is more than just dispatching on the types of the arguments.


You are right. I was more suggesting how surprising it is for those coming from single dispatch, but it isn’t crucial.

I thought the possibility of multiple separate * functions was exactly what you objected to. Julia’s approach specifically allows you to have two packages with different definitions of * on the same types. In contrast, in Python it is impossible to have, for example, different definitions of * for some type, say, integers.

module A
export *
*(args...) = Base.:*(args...)
*(a::Int, b::Int) = 42
end

module B
export *
*(args...) = Base.:*(args...)
*(a::Int, b::Int) = 43
end

You can now import .A: * or import .B: *.

That is exactly what happens; you get the uses of it must be qualified error.
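A sketch of that behavior with two hypothetical libraries (LibA, LibB, and describe are made-up names). On the Julia versions I am aware of, the unqualified call surfaces as an UndefVarError accompanied by a warning about the conflicting exports; the exact message varies between versions:

```julia
module LibA
export describe
describe(x) = "LibA"
end

module LibB
export describe
describe(x) = "LibB"
end

module User
using ..LibA, ..LibB

# Unqualified use: Julia will not choose between the two exported names.
function unqualified()
    try
        describe(1)
    catch err
        err
    end
end

# Qualified use: the caller makes the choice explicitly.
qualified() = (LibA.describe(1), LibB.describe(1))
end
```

Importing one of the names explicitly, for example import ..LibA: describe, is the other way to make the choice.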


Perhaps I should point out that nobody actually does the “shove it into Base” thing that you’re claiming is irresistibly attractive. It has only come up in this discussion because of people insisting that they want the ability to do this—which they have. Your option 3, which I would describe as “mild type piracy” is fairly common and also fairly benign. It’s hard to imagine what a function called length would return besides a number which describes the length of something, so maybe it’s not the best example.


@jlperla probably is referring to structs in one module or the other, although the statement is not explicit. But, conflicts will still be common in any case due to ambiguity in the call signatures.

For instance, if I want to write a method for reduce where the reduction operator is merge, I can do, perhaps

function reduce(::typeof(merge), a::AbstractArray)

The method expects Dicts in a, but the container or iterator may be of type Vector{Any}. On the other hand, SharedArrays has a method with the signature

function reduce(f, a::SharedVector) ...

If I now call reduce(merge, shared_vector), there is an ambiguity. This is a “real world” case. Until recently, all of these names were exported from Base. The more functions and methods and types you have in one namespace, the more complex solving these ambiguities becomes. To mitigate this problem, best practice in general is to qualify identifiers and/or import only necessary identifiers.
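The same ambiguity can be reproduced in a self-contained sketch. SharedVec and myreduce are hypothetical stand-ins for SharedVector and reduce, so that nothing in Base is touched:

```julia
# Hypothetical stand-in for SharedVector.
struct SharedVec{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::SharedVec) = size(v.data)
Base.getindex(v::SharedVec, i::Int) = v.data[i]

# myreduce is a made-up function with the two signatures discussed above.
myreduce(::typeof(merge), a::AbstractArray) = merge(a...)
myreduce(f, a::SharedVec) = foldl(f, a.data)

v = SharedVec([Dict(:a => 1), Dict(:b => 2)])

# Neither method is more specific than the other for this call.
err = try
    myreduce(merge, v)
catch e
    e
end

# One way to resolve it: add a method more specific than both.
myreduce(::typeof(merge), a::SharedVec) = merge(a.data...)
resolved = myreduce(merge, v)
```

The first call fails with an ambiguity MethodError; after the third method is added, the same call succeeds.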

I would characterize my option 3 as “shove it in Base”. I guess we will see how irresistible it is after you get more intro level programmers. I agree that Option 1 hasn’t been used (in part because small groups of developers are able to coordinate on creating their own Base namespaces). The fact that it was proposed is part of the “symptoms” of why things don’t seem right.

The basic issue is that the Base namespace is serving two purposes, that have nothing to do with each other because of the connection between namespaces and concepts: (1) Base contains an implicit set of generic interfaces that you have in mind for types (e.g. the size and * and other interfaces you have taken care to make sure are fully consistent in all of your types); (2) a way to manually manage the global namespace of functions from packages. You have privileged the standard library interfaces in the global namespace in a particular way (i.e., the lack of “democracy”) because they are not decoupled. Put another way: If namespace qualifications are so great for everything, then why don’t we use Base qualifications for everything?

The first makes tonnes of sense, and the second is only necessary because you cannot dispatch to the namespaces based on the existing types (i.e. some sort of ADL with the ambiguity checks you already have). So it is up to the writers and users of packages to manually manage the global namespace scope by adding functions into the Base namespace, forwarding the functions within their own namespace into Base, etc. These may (or may not) have anything to do with the particular interface that Base has used.

Anyways, if you have reached the point where you understand why people like me, @Jean_Michel, and others are uncomfortable with this from a practical and theoretical basis, then we can stop. I only responded again to ensure that there isn’t the impression that the namespace woes are the same as in other languages. They are not, and they are especially not the same as in the typical first languages of users (i.e. Java, Python, C++, OO with MATLAB, etc.) because none of these conflate the interfaces of the standard library with the management of the global namespace.

Hopefully we have made a dent in your thinking, but I am mostly interested in pragmatism to ensure my code looks clean and as close to whiteboard math as possible. Until/if the majority of tutorial code out there has full namespace qualifications for everything, my advice to confused people is as follows:

  • Try to copy/paste the sample code directly, including its using statement.
  • Hopefully you won’t get conflicts with 2 different libraries. If you do, then you might need to import parts of them, or talk to the library writers. With a curated list of libraries this is less likely to be an issue in the next few years.
  • If you try to create a function name and it won’t let you, then call it Base.myfunction and it will probably work.
  • If you upgrade Julia and suddenly one of your functions doesn’t work, then they might have added that function name to Base. If you change the function name to Base.myfunction then it will probably work.

Not at all. I love having different operations and namespaces, and want to be forced to choose when things are ambiguous… But what does creating * have to do with Base? This is just manually messing around with it as a global namespace… And I am disadvantaged compared to the standard library components, which get precedence in the global namespace.


I’d like to understand this better — is it just that using Base is in every module by default?

I don’t really get this. See my example above that defines *. It let me do that even though Base also defines *. What is it that’s not allowed?


I think I see what you mean here but it’s kind of an uncharitable interpretation of our intent. It’s as if we’re saying “in order to put something in the global namespace, it must follow the semantics we’ve specified”. But that’s not our intent at all. Rather it seems to fall out of this process:

  1. Since using Base is present by default, the easiest way to give somebody a method is to extend a Base function.
  2. If you extend a Base function, your method is supposed to have the “same meaning” in some sense.
  3. Therefore to export functionality it needs to follow the specs in Base.

Is that right?

One thing I can imagine doing is giving precedence to explicit using X statements over the implicit using Base statement — so if X exports a name also used in Base, you will get that instead of an error that it needs to be qualified.

Not exactly, but the using Base in every module as a default is the key symptom of the underlying problem!
Without the using Base, Julia is completely unusable. This is not the case with other languages. In fact, in C++ people would rarely write using namespace std;

What this shows is that in order for the standard library to be usable, it needs to be given precedence in the argument lookup. But that isn’t necessary. The reason you do using Base by default isn’t that Base is somehow special, it is that it is the only way to put it in the global namespace for lookup. And if anyone else wants to have their operator * usable, they have to add it to Base!

Now, up until now you have been able to keep all of the generic interfaces for the entire standard library consistent, so that size has the same generic requirements no matter what type you have in the standard library. Same with *… but what if down the road you find that this is unsustainable, and that there are two related (but not identical) sets of interfaces for * and/or size? Normally, since you have a correspondence between namespaces and the generic interfaces, I would say that you should just split the two * and size into separate namespaces… Base.Concept1.size and Base.Concept2.size, etc. But you can’t! Because the only way to have things work in practice is to keep things in the global namespace.

Hopefully that helps. If there are any C++ or Java or D experts, I suspect they could do a much better job of explaining this than I could.