Function name conflict: ADL / function merging?

This is both very clear and a little confusing to me. It seems to me that the problem we were discussing is
when the user expects two functions with the same name, coming from different packages, to be equated,
while the default decision of the language is to not equate them. So, if I understand correctly, this comes from
the idea in the current language design that the informal meanings are always different? And I misunderstand/do not like the design because I have the opposite idea: that the informal meanings are generally compatible?

1 Like

Yes, that might be it. If you and I both make functions f, by “default” they are different functions. The language assumes the name is incidental; just the variable name you happened to pick to refer to the function. Of course for most variables that’s the right thing; I can write x = 0 and you can write x = 2 in a different program and clearly the names are just a coincidence. But people have a different intuition about top-level functions.
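
As a minimal sketch of that default (the module names here are made up):

module Yours
f() = "yours"
end

module Mine
f() = "mine"   # same spelling, but no relation to Yours.f
end

Yours.f === Mine.f   # false: two distinct function objects that merely share a name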

1 Like

The language and its use interact. The best case scenario is having a set of semantic building blocks that map in a simple and transparent manner to the underlying language. “Meaning” is such a building block.

This is a somewhat specious argument in this topic: if you accept this line of reasoning, then surely there can be no problem with symbol clashes, since the definition of namespaces and related constructs is formalized and everyone can just use it as is.

OK, thank you very much. I have the feeling that we finally reached complete understanding in this thread.
I am thus confident you will find the best solution for us all.
Thanks again for your beautiful language!

1 Like

It is not because something is formalized that it cannot be inconvenient.

2 Likes

This is my favorite sentence from the entire thread.

6 Likes

No. You do not want to give users Option (3). This is a complete misunderstanding of why method merging can fail in fantastic ways. The problem with

deal with this type of type piracy in the rare moments they use it from undisciplined libraries

is that nobody really will know when this happens! This kind of guilting assumes that someone did something wrong! But the issue with assuming that everyone’s meaning is the same is exactly that two perfectly fine programs don’t mesh when they are put into the same room. Auto-merging is a programming structure where two good programs, built with good programming practices by smart and meticulous authors, are put together to give you something that doesn’t necessarily work like either programmer intended. That’s not a good structure to have!

The example I gave:

This has nothing to do with anybody making a mistake. Both programs can be correct, but without any warning Package B will modify how Package A works even if these two packages never heard of each other. This kind of non-locality is why type-piracy isn’t allowed. To make it more explicit, we do something like:

f(x::MyNumber) = x.value + 1  # Package A's method for its own type
f(x) = x                      # Package A's generic fallback
g(x, y) = f(x) + f(y)         # Package A's generic code

while Package B adds:

f(x::Number) = x^2 # This is the square function

The problem is that we assumed that everyone would use f to mean the same thing, so merging Package B’s f into Package A’s will change the result of g(MyNumber(1.0), 2). Again, the two don’t need to know that the other exists. Package A built perfectly fine generic code, and Package B built their own, but they compose into something that doesn’t keep the same meaning.
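
To make the breakage concrete, here is a hedged sketch you can run in a single session; MyNumber is a stand-in struct, and the single namespace simulates the proposed auto-merging of the two packages’ f:

struct MyNumber
    value::Float64
end

f(x::MyNumber) = x.value + 1  # Package A's specific method
f(x) = x                      # Package A's generic fallback
g(x, y) = f(x) + f(y)         # Package A's generic code

g(MyNumber(1.0), 2)           # 2.0 + 2 = 4.0, what Package A intended

f(x::Number) = x^2            # Package B's method lands in the same method table

g(MyNumber(1.0), 2)           # now 2.0 + 4 = 6.0: f(2) silently hits B's method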

Non-locality without warning, breaking any piece of random code, and nothing can be walled off or safe from it (without explicit typing on concrete types). Code doing something that none of the authors intended. Do you really think that is something that is easy for new users to debug? This isn’t something that experts can debug well! Since packages are just modules and Main is just a module, every function you define in the REPL would have a chance to re-define some package code by hitting the wrong name at the wrong dispatch level.

Of course, the solution is to put a safety wall so this doesn’t happen all of the time. That’s namespacing. And as noted earlier in the thread, every single language (except MATLAB…) has namespacing, so it’s pedagogically no different.

5 Likes

You are missing my point. I love namespaces. My issue with Julia is that I think they are broken. Let’s go back to this whole “meaning” thing people keep bringing up.

Now, in my current Julia v0.6 (even before loading any libraries, which would add all sorts of additional methods), try

julia> methods(*)
# 185 methods for generic function "*":

Now, are those 185 methods really conveying the same “meaning” in the abstract sense everyone is talking about? Or is it simply that Base is a convenient place to circumvent the lack of a reasonable way to have methods/functions in their own namespaces and still be usable? Similarly,

julia> methods(size)
# 89 methods for generic function "size":

89 consistent “meanings” for size (before loading libraries)? I don’t buy it.

In fact, the necessity to coordinate on a shared namespace makes things much worse… the whole point of namespaces is that I can have the same function on the same arguments, but if I have to cram everything into Base (or BaseSolver or whatever) then that isn’t even possible!

Now compare that to every other language I know with operator overloading, function overloading, and single dispatch. As long as the functions are not ambiguous given the types, you can use whatever namespaces you want for your functions (or scope them based on the type, which acts similarly to a namespace in some single-dispatch languages). Two people write an operator* for the same type? No problem… just choose which namespace you want.

I don’t know every language out there, but from what I have seen, Julia is the odd one out in forcing people to circumvent namespaces in order for operators (or functions in general) to be usable.

Right now I’m using code filled with NLSolversBase.gradient(!), ForwardDiff.gradient(!), and DiffResults.gradient calls.
When I look back at code I haven’t touched in a week, it’s pretty clear what each of these is doing: with DiffResults.gradient(x), I instantly know that x must be a DiffResult, without having to look anywhere else for context. Time saver.
DiffResults’ and ForwardDiff’s gradient functions could easily have been merged; they’re by the same author in the same organization! I’m not complaining that they haven’t been.

When people like ChrisRackauckas give examples like module A’s f(x) and B’s f(x::Number) getting merged, all I can think of is endless painful headaches. Not only when trying to use A and B at the same time, but when I’m trying to write code myself. Suddenly, I’m discouraged from making my functions generic.
I’m now incentivized to be hyper-specific in all my function definitions. I don’t want to risk my code breaking all the time, so when I say f(x) = x, I darn well mean f(x) = x!! So because I’m imagining floating point numbers, I write f(x::Float64) = x, and now my code doesn’t break when someone loads module B. Yay.
Except, module B’s code broke.

And DualNumber Bob now suddenly can’t use AD to find derivatives with my package.
My functions that were written specifically for f(x::Vector) instead of f(x::AbstractVector) now don’t support GPUArrays or TrackedArrays. Etc. Maybe I’d opt for obscure names for internal functions, so I can support AD code without worrying?
I don’t want to have to constantly try to imagine what everyone else may ever do – not just with my code, but any conceivable totally unrelated project!
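
For instance, here is a hedged sketch of the AD problem, assuming the real ForwardDiff package (the function names are made up):

using ForwardDiff

f_generic(x) = 2x + 1           # untyped: also works on ForwardDiff's Dual numbers
f_strict(x::Float64) = 2x + 1   # defensively over-typed

ForwardDiff.derivative(f_generic, 1.0)  # 2.0
ForwardDiff.derivative(f_strict, 1.0)   # MethodError: a Dual is not a Float64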

This just seems so immensely complicated, to what end? For the sake of the new user??

In R, when you load tidyverse, it warns you that it is masking a few functions from base. You can access them via namespace::function. People are used to that from other languages, and within R, at least, they can’t get away from it.
Julia does the same thing (although without tidyverse’s pretty color-coded warnings), and tells you to be explicit with Module.function. Explicit, clear, and easy to understand.
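
A sketch of that behavior (module and function names made up; the exact warning wording varies across Julia versions):

module Tidy1; export f; f(x) = "Tidy1"; end
module Tidy2; export f; f(x) = "Tidy2"; end

using .Tidy1, .Tidy2

f(1)        # WARNING: both Tidy2 and Tidy1 export "f"; uses of it in module Main
            # must be qualified -- followed by an error for the unqualified use
Tidy1.f(1)  # "Tidy1": qualified, just like R's namespace::function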

EDIT: I apologize if I’m being dramatic.
It would be nice if there could be merging for functions dispatching on your own types, where you don’t have to worry about conflicts.

But I feel like any kind of merging is immensely more complicated and harder to understand, for the benefit of superficially appearing a little simpler.
My life seems much easier with the status quo.

2 Likes

Just a few examples of what I am getting at. Keep in mind that most people learning Julia as a second language would have learned Python, Java, or Matlab first (all of which are consistent with my point).

But let me make my point with C++ because it is typed and has operator overloading. The following code can be copied into any online C++ compiler. Java would be very similar in its scoping rules.

namespace MyNS{
    struct MyType{
        auto f() {return;} //member function
    };
    
    auto operator + (MyType, int) {return;}; //Overload of MyType and integer.
    auto g(MyType) {return;} //Function of the type, in my namespace
}
namespace MyNS2{
    struct MyType2{
        auto f() {return;} //member function
    };
    
    auto operator + (MyType2, int) {return;}; //Overload of MyType2 and integer.
    auto g(MyType2) {return;} //Function of the type, in my namespace
}
int main()
{
    using namespace MyNS;
    using namespace MyNS2;
    //Using the first ones.
    MyType mt;
    mt.f(); //Member function
    mt + 1; //Overloading
    g(mt); //Free function

    //Using the second ones.
    MyType2 mt2;
    mt2.f(); //Member function
    mt2 + 1; //Overloading
    g(mt2); //Free function
}

So, you will notice that there are no issues with having same-named functions, operators, etc. in any of the namespaces… and it doesn’t require cramming operator+ into a Base!

Moreover, if you don’t want to use the using namespace directives, it still works. Swap out main for the following:

int main()
{
    //Using the first ones.
    MyNS::MyType mt;
    mt.f(); //Member function
    mt + 1; //Overloading, found via ADL
    g(mt); //Free function, found via ADL

    //Using the second ones.
    MyNS2::MyType2 mt2;
    mt2.f(); //Member function
    mt2 + 1; //Overloading, found via ADL
    g(mt2); //Free function, found via ADL
}

Note that it figures it out, even without using, because there is no ambiguity: this is argument-dependent lookup (ADL).

Compare that to a Julia-based approach. First of all, we know that it isn’t possible to have a using of both MyNS and MyNS2 without putting the functions in a common namespace. But what would this code look like without the using? To keep f and g in separate namespaces, you would have to do something like

mt = MyNS.MyType()
MyNS.f(mt) # Redundant namespace!
MyNS.g(mt) # Redundant namespace!

mt2 = MyNS2.MyType2()
MyNS2.f(mt2) # Redundant namespace!
MyNS2.g(mt2) # Redundant namespace!

etc. And I don’t think you could do the operator in separate namespaces, unless I don’t know a trick.

The same thing with single dispatch is true in Java, D, Python (I think), and innumerable other languages. The suggestion of people in this thread is not to get rid of namespaces, but rather to make the following possible (since there is no ambiguity in either f or g):

mt = MyNS.MyType()
f(mt)
g(mt)

mt2 = MyNS2.MyType2()
f(mt2)
g(mt2)

Again, this is not about calling with a dot… I would personally rather write f(mt) than mt.f() in all cases. But all languages with single dispatch that I know of behave in the way I have described. I suspect that most languages with operator overloading also do. We could come up with innumerable examples in other languages if it would help crystallize why Julia is the odd man out here, and how someone coming from Python, Java, C++ (and even Matlab, if they used the ugly OO) would be confused.

2 Likes

Hopefully, yes. If you find that they don’t, that is worthy of discussion, either to unify the interface or at least document the discrepancy. But designing good interfaces is a difficult and iterative process; very few are specified explicitly, and most are conventions rather than formal descriptions.

Please open an issue if you find a problem.

I don’t understand your point here: many people have contributed very detailed examples above on how to solve this problem. Whether it is “convenient” is subjective, but surely it is possible.

Perhaps a better comparison would be with languages that have multiple dispatch? Again, as pointed out multiple times, this usually meshes better with Julia’s namespace setup.

You are not “circumventing” namespaces, you are working with them. Think of them as a tool, not an obstacle.

2 Likes

Thank you for the C++ example, I think it’s a helpful way to discuss this problem.

So the main function would of course live in an implementation (.cpp) file as opposed to a header file. Now, while not everybody in C++ land agrees on whether having using directives in implementation files is a good idea, there really does seem to be a consensus that having using directives in header files is very bad style.

In Julia, there is (thankfully, in my opinion) no distinction between a header file and an implementation file. So in effect, every file is a header file (as well as an implementation file), and thus C++ best practice would have it that use of the Julia equivalent of using namespace statements should be avoided.

Now that doesn’t necessarily mean that Julia’s using should currently be avoided if you believe that these C++ best practices are worth adhering to; Julia’s using is not exactly the same as C++'s using namespace. But if the type of behavior in your main function were to happen in Julia code as a result of using MyModule; using MyModule2, then I would think that this hypothetical version of Julia’s using is sufficiently similar to C++'s using namespace for the C++ best practice rule to apply.

Edit: I reread your post more thoroughly, and I think the second example (without the using namespace statements, i.e. argument-dependent lookup) is actually very interesting. This may be where there is some space for Julia to be improved.

1 Like

I think this is a good read for people who want to understand @jlperla’s point regarding argument-dependent lookup: What's In a Class? - The Interface Principle, especially the section “Introducing Koenig Lookup”.

So what, we just tell people not to write functions that take numbers?

1 Like

I think we can again learn from C++'s ADL approach here. The following is valid in C++:

namespace A {
    struct MyType{ };
    
    template <typename T>
    auto f(MyType, T) {return 1;};
}

// note: no using namespace A!
int main() {
    A::MyType m;
    f(m, 2); // found without qualification: ADL searches MyType's namespace A
}

That is, you can have generic (templated) functions that are still found correctly using argument-dependent lookup, by virtue of the fact that at least one of the arguments is specifically of a type defined in the same namespace (module in Julia) in which the function is defined.

If two user modules A and B define a function foo(::Number), then of course there is ambiguity when foo is used in a third package that uses both A and B, but argument-dependent lookup would not apply because Number is not defined in A or B.

A sketch of what this could look like in Julia:

module A
struct AA end
foo(::AA) = :A
bar(::Number, ::AA) = :A
baz(::Number) = :A
end

module B
struct BB end
foo(::BB) = :B
bar(::Number, ::BB) = :B
baz(::Number) = :B
end

module C
import A, B # import, not using!
foo(A.AA()) # fine, calls foo in A
foo(B.BB()) # fine, calls foo in B
bar(1, A.AA()) # fine, calls bar in A (AA is defined in A)
bar(1, B.BB()) # fine, calls bar in B (BB is defined in B)
baz(1.0) # bad: ADL not triggered, so baz is not defined here
end

1 Like

89 consistent “meanings” for size (before loading libraries)? I don’t buy it.

I just checked all of these and they are all consistent:

  • All the methods of * implement some form of multiplication.
  • All the methods of size behave in this manner:
    • they return a tuple of dimension sizes when called with a single argument
    • they return the size of a single dimension when called with two arguments
    • they return a tuple of the requested dimension sizes when called with three or more arguments

If I had found any methods stuck in there that were doing something else, I would have immediately opened an issue to fix it. Incredulity at the number of methods for such a simple function probably has to do with not quite being used to how heavily Julia uses types and dispatch to implement behaviors that are consistent as abstractions yet highly polymorphic in implementation. It’s also possible that some of these could be condensed into a smaller number of more general method definitions, but we can fix that at any point since it doesn’t change the meaning or behavior of the functions.

Incredulity at the consistency across so many types and methods may come from familiarity with class-based languages where x.f and y.f may not be related at all, let alone mean the same thing. The fact that the f in f(x) and f(y) in Julia is the same object has kept us very vigilant about being consistent about abstract meaning, which in turn makes writing highly generic code more feasible in Julia than in any other language I’ve encountered. If x.size() might return a tuple for one type and a scalar for another type and an array for another type, how are you going to write code that works across all kinds of x?
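
For instance, a small sketch (the helper name is made up) of generic code that leans only on that contract:

total_len(x) = prod(size(x))    # relies only on size returning a tuple of dims

total_len(rand(3, 4))           # 12, via the Array method of size
total_len(x^2 for x in 1:10)    # 10, via the Generator method of size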

The litmus test here is whether one can write generic code that uses these functions without knowing or caring what specific types and methods implement them. This is the case in Julia precisely because we’ve gone to such great pains to make sure that all our generic functions are conceptually consistent. Consider string exponentiation—i.e. repetition in the string monoid where multiplication is concatenation. This has a specialized definition for performance but you can just as correctly call the generic Base.power_by_squaring function which is used to implement many other generic ^ methods:

julia> "abc"^10
"abcabcabcabcabcabcabcabcabcabc"

julia> Base.power_by_squaring("abc", 10)
"abcabcabcabcabcabcabcabcabcabc"

julia> @which "foo"^10
^(s::Union{AbstractChar, AbstractString}, r::Integer) in Base at strings/basic.jl:623

julia> @which Base.power_by_squaring("abc", 10)
power_by_squaring(x_, p::Integer) in Base at intfuncs.jl:186

The fact that these will produce equivalent results stems directly from the fact that they implement the same concept. Similarly, you can call size anywhere and know that it will behave consistently regardless of what type you call it on. For example:

julia> size(x^2 for x = 1:10)
(10,)

julia> @which size(x^2 for x = 1:10)
size(g::Base.Generator) in Base at generator.jl:117

How a generator knows its size is quite different from how an array knows its size, but the concept is the same and you can call the size function on an array or generator and not care which since they behave the same and mean the same thing. Julia is far more disciplined about this kind of consistency than any other language I’m familiar with and, as a result, it is far more possible to write generic code that actually works in Julia.

7 Likes

GAP is a language with multiple dispatch, like Julia, and I have recently ported GAP permutations to Julia, following the conventions of GAP. In GAP, the ^ operator has 3 different uses for permutations:

If n is an Integer and p, q are permutations, then

  • p^n raises p to the n-th power
  • n^p applies the permutation p to the integer n
  • p^q conjugates p by q: inv(q)*p*q

Since all these methods for ^ are in the same package, I had no trouble at all writing this.
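
For illustration, a simplified sketch of how these three conventions might look (the Perm struct and helper definitions here are stand-ins, not the actual ported code); note that this extends Base.:^ directly, which is exactly the judgment call discussed below:

struct Perm
    images::Vector{Int}   # images[i] is where the point i is sent
end

Base.:*(p::Perm, q::Perm) = Perm(q.images[p.images])   # apply p, then q
Base.inv(p::Perm) = Perm(invperm(p.images))
Base.one(p::Perm) = Perm(collect(1:length(p.images)))

Base.:^(p::Perm, n::Integer) = n == 0 ? one(p) : p * p^(n - 1)  # p^n (n >= 0)
Base.:^(n::Integer, p::Perm) = p.images[n]                      # n^p applies p to n
Base.:^(p::Perm, q::Perm) = inv(q) * p * q                      # p^q conjugates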

But, if I follow the thoughts underlying your post, I just did something illegal and the thought police are coming to arrest me. Forget any ports of GAP to Julia…

1 Like

If your ^ is separate from Base.:^ then there’s no problem. If you extended Base.:^, it’s a judgment call whether it’s ok. If GAP users would expect permutation-^ and number-^ to be the same concept/function, then maybe it’s ok.

1 Like

This tone isn’t really constructive. You’re free to implement whatever you want in your own Julia code. In this thread you have been arguing for fundamental changes to how the language works and how we organize and think about our standard library.

From the Julian perspective, yes, the way GAP uses the ^ operator for three very distinct meanings is considered to be bad form. The first meaning, iterating a permutation, is defensible since that is repeated multiplication in the group of permutations. Writing the second meaning as ^ seems odd: the typical mathematical notation for that would be something like p(n) or p[n], both notations available in Julia. p^q is also a fairly non-standard notation for conjugation in a group; writing q*p*q^-1 would be clearer and more standard, or just writing conjugate(p, q) seems pretty clear to me.

In my experience, the kind of operator punning in your GAP example usually ends up being regrettable and leading to hard-to-find bugs and unfortunate corner cases. But other people—including the creators of GAP—have come to different conclusions and find combining different meanings into a single operation convenient and appealing. You are free to implement a GAP.jl Julia package that defines a new, independent ^ operator that behaves just like the GAP one. Ironically, the very language feature that you want to take away is one that gives you the freedom to do this: the ability to create a new, separate ^ function in a different namespace.

You can also commit “type piracy” in your code by adding GAP-like behaviors to the built-in ^ function, but that can break other packages and people would be understandably reluctant to use your packages because of that chance. Even if you don’t publish your packages, this kind of piracy means there’s a chance that a change to Base could break your code when you upgrade Julia, but that may be a risk you’re ok with taking.

9 Likes

This is a case where one might want to define

export ^

^(args...) = Base.:^(args...)            # forward everything else to Base
^(x::Permutation, y::Permutation) = ...  # your extensions
...

That way, those who use ^ from this package get a ^ that includes both Base definitions and your extensions, without disturbing any other users of ^.
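
For concreteness, here is a self-contained sketch of that pattern; the Perm type and module name are made up, with the conjugation method filled in from the earlier post:

module Perms
export Perm, ^

struct Perm
    images::Vector{Int}   # images[i] is where the point i is sent
end

Base.:*(p::Perm, q::Perm) = Perm(q.images[p.images])  # apply p, then q
Base.inv(p::Perm) = Perm(invperm(p.images))

^(args...) = Base.:^(args...)          # everything else falls through to Base
^(p::Perm, q::Perm) = inv(q) * p * q   # the GAP-style conjugation extension
end

using .Perms: Perm, ^   # take this ^ explicitly, shadowing Base's

2^3                                # 8, via the Base fallback
Perm([2, 1, 3]) ^ Perm([1, 3, 2])  # conjugation via the package method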

5 Likes