Difficulties with promoting numbers

jandehaan · May 19, 2018, 2:04am

Here is an example of two independent packages colliding when they overload the same method.

Package MyPackage

Module MyPackage
    struct MyNumber <: Number
        value
    end
    Base.promote_rule(::Type{MyNumber}, ::Type{S}) where {S<:Number} = MyNumber
    Base.convert(::Type{MyNumber}, x::MyNumber) = MyNumber(x.value)
    Base.convert(::Type{MyNumber}, x::S) where {S<:Number} = MyNumber(x)
    import Base: +
    +(x::MyNumber, y::MyNumber) = MyNumber(x.value+y.value)
end

Package YourPackage

Module YourPackage
    struct YourNumber <: Number
        value
    end
    Base.promote_rule(::Type{YourNumber}, ::Type{S}) where {S<:Number} = YourNumber
    Base.convert(::Type{YourNumber}, x::YourNumber) = YourNumber(x.value)
    Base.convert(::Type{YourNumber}, x::S) where {S<:Number} = YourNumber(x)
    import Base: +
    +(x::YourNumber, y::YourNumber) = YourNumber(x.value+y.value)
end

Innocent User

julia> using MyPackage
julia> using YourPackage
julia> MyNumber(1) + MyNumber(2)
MyNumber(3)

julia> MyNumber(4) + 5
MyNumber(9)

julia> YourNumber(3) + YourNumber(3)
YourNumber(6)

julia> YourNumber(7) + 1
YourNumber(8)

So far so good. But then

julia> YourNumber(5) + MyNumber(3)
StackOverflowError:

Would you consider this a case of type piracy?

If you think that this is not a big deal, watch Cassette.jl: Overdub your Julia code | Cambridge Julia Meetup (May 2018) - YouTube

ChrisRackauckas · May 19, 2018, 3:49am

That can only happen if someone knows the existence of both packages.

foobar_lv2 · May 19, 2018, 12:57pm

Unfortunately not: I use packages X and Y, not knowing about either MyPackage nor YourPackage, and the authors of X and Y know nothing about each other. X uses MyPackage, Y uses YourPackage, and I get into trouble. And I can get into trouble in very surprising ways, e.g.

module MyPackage
    struct MyNumber <: Number
        value
    end
    Base.promote_rule(::Type{MyNumber}, ::Type{S}) where {S<:Number} = MyNumber
    Base.convert(::Type{MyNumber}, x::MyNumber) = MyNumber(x.value)
    Base.convert(::Type{MyNumber}, x::S) where {S<:Number} = MyNumber(x)
    Base.:+(x::MyNumber, y::MyNumber) = MyNumber(x.value+y.value)
    Base.:-(x::MyNumber, y::MyNumber) = MyNumber(x.value-y.value)
    Base.widen(::Type{MyNumber})=MyNumber
end

module YourPackage
    struct YourNumber <: Number
        value
    end
    Base.promote_rule(::Type{YourNumber}, ::Type{S}) where {S<:Number} = YourNumber
    Base.convert(::Type{YourNumber}, x::YourNumber) = YourNumber(x.value)
    Base.convert(::Type{YourNumber}, x::S) where {S<:Number} = YourNumber(x)
    Base.:+(x::YourNumber, y::YourNumber) = YourNumber(x.value+y.value)
    Base.:-(x::YourNumber, y::YourNumber) = YourNumber(x.value+y.value)
    Base.widen(::Type{YourNumber})=YourNumber
end

Now I somehow collect mixed numbers into an array (results from X-calculations and Y-calculations).

A = Any[];
for i=1:10 push!(A, MyPackage.MyNumber(i)) end;
for i=11:20 push!(A,YourPackage.YourNumber(i)) end;

This is unbelievably toxic on 0.7:

hash(A)
#StackOverflowError
A[1]==A[20]
#StackOverflowError
isequal(A[1],A[20])
#StackOverflowError

ChrisRackauckas · May 19, 2018, 1:04pm

No, then who wrote the YourNumber(5) + MyNumber(3)?

Oh hey look, this code knows both numbers, so whoever wrote that knows both. To find an actual counter example, you’d have to write a code that never has both MyPackage and YourPackage in the same scope… otherwise yes, you can write code that doesn’t magically always work but that’s not surprising or non-local.

foobar_lv2 · May 19, 2018, 1:19pm

The problem is that I used X, which has MyPackage as a dependency, as well as Y, which has YourPackage as a dependency. I never cared about MyPackage or YourPackage.

I mean, I’m not even trying to add X-outputs and Y-outputs. Already isequal and hash of mixed Any-typed arrays can crash: You cannot build an Any-Dict containing both X-outputs and Y-outputs.

Demanding from users that they scan all the packages they directly import for incompatibilities is imho reasonable. You use a package, you skim its documentation and source and start weeping if you spot incompatibilities, OK.

Demanding from users that they go up the entire dependency chain is imho unreasonable.

edit: Even worse: When building a mixed Any-Dict, I only crash on hash-collisions. This would be a nightmare to reproduce.

edit2: And what it boils down to is that the definition of MyNumber is bad. I am not complaining about the dispatch system in julia.

ChrisRackauckas · May 19, 2018, 1:26pm

This is still local. The fact that bad promotion code can exist is an entirely separate issue from type-piracy and not really relevant.

mbauman · May 19, 2018, 2:00pm

I do believe there’s an issue here, but it’s separate from the punning/meaning thread so I’ve split it out. Somewhat similar is this bug: https://github.com/JuliaLang/julia/issues/13193

anon94023334 · May 19, 2018, 2:51pm

This is a good point and one that drove me away from Go[1]. If it’s possible to avoid this sort of thing in Julia, we absolutely should.

[1] The issue there is that if any part of your data structure has a mutex, you shouldn’t make a copy of it – but how do you know when your nested data types go 5-6 levels deep? The answer is that you have to inspect the entire thing down to the primitives to ensure you’re not inadvertently including a mutex in a data structure you intend to pass by value.

StefanKarpinski · May 19, 2018, 3:48pm

Fundamentally this is not an automatically solvable problem: if there are two user-defined number types and no promotion rule for them exists, how can the system possibly know what to do? The only reasonable thing it can do is throw an error. So the only real problem here seems to be that people don’t like getting a stack overflow and would prefer a method error. It’s possible that this could be “fixed” somehow, but exchanging one error type for another doesn’t seem all that urgent.

tkoolen · May 19, 2018, 4:39pm

But the ‘innocent user’ who wants to use both MyPackage and YourPackage without any type of error now has to define methods that resolve the ambiguity/stack overflow, which entails defining methods for functions that are not defined in their code on types that are not defined in their code, i.e., type piracy. Maybe not an issue if the user is just writing a script, but a bigger problem if the user is a package author. I think Jarrett hit the nail on the head in his talk.

foobar_lv2 · May 19, 2018, 4:52pm

Optimally, we would not trigger any error at all. Comparison by object identity is a perfectly sensible default. Arithmetic of user-types appearing in hashing is a somewhat temporary clutch to enable fast range hashing and should never throw.

Overly general promotion rules, like in the example, are just bad.

But imho, the ultimate problem is that transitivity of isequal is very hard to maintain:

Suppose I have two number types, F1 and F2, both of which are some field extension of Rational. Now, a Rational can get promoted to both F1 and F2, and two x::F1 and y::F2 want to be equal if and only if both are actually Rational (and equal). No error must ever be thrown on comparison, we simply must return false if one of the two elements is not representable as a Rational.

Alas, the package authors for F1 and F2 do not know of each other’s existence. What is the preferred way of doing this?

StefanKarpinski · May 19, 2018, 5:58pm

What should YourNumber(5) + MyNumber(3) produce?

This is a reasonable default for general mutable values, but numbers should almost never be mutable (at least semantically, even if the implementation happens to be mutable, as in the case of BigInt) and should always compare by value rather than object identity. A number is identified by its value, it is not a container with identity into which values can be placed.

The difficulty of maintaining transitivity is precisely why doing so is valuable.

If you need to use them together, add some methods that allow them to work together. Automatic composability only goes so far, at some point a person needs to apply judgement and understand and decide how things ought to work. Otherwise you just get nonsense dictated by some automatic rule that does something arbitrary and wrong.

foobar_lv2 · May 19, 2018, 6:24pm

This one should produce an error. Collecting in containers, including Dicts, and hashing should not produce an error.

I am imagining that they are sitting far upstream the dependency tree of the poor user. If the user tries to add two of these guys, well, he asked for the error. But what about comparison?

Suppose we had two competing implementations of integer-extensions (e.g. complex numbers):

module A
struct F1 <: Number
stdpart::Int 
offpart::Int
end
Base.isequal(x::F1, y::Number) = (x.offpart == 0 && isequal(x.stdpart, y))
Base.isequal(y::Number, x::F1) = isequal(x,y)
Base.isequal(x::F1, y::F1) = (x===y)
end

module B
struct F1 <: Number
stdpart::Int 
offpart::Int
end
Base.isequal(x::F1, y::Number) = (x.offpart ==0 && isequal(x.stdpart, y))
Base.isequal(y::Number, x::F1) = isequal(x,y)
Base.isequal(x::F1, y::F1) = (x===y)
end

This works nicely in isolation:

julia> isequal(A.F1(2,3),2)
false

julia> isequal(A.F1(2,0),2)
true

However,

julia> isequal(A.F1(2,0),B.F1(2,0))
ERROR: MethodError: isequal(::A.F1, ::B.F1) is ambiguous. Candidates:

So, how can the authors of packages A and B write their code, without knowing of each other’s existence, such that their types play along for isequal?

Either of the ambiguous methods would give the correct result. But we must avoid the error.

StefanKarpinski · May 19, 2018, 7:02pm

I find the situation of using two unrelated implementations of field extensions that don’t know about each other a bit contrived. Is this a hypothetical problem or one that you actually have? The obvious solution is to pick one implantation and use it everywhere instead of using two different implementations of field extensions. If you can’t for legacy reasons or whatever and are forced to use both implementations in one system and want them to interoperate better, then define the appropriate conversions between their types and add a promotion rule that picks whichever implementation you prefer when combining them.

In your isequal example, the problem is how you define isequal. Defining f(::Specific, ::Generic) and f(::Generic, ::Specific) is always a recipe for ambiguities—it’s the canonical one. This is just a more complicated version of that. Instead of defining the comparisons in both directions, you should instead define isequal on a pair of values of your type and let the generic fallback for isequal on numbers do promotion. Like this:

module A
    struct F1 <: Number
        stdpart::Int 
        offpart::Int
    end
end

module B
    struct F1 <: Number
        stdpart::Int 
        offpart::Int
    end
end

Base.convert(::Type{A.F1}, x::B.F1) = A.F1(x.stdpart, x.offpart)
Base.convert(::Type{B.F1}, x::A.F1) = B.F1(x.stdpart, x.offpart)
Base.promote_rule(::Type{A.F1}, ::Type{B.F1}) = A.F1

julia> x = A.F1(1,2)
A.F1(1, 2)

julia> y = B.F1(1,2)
B.F1(1, 2)

julia> z = B.F1(2,3)
B.F1(2, 3)

julia> v = Any[x, y, z]
3-element Array{Any,1}:
 A.F1(1, 2)
 B.F1(1, 2)
 B.F1(2, 3)

julia> v .== v.'
3×3 BitArray{2}:
  true   true  false
  true   true  false
 false  false   true

foobar_lv2 · May 19, 2018, 7:20pm

Ah, but in the example A.F1(1,2) and B.F1(1,2) are not supposed to be equal. Integers are supposed to promote to either A.F1 or B.F1, and e.g.

module A
    struct F1 <: Number
        stdpart::Int 
        offpart::Int
    end
Base.:*(x::F1,y::F1) = F1(x.stdpart+y.stdpart + 3*x.offpart*y.offpart, x.stdpart*y.offpart + x.offpart*y.stdpart)
end
module A
    struct F1 <: Number
        stdpart::Int 
        offpart::Int
    end
Base.:*(x::F1,y::F1) = F1(x.stdpart+y.stdpart + 5*x.offpart*y.offpart, x.stdpart*y.offpart + x.offpart*y.stdpart)
end

So A.F1 corresponds to the ring of integers adjoint square root 3, and B.F1 corresponds to the ring of integers adjoint square root 5.

And no, this is not a problem that I hope to run into soon, but Dict{Any,T} failing due to bad isequal in dependencies I pulled in without knowing is a fear of mine, so I’d be interested in what the right thing to do is. I imagine that competing algebra systems would have such problems. Logging stuff into Any-Dicts should just work, imho, which makes this so dangerous and hard to debug (triggers only on hash collisions).

Also, see the other thread with Base.Dates.

StefanKarpinski · May 19, 2018, 7:22pm

If you want to correctly compare numbers in different field extensions, you need to promote them to a common field, which is doable but yet, more complicated. It is also not something that anyone can reasonably expect Julia to deduce and do for you automatically.

foobar_lv2 · May 19, 2018, 7:28pm

The desired comparison is just the one I wrote, with the disambiguation

Base.isequal(x::A.F1, y::B.F1) = (x.offpart ==0 && isequal(x.stdpart, y))
Base.isequal(x::B.F1, y::A.F1) = isequal(y,x)

The only question is whether the authors of A and B can pull this off without knowing each other: Can they specify that, in case of ambiguity, julia should just choose something instead of complaining?

StefanKarpinski · May 19, 2018, 7:40pm

No, they cannot “pull this off” without having some prior arrangement. There’s no way for the language to know what these types mean or how they should behave. Rational values have a standard hashing mechanism in Base which they can independently implement and get correct comparison and hashing without knowing about each other. Base does not provide the same for field extensions. Someone else could, however, provide a similar mechanism for field extensions which would allow independent implementations of field extensions to interact correctly. But there’s no way to avoid having at least this level of coordination. Fortunately, multiple dispatch allows the user of these libraries to provide such a coordination mechanism externally without needing to convince either package to use it.

The concern about isequal throwing an error here is valid although I’m not sure how to avoid the problem other than putting a try-catch around the call to == in the isequal fallback, which is a pretty dirty solution.

foobar_lv2 · May 19, 2018, 9:37pm

In some sense both authors already did the hard work in the hashing code (just like you did for Rational that is a superset of Int):

One checks whether the value can exactly converted into a more specific Base type. If yes, both hashing and comparison must be special-cased in order to preserve transitivity. If not, then we can return false as a fallback without breaking transitivity.

Of course julia has no way of knowing which isequal variant to call; my dream would be

@whatever Base.isequal(x::F1, y::Number) = (x.offpart ==0 && isequal(x.stdpart, y))

which tells dispatch that in case of ambiguity between @whatever methods, it should take e.g. the first in import-order (or the second or throw a dice). The @whatever macro would either need a supporting flag in the method table or would need to analyze the method table at this point in time (load-order dependent). I don’t know whether the latter can be done at all.

For all such new number types, the isequal will then be correct regardless of which disambiguations have been chosen.

For an easy example that this is not entirely academic:

using Nemo
isequal(FlintZZ(1), 1) #true
isequal(1.0, FlintZZ(1)) #true
isequal(1,Complex(1)) #true 
isequal(Complex(1), FlintZZ(1)) #false


using DecFP
isequal(1,BigInt(1))#true
isequal(1,Dec128(1))#true
isequal(BigInt(1), Dec128(1))#false
isequal(Dec128(1), FlintZZ(1))
ERROR: MethodError: no method matching (::Nemo.FlintIntegerRing)(::DecFP.Dec128)

I am convinced that transitivity-failure is currently the norm in the julia ecosystem (for number types outside Base) and not the exception or a rare bug. Moreover, I don’t see any way of fixing cross-package transitivity without making DecFP and Nemo depend on each other (but I would be grateful to be corrected if a way exists).

jrevels · May 20, 2018, 4:12pm

I also agree that upstream multiple dispatch ambiguities can result in errors that are horrible for end-users, and I’d say that it’s not nearly as easy to resolve these problems in the real world as it is with toy examples. Sure, all code involved can technically be “discovered” programmatically, and is actually “there”, but come on - it’s ridiculous to expect a naive end-user to be able to debug a MethodError across multiple packages where the package authors themselves didn’t even foresee the problem.

For example, let’s say our user wants to solve some optimization problem, where PkgA provides some kernel they’d like to use to define the objective, and PkgB provides the solver:

using PkgA
using PkgB

f(x, y) = PkgA.g(x, y) * y

PkgB.optimize(f, x, y; use_ad = true)

Code like the above can easily throw ambiguity-related errors when PkgA.g employs an upstream package that happens to be ambiguous with the AD package used by PkgB. This gets even worse when - as @tkoolen pointed out - such code is actually in a package that then other people/packages depend on.

This specific conversation has been mostly about Number types, for which there is a promotion mechanism and it is often possible to assume efficient convertibility between subtypes. However, this is much more of a nasty problem to deal with when you don’t even have a promotion mechanism to fall back on, e.g. you’re not working with types that are not efficiently convertible to one another. In this case, most hacks around it end up being nasty and load-order dependent, as was mentioned previously.

The good news is that I think we actually have decent ideas for solutions to this problem at this point. As mentioned earlier, Cassette can solve this problem by enabling package authors to specify the “context” in which their code is running, and have that “context” always take precedence.

For example (A and B are provided by different package authors that don’t know about each other):

f(A(x), B(y))

can be explicitly resolved in order of context application, e.g.

PkgB.doBthing((x, y) -> PkgA.doAthing(f, A(x), y), x, B(y))

would presumably result in (using psuedocode for notation):

BCtx(ACtx(f))(A(x), B(y))

This works even if the caller is calling these packages indirectly, like in our optimization example above.

Of course, whether or not Cassette actually makes sense for either A or B is dependent on what you were trying to achieve with your A() or your B() in the first place; but generally, only one of them needs to be “cassette-ified” in order for this particular case (binary ambiguities) to be considered resolved.

There’s also a less general, but non-Cassette solution in the form of https://github.com/peterahrens/Hyperspecialize.jl.