Diamond dependency: why allow 2 packages with the same name but not 2 versions of same package?

I’m tidying up some old notes. I don’t run into this in practice, but two packages with the same name can exist in the same dependency graph if they don’t directly conflict. For a simpler example and using : to indicate direct dependency, (A: (B), (C: B)) can have 2 different packages named B, but (A: (B), (B)) cannot work because import B must be unambiguous for A’s environment. A.B and A.C.B being different modules isn’t a problem, but A.B and A.B should be the same.

Putting this together with my notes on the package manager finding version conflicts, I realize I don’t know why the same thing can’t be applied to 2 versions of the same package, that is the 2 B in the previous example are actually the same package with different versions, say B@1.8.3 and B@2.0.0. I know that restricting a project to 1 version of each package is important for internal code compatibility and ease of vital updates (especially security), but I can’t put down an answer for why exceptions are not allowed. Internal code compatibility can be a problem for 2 packages with the same name too if their source code is also very similar; we see multiple-include posts complaining about methods failing to work for types of the same name from the wrong module of the same name all the time.

Looking up the situation in other languages, it seems that, although it is sometimes treated as a problem to avoid, JavaScript’s npm and Rust’s Cargo allow different versions of a dependency to coexist (though one blog says npm doesn’t know not to store multiple copies of the same version each time it shows up). What are the crucial differences between 2 packages with the same name and 2 versions of the same package?

7 Likes

The system under the hood uses UUIDs to identify packages, so names are mapped to UUIDs within a Project.toml.
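As a rough sketch of that name-to-UUID mapping, here is a toy model using plain dictionaries. The UUID values and the `deps_of_A` / `deps_of_C` tables are made up for illustration; this is not how Pkg stores things internally, only the idea that each project resolves the name B on its own.

```julia
using UUIDs

# Hypothetical UUIDs, invented for this sketch.
b_first  = UUID("11111111-1111-1111-1111-111111111111")
b_second = UUID("22222222-2222-2222-2222-222222222222")
c_uuid   = UUID("33333333-3333-3333-3333-333333333333")

# Each project has its own table from dependency name to UUID.
deps_of_A = Dict("B" => b_first, "C" => c_uuid)
deps_of_C = Dict("B" => b_second)

# Both projects use the name "B" ...
same_name = haskey(deps_of_A, "B") && haskey(deps_of_C, "B")   # true
# ... but the name resolves to different packages.
same_package = deps_of_A["B"] == deps_of_C["B"]                # false
```

Because the identity is the UUID and not the name, two unrelated packages named B can coexist; two versions of one package share a UUID, so this mapping alone cannot tell them apart.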

In theory it might be possible to extend this and take version numbers into account.
The core issue is one that is already possible today:

module A
    function myfun end
end
module B
    using ..A  # relative import: A is defined in this session, not installed as a package
    function A.myfun(::Int)
        0
    end
end
module C
    using ..A
    function A.myfun(::Int)
        1
    end
end

If a user loads both B and C, which method of A.myfun will be reachable?
(This is technically type piracy, since neither B nor C owns myfun or Int.)

If we load multiple versions of the same package, collisions like these become highly likely because the method table is global.
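A minimal sketch of what that collision looks like when run as a script or in a fresh interactive session. The module names `Shared`, `First`, and `Second` are hypothetical stand-ins for A, B, and C, and the relative `using ..Shared` is only there because these are plain modules and not installed packages.

```julia
module Shared                  # plays the role of A
    function myfun end
end

module First                   # plays the role of B
    using ..Shared
    Shared.myfun(::Int) = 0
end

module Second                  # plays the role of C
    using ..Shared
    # Same signature as in First, so this replaces that method
    # in the single global method table.
    Shared.myfun(::Int) = 1
end

n_methods = length(methods(Shared.myfun))   # 1, not 2
result = Shared.myfun(42)                   # 1: the definition evaluated last wins
```

Julia may print a warning about the overwritten method here. The point is that `First` silently loses its definition, and which one survives depends on load order.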

3 Likes

module A
    struct Foo
    end
end

module B
    using ..A: Foo  # relative import: A is defined in this session, not installed as a package
    function f()
        return Foo()
    end
end

module C
    using ..A: Foo
    function g(x::Foo)
        # Do some stuff with x
        return 123
    end
end

What should C.g(B.f()) do? If B and C are using two different versions of A, then B.A.Foo might not be the same as C.A.Foo. E.g. the two different Foos might have different fields, etc.

So an A.Foo coming out of B.f() might not be compatible with the A.Foo that C.g(::Foo) expects.

2 Likes

Good examples, but I think 2 different modules with the same name would represent such packages better:

module A
  module B # v2.0.0
    struct X end
  end
  module C
    module B # v1.8.3
      struct X x::Int end
    end
    using .B: X
    foo(::X) = true
  end
  using .B: X
  using .C: foo
end

Same problem applies to the example I just made: A.foo(A.X()) and A.X(1) error because A.B.X is not the same as A.C.B.X. That’s what I meant by “internal code compatibility”. From a couple different articles, the Rust answer seems to be “design explicit conversions or don’t mix them at all.” My intuition is that this is disallowed because it seems like such a terrible thing to maintain, and the difference is that 2 versions share a development history and will heavily overlap in names, but 2 packages likely won’t. But I want to hear more reasoning before I consider my notes on this finished, especially if there is a paper trail of the decision.
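To make the “explicit conversions” idea concrete, here is a small sketch built on the same nested-module layout. The outer module is renamed `Outer` so the block stands alone, and `to_old` with its `default` keyword is a hypothetical bridge written for this example, not something either version would ship.

```julia
module Outer
    module B                   # stands in for v2.0.0
        struct X end
    end
    module C
        module B               # stands in for v1.8.3
            struct X
                x::Int
            end
        end
        using .B: X
        foo(::X) = true
    end
end

# Same name, different types.
same_type = Outer.B.X === Outer.C.B.X                  # false

# foo only has a method for the old X, so the new X is rejected.
accepts_new = applicable(Outer.C.foo, Outer.B.X())     # false

# Hypothetical bridge: the new X has no field, so a value must be chosen.
to_old(::Outer.B.X; default::Int = 0) = Outer.C.B.X(default)

bridged = Outer.C.foo(to_old(Outer.B.X()))             # true
```

Every type that crosses the boundary needs a bridge like this, and someone has to decide what the missing data should be, which is the maintenance burden I’m worried about.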

For some special cases it’s not impossible to make it work, but in general, it’s a huge pain. The ultimate, final boss challenge is cross-architecture ABI compatibility, some challenges of which are explored here.

Amusingly, Microsoft are VERY adept at providing this sort of compatibility!

1 Like

The rare blog with the appropriate amount of profanity. Am I correct to think we don’t have to worry about ABI compatibility, even in a hypothetical situation with multiple versions of a Julia package, unless we’re trying to wrap binaries, because we normally just precompile and compile Julia source for each machine?

Depends on who you include in that “we” :person_shrugging: I’m certainly thinking about how to have a shared codebase between a microcontroller and a desktop machine, where the difference in word sizes/padding can be an issue. The workaround is to not use Int and instead explicitly specify the desired widths by using Int8 etc., though that of course doesn’t help with padding mismatches. For that, an explicit serialization step is still necessary, which has to somehow preserve that information & make it consistent across architectures. I’ll definitely explore that direction more once juliac lands!
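A small sketch of the width and padding point. The `Reading` struct and the `pack` helper are hypothetical, and the little-endian byte order is just a choice made for this example.

```julia
# Int follows the platform word size, so its width differs between machines.
word_bytes = Sys.WORD_SIZE ÷ 8
int_matches_word = sizeof(Int) == word_bytes     # true on any platform

struct Reading                                   # hypothetical record
    id::UInt8
    value::Int32                                 # explicit width: always 4 bytes
end

# The in-memory layout may contain padding between `id` and `value`;
# these numbers are platform dependent, so nothing is assumed about them.
value_offset = fieldoffset(Reading, 2)
struct_bytes = sizeof(Reading)

# Explicit serialization: write each field in a fixed order and byte order,
# so the result does not depend on in-memory padding.
function pack(r::Reading)
    io = IOBuffer()
    write(io, r.id)
    write(io, htol(r.value))                     # force little-endian bytes
    return take!(io)
end

bytes = pack(Reading(0x01, 258))                 # 5 bytes: 1 for id, 4 for value
```

The packed form is 5 bytes regardless of how the struct is laid out in memory, which is the property needed when both ends have to agree.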

That’s the thing - it’s not just a “call into some other code” problem, it can happen with the same compiler & same version of the language, just running on a different machine/architecture. You could encounter the same problems with x86 vs x86_64. In fact, we already do with serialize:

In some cases, the word size (32- or 64-bit) of the reading and writing machines must match. In rarer cases the OS or architecture must also match, for example when using packages that contain platform-dependent code.

Other than that though, yes, the place this currently shows up is when calling out to libraries built by other compilers. Maybe we’ll encounter this ABI friction with libraries compiled with juliac too - it’s too early to say.

1 Like

I never knew that was baked into the parser but I suppose that’s how a literal would work. I kinda assumed Int was just some early platform-conditional alias somewhere.

So this would be technically possible. However, it seems like a bad idea to allow multiple copies of the same package in Julia. It’s very possible for an object of a type defined by one copy of a package to end up passed to a method of another copy of the same package, at which point :boom: it breaks in the most confusing way possible. This isn’t really a problem in JavaScript, which doesn’t have nominal typing or type-based dispatch. Not sure how multiple copies of the same package work in Rust, which does have nominal typing.

7 Likes

One thing I’ve considered is allowing a package to declare some dependencies to be “internal” which basically would mean “no types defined by this dependency are ever exposed to my callers”. You would be able to have multiple versions of internal dependencies. But this is a pretty subtle thing to reason about and also hard to check.

9 Likes

I would probably never use such a feature. Avoiding doing the work of updating and recompiling to make the one version work everywhere doesn’t imply there isn’t work. From what I’ve read, while it’s acceptable to get a working product, Rust users acknowledge it’s ideal to not compile a lot of duplicate isolated code and it takes work to avoid interactions between different versions. In Julia and other languages where users want a lot of interactivity and composability, that isolation seems infeasible.

I can only mark one response as a solution but every commenter has contributed something to my notes. Anyone else can feel free to add to this but I’m wrapping up the thread for now.

3 Likes

Another thing to consider is that Julia’s dynamic nature makes it very easy to “program around” API changes—you can easily do things like check if a name or method exists and work around it if it doesn’t. That kind of thing is hard/impossible in more static languages, which makes compatibility much more rigid.
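A minimal sketch of that kind of workaround. The `Dep` module and the names `fast_sum` and `slow_sum` are hypothetical; the idea is that a newer release of the dependency might add `fast_sum`, and the caller copes with either.

```julia
module Dep                     # hypothetical dependency, "old" release
    slow_sum(xs) = sum(xs)
end

# Use the newer function if this version of Dep has it, otherwise fall back.
function total(xs)
    if isdefined(Dep, :fast_sum)
        return Dep.fast_sum(xs)
    else
        return Dep.slow_sum(xs)
    end
end

answer = total([1, 2, 3])                                   # 6, via the fallback

# Checks can also be made per method signature.
has_vec_method = hasmethod(Dep.slow_sum, Tuple{Vector{Int}})  # true
```

Checks like `isdefined` and `hasmethod` let one version of a caller span several versions of a dependency, which reduces the pressure to load two versions side by side.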

1 Like