General environment creating issues for local ones

Recently I was trying to work with the latest version of MLJBase. For this I created a sandbox environment sbx, activated it, and added MLJBase to it; this is with Julia 1.7.1.

Trying to use the package, I got errors about StatsFuns that have also been reported elsewhere in other settings (e.g. “Unable to precompile DifferentialEquations.jl in Juno IDE in new environment and empty project”; also https://github.com/JuliaStats/Lasso.jl/issues/65).

Here’s how it looked:

(sbx) pkg> status
      Status `~/Desktop/sbx/Project.toml`
  [a7f614a8] MLJBase v0.19.3

julia> using MLJBase
[ Info: Precompiling MLJBase [a7f614a8-145f-11e9-1d2a-a57a1082229d]
ERROR: LoadError: MethodError: no method matching names(::Base.Broadcast.BroadcastFunction{Irrational{:log4π}})
Closest candidates are:
  names(::Module; all, imported) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/reflection.jl:102
Stacktrace:
 [1] top-level scope
(...)
in expression starting at /Users/tlienart/.julia/packages/StatsFuns/UzRhT/src/StatsFuns.jl:3
ERROR: LoadError: Failed to precompile StatsFuns [4c63d2b9-4356-54db-8cca-17b64c39e42c] to /Users/tlienart/.julia/compiled/v1.7/StatsFuns/jl_s26VHd.
Stacktrace:
(...)

My main environment had an old version of MLJBase, which I removed; this fixed the issue and I was able to do using MLJBase in the sandbox environment afterwards.

I don’t understand why the state of my general environment caused problems in a separate environment. Shouldn’t the two be separate and install their own dependencies? This is probably infrequent, but it is likely to be fairly confusing to others as well, IMO.

This is probably “Conflicting environments are handled incorrectly” (Issue #35663, JuliaLang/julia on GitHub).

Seems like it, thanks Rik

Just a note, because it’s unclear to me from the issue whether this matters: in my case I started from a fresh Julia session with nothing loaded, activated sbx, and then tried using MLJBase, which failed. I’ve not read all the comments on the issue you posted, but some of them seem to discuss switching environments within a session that already has packages loaded (which I can understand causing issues).

Ouch you just pressed my rant button! These nested environments are probably the Julia feature I dislike most. I mean, the documentation says:

There are a couple of noteworthy features of this design:
[…]
2. Packages in non-primary environments can end up using incompatible versions of their dependencies even if their own environments are entirely compatible. This can happen when one of their dependencies is shadowed by a version in an earlier environment in the stack (either by graph or path, or both).

To me at least that’s not a “feature”, that’s a bug. I find it crazy that this is done implicitly and by default. The documentation continues with

Since the primary environment is typically the environment of a project you’re working on, while environments later in the stack contain additional tools, this is the right trade-off: it’s better to break your development tools but keep the project working.

No no no. Please stop the madness. I already have enough problems with legitimate bugs in my code! I don’t want undefined behavior as part of the design just because it makes some things convenient!

Also, there’s nothing in the language to enforce the idea that the parent environments are just for developer tools. My guess is there are lots of projects out there that use nested environments for sub-projects.

Maybe worst of all: we like to say “Julia projects are reproducible because the dependencies are recorded in the Project.toml and Manifest.toml” but because of this feature it’s not true. The language should guarantee this by default at least but it doesn’t.

I wish there were no implicit nesting of environments. Inheritance from a parent should be explicit. Here are two ideas:

  • Using Subproject.toml instead of Project.toml could activate inheritance from the closest parent.
  • Project files could support an include directive.

I prefer the first idea but in any case, incompatible dependencies should raise an error. Sounds like something that shouldn’t need to be said.

/rant

:+1:

But if you have correctly defined your project’s dependencies in the Project.toml and Manifest.toml, it should be fine: when you activate and instantiate that project, it will be your primary environment and thus guaranteed (according to the page you linked) to have the correct dependencies.

The full dependency graph of the first environment in a stack is guaranteed to be included intact in the stacked environment including the same versions of all dependencies.

I find it quite nice to be able to have BenchmarkTools/UnicodePlots/Revise and similar packages in a secondary environment, so they are always available but I don’t have to add them to every single package I work on. And I think it is a reasonable step to say that the primary environment should be the one that is guaranteed to be correct, while environments later in the stack might end up using wrong versions. If you want to be sure you add everything to your current environment, but if you want the convenience you can use stacked envs and it seems to work most of the time (I have never had a problem).
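For anyone wanting to see what their own stack looks like, here is a minimal sketch using `LOAD_PATH`, `Base.load_path()` and `Base.active_project()`. The output depends on your setup (for example on `JULIA_LOAD_PATH`), so none is shown; the comments assume the default configuration.

```julia
# The unexpanded stack. With default settings this is ["@", "@v#.#", "@stdlib"]:
# the active project (if any), the default (shared) environment, and the
# standard library.
stack_entries = copy(LOAD_PATH)
println(stack_entries)

# The same stack expanded to actual project files and directories,
# in the order in which Julia searches them.
expanded = Base.load_path()
foreach(println, expanded)

# The primary environment, i.e. the one that `pkg> activate` switched to.
println(Base.active_project())
```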

With that said, I think it is reasonable that incompatible dependencies should raise an error, as you say; better to have an error than strange behavior.

But if you have correctly defined your project’s dependencies in the Project.toml and Manifest.toml it should be fine

If I do this it should be fine… Development tools should be reliable, and this is not.

For example, maybe I intend to add all my dependencies in the project’s TOML. Now let’s say I start an independent REPL to test a function in the latest release of DataFrames. If I forget to do activate --temp, it will be installed in the default environment. Then in any project, if I use DataFrames without add DataFrames, it will work, using the version in the default environment. Oops! Now my project’s dependencies are not tracked properly. And I’m using a package that can use incompatible versions of its dependencies.
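As a side note, `activate --temp` also has a function form, `Pkg.activate(temp=true)`, which can go at the top of a scratch script so it is harder to forget. A minimal sketch (the `Pkg.add` line is commented out because it needs network access, and DataFrames is just the example package from above):

```julia
using Pkg

# Create a throwaway environment in a temporary directory and make it the
# active project, so that experiments do not land in the default environment.
Pkg.activate(temp=true)

# Pkg.add("DataFrames")   # would be recorded in the temporary project only

# Path of the project file that `add` would now write to.
println(Base.active_project())
```

Note that this only changes where `add` writes; the default environment stays on `LOAD_PATH`, so anything installed there can still be loaded from the temporary project.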

Then what about the sub-projects? It sounds like a useful feature, I guess many people use them without realizing that dependency management is broken for this use case (unless you neatly put everything in packages, but that shouldn’t be necessary for correctness).

And even for devtools: are you really OK running code from tools with incompatible dependencies? I don’t want that! It means running code that might not do what it’s supposed to do, then all bets are off.

And I think it is a reasonable step to say that the primary environment should be the one that is guaranteed to be correct

To me even asking “which environment should be guaranteed correct?” is wrong. They should all be correct.

If you want to be sure you add everything to your current environment, but if you want the convenience you can use stacked envs

Let me paraphrase: “If you want correctness, do this. If you want convenience, you have it”. That’s the wrong default, it should be the other way round.

What is not reliable here? If you correctly create your environments, they can be reliably reproduced. If you are not running them with a Manifest.toml, you are only guaranteed your [compat] section in Project.toml, and if you are not running them at the top level of your environment stack, you don’t have specific version guarantees either. But that is documented; that is not how you should use them if you want to recreate them reliably.

I said I thought it would be reasonable that this should error in case of version mismatch, but I think a lot of the time it will be no problem.

But what if this is not possible? If you have package A in your default environment, which depends on an old version of package B, and in the current environment you have added a new version of package B, what should happen when you first use B and then use A? I don’t see a good way to get everything you are asking for.

To me there are a few options:

  • Don’t allow stacked envs since it is pretty impossible to guarantee everything will always work well. Now you have to add dev tools to every local env you work on.
  • Consistent stacked envs, where each time something is installed in an environment, it also has to make sure all versions are compatible with the parent environments. This could maybe be nice, but I also see downsides with it, since the default env may then negatively affect the versions you can use for your package.
  • The current strategy, allow mismatch in versions and prioritize the current environment. This will allow for a current environment that is not constrained by your dev tools, while still allowing you to keep the devtools out of the project environment. There might be version problems occasionally, and this is not nice.
  • As discussed above, the current strategy with the addition that any version mismatch will generate an error. I think this would be a good balance. You can trust that the code that runs is using correct versions, and you have the convenience of stacked envs.

I disagree, “if you want correctness you do it one way, if you want convenience you do it another way” is more what I feel I said. You can have the correctness by adding everything to the current env. Then you will be sure that the version loaded is from the current env and that it is compatible with all the other versions in the current env (as far as one can be certain of that in the current ecosystem, which is probably a bigger problem than what we discuss here).

Why is this an issue, though? At the moment of adding what you need in the local env, Julia can figure out that you already have a compatible version and just use that, so that the add ABC has basically no overhead. Maybe we could have a default_env_setup.jl, similar to startup.jl, which would add the packages you think you’d need in any new env :thinking: (Revise, BenchmarkTools, etc.)

I guess my mental model was that environments were a bit like Python virtual envs, a “separate independent box”; I imagine I’m not alone in this assumption.
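For what it’s worth, a session can be made to behave more like that separate box by removing the default environment from the stack. A minimal sketch, assuming the default `LOAD_PATH` entries; after this, packages that exist only in the default environment (Revise, BenchmarkTools, and so on) can no longer be loaded unless they are also added to the active project, and anything already loaded in the session stays loaded:

```julia
# Remove the default (shared) environment entry from this session's stack.
# "@" (active project) and "@stdlib" (standard library) are left in place.
filter!(entry -> entry != "@v#.#", LOAD_PATH)

println(LOAD_PATH)
```

The same effect can be had for every session by setting the environment variable `JULIA_LOAD_PATH` to `@:@stdlib` before starting Julia (entries are separated by `:` on Linux/macOS and `;` on Windows).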

But in my example I stated that the versions were not compatible, a scenario that could happen. Or am I misunderstanding you?

I mean, that is pretty much how I also see them. But since stacked environments are allowed, you sometimes have two boxes that should be independent but that overlap in a few dependencies, and thus there is the possibility of versioning problems within that overlap. As long as you only use packages from a single environment you should not have any problem; then you only have the single box in scope.

I should also mention that I might sound more confident than I should on this topic, I’m presenting my understanding of environments based mostly on dealing with them and getting them to work for me.

Let’s try to use symbols to make this clearer: call the main environment ME (what you get when you start Julia doing nothing special), and let’s call a specific environment SE (that you activate explicitly). The issue I had was the following:

Let’s assume there’s a package ABC, currently at v0.2, which itself depends on some other package DEF. Now let’s say that ME has package ABC v0.1. If you start a fresh Julia session (so in ME) and do using ABC, you get ABC v0.1 and everything will be fine.

The problem is if you start a new Julia session, activate SE, and add ABC (which will then install v0.2, because in SE there’s no restriction): if you then do using ABC, it might fail, because the version of DEF in ME is possibly incompatible.

This is what happened in OP, and I find this very confusing.

My thinking is that when activating SE in a fresh Julia session, nothing should interfere with that environment. And if you add new packages in it, it should figure out the latest needed dependencies of those packages. If those are already installed in ME, great, reuse that, otherwise install another version. This is not what happens and I don’t understand why.

The tangent about making stuff from ME available to SE by default (things like BenchmarkTools etc.) is a bit of a detail that could be solved with a script similar to startup.jl.

Ahh, that is my bad. I forgot about OP and had @sijo’s post in mind when answering since that is what I responded to initially.

I agree that the behaviour in OP seems wrong. I tried to recreate it, didn’t know the version of MLJBase you had in the main environment so I picked 0.10.1 as an arbitrary test, but it seems to work as expected for me (i.e. the new environment added the latest MLJBase and I could do using without any problem).

Do you have anything in your startup.jl? This could potentially mess with the environment even if you are not importing MLJBase, since you might import a package that has the same dependencies. The deps are then held back by MLJBase and are loaded from the main environment before you change project to your specific environment.

I wouldn’t say this is exactly the same as what was brought up in the issue suggested by @rikh, though maybe it could fit there. If you can create an MWE it probably should be reported, either there or as its own issue.

No, I have nothing other than Revise in my startup.jl.

MWEs for this kind of thing are hard to build because you need to screw up your dependencies in a specific way: the issue is that the main environment had an outdated dependency of MLJBase, not just an outdated MLJBase.

With an arrow to mean “depend on”, let’s say A → B

Main Environment (fresh session):

  • A v0.1 → B v0.1 (if you do status, B does not appear because it’s a dependency, not something explicitly added to the env)

Local Environment (fresh session):

  • A v0.2 (→ B v0.2) → fails because it will try to use B v0.1 from the main environment and that will throw errors.

Note: errors are thrown upon using A, not upon add A (the latter would be annoying but acceptable).

What I would expect is that when adding A v0.2 Julia figures out that it needs B v0.2; it may see that there is B v0.1 in the main environment which is not the latest version and so pulls and installs B v0.2 instead so that it can be accessed from the local env. This is apparently not what happens.

Anyway I think @tim.holy reported some flavour of this (with and without fresh sessions) in a number of places including with ArrayInterface which apparently had a bunch of issues with it so we can probably hope this kind of stuff will go away.

Okay, this is what I can’t reproduce without somehow importing B in the main env before switching.

In Tim’s examples in the linked issue, he explicitly loads ColorTypes in Proj1 before switching to Proj2.

But if you don’t load B in any way before switching it seems odd that the new environment would use that.

Are you sure? I just tried installing a package B at an old version in my main env, and then activating a temp env and installing A, which depends on B, seems to install B at the newest version.

My understanding was that the problem was not the version of B installed in the specific env, but that the version of B from the main env was already loaded in the Julia session; so when A was imported in the specific env and wanted B, Julia says that B is already here, even though it is the wrong version.
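That understanding can be written down as a toy model. This is only a sketch of the behaviour described in this post, not Julia’s actual loader; `Env`, `toy_load!`, and the package names A and B are made up for illustration.

```julia
# Toy model only: this is NOT how Julia's real code loading is implemented.
struct Env
    name::String
    versions::Dict{String,VersionNumber}   # package name => installed version
end

"Return the version that loading `pkg` would give in this toy model."
function toy_load!(loaded::Dict{String,VersionNumber}, stack::Vector{Env}, pkg::String)
    # A package that is already loaded in the session is reused as-is.
    haskey(loaded, pkg) && return loaded[pkg]
    # Otherwise the first environment in the stack that has it wins.
    for env in stack
        if haskey(env.versions, pkg)
            loaded[pkg] = env.versions[pkg]
            return loaded[pkg]
        end
    end
    error("package $pkg not found in any environment of the stack")
end

main = Env("main", Dict("A" => v"0.1.0", "B" => v"0.1.0"))
sbx  = Env("sbx",  Dict("A" => v"0.2.0", "B" => v"0.2.0"))

# Fresh session with sbx active: nothing is loaded yet and sbx comes first.
fresh = Dict{String,VersionNumber}()
fresh_B = toy_load!(fresh, [sbx, main], "B")    # v"0.2.0"

# Session where B was loaded from main before sbx was activated.
stale = Dict{String,VersionNumber}()
toy_load!(stale, [main], "B")
stale_B = toy_load!(stale, [sbx, main], "B")    # still v"0.1.0"
```

In this model the stale session is the only way for A v0.2 to end up with B v0.1, which matches the case where something gets loaded before the switch.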

I tried looking at Tim’s ArrayInterface issues, but felt like they didn’t really give enough information to understand exactly what was going on (at least for me). But as you say, it will maybe be fixed, since similar issues have been popping up and can hopefully be fixed together :slight_smile: