Speculations about the default environment (or a new draft environment)

I would add one more feature, one that would actually guide users toward environments in a nicer way: saving the current temporary environment and the REPL history to a file. For example:

julia> using Draft
  Activating new project at `/tmp/jl_AUMyBc`

julia> @reuse StaticArrays Plots

julia> plot(Tuple.(rand(SVector{2,Float64}) for _ in 1:100))

julia> savefig("test.pdf")

julia> save_project("./my_project")

Now, this save command would create:

  1. the Project.toml file in the my_project directory.
  2. the Manifest.toml file there.
  3. my_project/MyProject.jl, containing the history of the REPL:
using StaticArrays
using Plots
plot(Tuple.(rand(SVector{2,Float64}) for _ in 1:100))
savefig("test.pdf")

(The more closely this file reflects the last state of the REPL, the better, although that could be hard. In any case, the user can then edit that file and move forward in a reproducible state created from the exploratory REPL code, if they decide it is worth keeping.)
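A minimal sketch of what such a save step could do. The name `save_project_sketch` is hypothetical (it is not Draft.jl's actual `save_project`), and it assumes the session history is already available as a vector of strings, which the REPL does not provide out of the box:

```julia
# Hypothetical helper, not part of Draft.jl or Pkg.
# Copies the active environment's TOML files into `dest` and writes the
# given history lines to a script next to them.
function save_project_sketch(dest::AbstractString, history::Vector{String};
                             project_file::AbstractString = Base.active_project(),
                             script_name::AbstractString = "session.jl")
    mkpath(dest)
    src_dir = dirname(project_file)
    for name in ("Project.toml", "Manifest.toml")
        src = joinpath(src_dir, name)
        # A fresh temporary environment may not have written these files yet.
        isfile(src) && cp(src, joinpath(dest, name); force = true)
    end
    script = joinpath(dest, script_name)
    write(script, join(history, "\n") * "\n")
    return script
end
```

Calling it as `save_project_sketch("./my_project", lines; script_name = "MyProject.jl")` would give the layout described above, provided `lines` holds the recorded commands.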

Saving the current REPL session would for sure be a nice feature.
I think only IJulia has this option, but I could be wrong.
The big repl_history.jl file is also not that useful for this, I fear.
Is there a good way to track the commands entered in a specific REPL session?
Maybe a pre/post execution hook to also reset the recording when the environment is changed?
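One possible shape for such a hook, as a rough sketch. The names `SESSION_LOG`, `record_input` and `reset_log!` are made up for illustration. The registration line in the comment relies on `Base.active_repl_backend.ast_transforms`, which is REPL internals rather than a stable public API, so treat it as an assumption that may break between Julia versions:

```julia
# Hypothetical per-session recorder. It stores the parsed expressions the
# REPL is about to evaluate, not the text that was typed.
const SESSION_LOG = Any[]

# An AST transform must return the expression it was given (possibly modified).
function record_input(ast)
    push!(SESSION_LOG, ast)
    return ast
end

# Could be called whenever the active environment changes.
reset_log!() = empty!(SESSION_LOG)

# In an interactive session one could try (internal API, may change):
#     pushfirst!(Base.active_repl_backend.ast_transforms, record_input)
```

Turning the recorded expressions back into source text is lossy (comments and formatting are gone), so this is closer to a starting point than to a real history export.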

Once the Draft package is registered, it will be easier to explore whether it provides a convenient workflow…

This sounds a lot like PlutoPkg.

You can of course install it from the repo directly. I am already finding it a handy tool.

The default Julia REPL should be a UnicodePluto :grinning:

(Even if a joke, that sounds quite interesting as a Julia IDE)

Yes, I very strongly agree with that! I have observed literally dozens and dozens of Julia users here in an academic setting, ranging from beginners to pretty knowledgeable users, and this trips up almost all of them. I mean, not 70%, I mean more like 99% :wink: And I am only exaggerating a very, very tiny bit here.

I think there is also a very easy and simple fix for this: have one shared project (in the Pkg.jl sense, i.e. simply a project that lives in .julia/environments) named v1.7global that is always on the stack of available environments (i.e. essentially what v1.7 currently is). Then change things so that there is always a project active in any Julia session, and if the user doesn’t specify anything, activate the shared env v1.7. Done. I think that would resolve a huge amount of confusion.

The main benefit is that novice users would normally add stuff to the v1.7 project, and if they then activate a different project, that stuff would no longer hang around. If someone wants to make a package always available, then they need to go through extra hoops, which would be much better because that is the much less common scenario. For now they could just activate the v1.7 project, add a dep, and be done, but one could also use syntax like ]add MyPackage --project=@v1.7shared or something like that.

This does not solve the dependency nightmare of the cluttered shared environment. Putting users in a temporary environment solves that and this other confusion.

@giordano/Link: GitHub - StefanKarpinski/Nefarious.jl: all your base are belong to me

Ah. Well, that explains why Julia doesn’t already launch with the julia --project=@. option by default!

A small tweak/security enhancement

What if a message popped up for every new directory-based environment that was run? Something like:

You have requested to start a new Julia session using the environment at
/path/to/current/working/directory

:warning: This does not correspond to any directory on your list of trusted Julia environments.

How would you like to proceed?
— [T]rust: Add to my list of trusted environments, and proceed to the Julia REPL
— [P]roceed to the Julia REPL using this environment only for this session.
— [A]bort (default).

List of trusted environments

  • Could possibly be stored in a file like ~/.julia/trusted
  • Possibly requiring some form of password encryption to add to the list of trusted paths?
  • (Hoping this encryption doesn’t become easily decipherable given the likelihood of having similar repeating patterns… I’m no expert on encryption.)
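
The lookup part of this idea could be sketched roughly as below. Nothing here exists in Julia today: the function name `is_trusted`, the file location and the one-path-per-line format are all assumptions made for illustration, and the password/encryption question is left out entirely.

```julia
# Hypothetical check against a plain-text list of trusted environment paths,
# one absolute path per line (e.g. a file like ~/.julia/trusted).
function is_trusted(env_dir::AbstractString, trusted_file::AbstractString)
    isfile(trusted_file) || return false      # no list yet: nothing is trusted
    target = abspath(env_dir)
    for line in eachline(trusted_file)
        entry = String(strip(line))
        isempty(entry) && continue             # ignore blank lines
        abspath(entry) == target && return true
    end
    return false
end
```

A startup prompt would then only be shown when `is_trusted(pwd(), trusted_file)` returns `false`. The comparison is a plain string match on normalized paths, so symlinks or trailing slashes would need extra handling.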

Sorry. I don’t understand. The way you explain it sounds exactly like how .julia/environments/v1.7 already works… Just that you want there to be a second one called v1.7global for some reason.

I definitely understand observing a large percentage of Julia users navigating environments with an ad-hoc approach. I have experienced that first-hand myself (with my own semi-blind trial-and-error approach). I simply don’t understand how this particular proposal helps. I must be getting snagged somewhere in the explanation.

Agree, but why have a shared global one at all to fulfill this? VS Code already handles it seamlessly if you always have a project file available when opening with julia --project. Want to share project files? Put them in a common folder, or at least give them a common parent. Come back to run some code six months later and there are no problems at all. Only stack development tools.

Exactly. The issue is there is no way to have a shared project file with a non-trivial number of dependencies and not have it constantly break conda-style.

It does not. But it’s a rather simple change with a lot of positive effect. I think switching to your proposed temporary environment will be more complicated to implement, and I also expect it to face more social resistance.

(Personally I don’t mind a cluttered global environment so much. It’s rather simple to understand. And if it becomes a problem you can learn about temporary environments or Draft.jl or whatever other nice things we come up with. The conflation of a global default and global shared environment is much more unfortunate and unexpected IMO.)

The difference is that currently v1.7 is always available, even if you activate some other project. With my proposal, v1.7 would no longer be active if you activate some other project.
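
In terms of Julia's load path, the difference can be sketched as follows. The default `LOAD_PATH` is `["@", "@v#.#", "@stdlib"]`, where `@` stands for the active project and `@v#.#` for the versioned default environment. The function name `isolated_load_path` is hypothetical; it only illustrates what the stack would look like under the proposal once another project is activated.

```julia
# Hypothetical illustration: drop the versioned default environment
# ("@v#.#") from a load path, leaving the active project and the stdlibs.
isolated_load_path(load_path = LOAD_PATH) = filter(!=("@v#.#"), load_path)

default_stack  = ["@", "@v#.#", "@stdlib"]          # current behaviour
proposed_stack = isolated_load_path(default_stack)  # ["@", "@stdlib"]
```

A similar effect can already be had per session today by editing `LOAD_PATH` or setting the `JULIA_LOAD_PATH` environment variable; the proposal is about making that the behaviour whenever a project is activated.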

Yes, I totally agree. There is absolutely nothing surprising about that part and it is also something that is not bothering me at all. But this business of stacked environments, where by default most users add stuff to a project that is always active, is something that almost all newish users don’t understand, and it is causing a lot of grief. In particular because it really torpedoes the workflow of having a separate project in a local folder. I can’t even tell you how many times someone sends me a folder with a project and some code, and they think everything is self-contained, but then it turns out it isn’t because they are using some package that is in their v1.7 project, but not in their local project, and they never notice that because things of course run just fine on their machine.

Maybe the packages that are used by environment stacking can be shown transparently in Pkg.status(). This could help in some of those cases.

(jl_m7Mksj) [offline] pkg> status
      Status `C:\Temp\jl_m7Mksj\Project.toml`
  [34234223] 10xStdlibs (from julia 1.7.1)
  [91a5bcdd] Plots v1.27.4 (from environment @v1.7)
  [1986cc42] Unitful v1.11.0
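
Until something like that exists in `Pkg.status()`, a rough approximation can be written by hand. The sketch below uses a hypothetical helper, `providing_project`, that walks the expanded load path (`Base.load_path()`) and reports the first project file that lists a package as a direct dependency. It only looks at `[deps]` in each Project.toml, so it is a simplification of what Julia's loader really does.

```julia
using TOML

# Hypothetical helper: return the first project file in `projects` whose
# [deps] section lists `pkg`, or `nothing` if none does.
function providing_project(pkg::AbstractString, projects = Base.load_path())
    for path in projects
        isfile(path) || continue   # "@stdlib" expands to a directory; skip it
        deps = get(TOML.parsefile(path), "deps", Dict{String,Any}())
        haskey(deps, pkg) && return path
    end
    return nothing
end
```

If `providing_project("Plots")` returns a path under `.julia/environments` while a local project is active, the local project is not self-contained with respect to Plots.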

Would the post-receive or pre-receive hooks work for this?

No, from what I read those are serverside and trigger on push.

A different solution to this would be to make loading from the load path disallowed when running a script in an active project. That introduces a deviation between the REPL and the script, though, which might not be ideal. Deactivating @v#.# when there’s an active project is an interesting approach. It doesn’t entirely disallow making a script that’s not self-contained since what you’re calling @v#.#-global would still be accessible from a script. You’re really relying on the assumption that only people who know what they’re doing would put anything in @v#.#-global at all and that they would presumably avoid depending on things in there from their scripts. If you really want to enforce that scripts in active projects are standalone, then you can’t allow even that.

I understand where you’re coming from here, but I’ll try to explain why this proposed approach is not a good one.

When you say “the most recent version available (without fetching any info from the web!)” you’re glossing over a lot of complexity. Picking a compatible, working set of versions is one of the hardest and most important things that a package manager does for you and in this scheme it cannot do that. Taken at face value your proposal means that we choose literally the most recent version that is already installed. Which means completely ignoring compatibility with packages that are already loaded. This means that people would constantly end up with broken combinations of package versions that are known not to work together. That would be a very bad user experience.

Ok, so what if we try to avoid that? We could try to pick the most recent version of Package that is not incompatible with any package versions that we’ve already loaded. That might work a bit better, but it amounts to doing package version resolution—a notoriously hard problem—on the fly with a greedy algorithm that cannot backtrack, since you cannot unload an already loaded package version. That’s very likely to frequently back users into corners where they want to load some package but there is no version that’s possible.

There’s also the issue that it’s not great to put so much complex logic on the critical path of package loading. Julia already does quite a lot here by parsing the manifest and using the contents of the manifest to find the right versions of packages to load. But it was carefully designed to be fairly minimal and straightforward and not depend too much on the contents of the rest of your hard drive. The steps to load a package are:

  1. Parse the active manifest
  2. Find the stanza for the current module
  3. Look up the UUID of Package in that stanza
  4. Find the stanza for Package’s UUID
  5. Look up the version in that stanza
  6. Find the fixed version path in a depot.

By step five you have the name, UUID and content-hash (or explicit path for dev’d packages), so you have the exact relative path inside of a depot that the right package version will be found at and you just have to look in each depot at that known location. The total work required to load a package is parsing one TOML file, doing some reasoning based on its content, and then looking for a fixed path in each depot (and it’s usually in the first one).
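
To make the lookup concrete, here is a simplified sketch of the manifest part of those steps using the TOML standard library. It is illustrative only: the real logic lives in Base and handles many more cases (per-module dependency tables, duplicate names, stdlibs). The helper name `manifest_entry` is hypothetical, and the tree hash in the sample manifest is a placeholder, not a real value.

```julia
using TOML

# Hypothetical helper: find the stanza for `name` in a parsed manifest and
# return the fields that pin down which installed copy gets loaded.
function manifest_entry(manifest::AbstractDict, name::AbstractString)
    # Manifest format 2.0 nests stanzas under "deps"; older manifests keep
    # them at the top level.
    stanzas = get(manifest, "deps", manifest)
    entries = get(stanzas, name, nothing)
    entries === nothing && return nothing
    entry = first(entries)   # assumes only one package with this name
    return (uuid      = entry["uuid"],
            version   = get(entry, "version", nothing),
            tree_hash = get(entry, "git-tree-sha1", nothing),
            path      = get(entry, "path", nothing))   # set for dev'd packages
end

sample = TOML.parse("""
manifest_format = "2.0"

[[deps.Example]]
git-tree-sha1 = "0000000000000000000000000000000000000000"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.3"
""")

entry = manifest_entry(sample, "Example")
```

Everything needed comes from one parsed file; no other state on disk is consulted until the final path lookup in the depots.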

In what you’re proposing, on the other hand, the loading code would have to look through every depot for every installed version of Package and parse the project TOML file in each one to extract a version number. It would then sort the version numbers and pick the latest one. If we want to not load known incompatible versions of packages, it’s worse: then we need to load all registries and parse the Compat TOML files for Package and for every package that we’ve loaded so far to make sure that they are compatible. That’s a completely over the top amount of complex work to do at package load time and it depends on all sorts of state all over your system.
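
For comparison, even the simplest form of "pick the latest installed version" looks roughly like the sketch below. The helper name `latest_installed` is hypothetical. It assumes the usual depot layout of `packages/<Name>/<slug>/Project.toml` and ignores compatibility entirely, so it shows only the cheaper half of the work described above.

```julia
using TOML

# Hypothetical helper: scan every depot for installed copies of `name`,
# parse each Project.toml, and return the highest version found.
function latest_installed(name::AbstractString, depots = DEPOT_PATH)
    versions = VersionNumber[]
    for depot in depots
        pkgdir = joinpath(depot, "packages", name)
        isdir(pkgdir) || continue
        for slug in readdir(pkgdir)
            proj = joinpath(pkgdir, slug, "Project.toml")
            isfile(proj) || continue
            v = get(TOML.parsefile(proj), "version", nothing)
            v === nothing || push!(versions, VersionNumber(v))
        end
    end
    return isempty(versions) ? nothing : maximum(versions)
end
```

That is one directory listing per depot plus one TOML parse per installed version, for every package being loaded, before any compatibility checking has happened.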

One of the hard design constraints that Pkg had to satisfy was that package loading has to be fairly simple. This is in conflict with the fact that picking a compatible set of package versions is very complex. The solution Pkg uses is to separate manifest generation from package loading: manifest generation is done by Pkg and can be as complex as it needs to be; once a manifest is generated, however, loading packages based on it is straightforward. Picking package versions at load time, as you’re proposing, throws that clean separation out the window and forces us to solve a really complex problem (version resolution) at a time when we really don’t want to be solving that kind of problem (during package loading).

Stepping back, I think that the objection to a persistent global environment is quite poorly motivated. The entire objection appears to be that it can get “bloated”. But why does that matter? Why is bloat a problem? Are you concerned about disk space? Compatibility problems? What’s the concern? Presumably your concern isn’t disk space since your proposal doesn’t address disk space at all as it doesn’t lead to fewer package versions being installed. So the issue must be compatibility issues? But as pointed out above, it’s far worse at dealing with compatibility problems than what we currently have. If the latest versions of all packages you want to use happen to be compatible with each other then your proposal does work nicely; but in that case there’s also no problem with having them all in a global environment. So can you give clearer motivation for why having a persistent global environment is a problem?

[Note: I made some substantial edits a few minutes after posting to improve clarity]

Thanks for the feedback!

How is this different from what may happen when loading a package available in the @v1.7 shared environment when using another environment? Shouldn’t the possible issues be exactly the same? Sincere question, here, is there any difference? My current impression is that the package available on the shared environment is just loaded for good.

(which is the reason avoiding this is also suggested there, and the cause of the other usability issues that David is more concerned about).

Well, my experience is that:

  1. After a while, if there are too many packages there, installing or updating anything starts to become very slow. Some packages are simply impossible to keep available in a shared environment (CUDA, in particular), because having it there implies that I may get an update of it at some point and have to wait several minutes before I can work on what I wanted.

  2. By default, installing anything there may trigger a cascade of updates of everything. Also very inconvenient if the purpose is to do some quick exploratory code.

  3. From time to time, one of these packages fails to keep its Compat entry up to date, and then it holds back the updating of other packages, finally leading to a compatibility roadblock. Finding which package is holding the others back is not that straightforward.

These points make the current default environment an inadequate place to write exploratory code, follow package tutorials, execute example scripts, etc. Thus, my current “exploratory” development mode involves 1) starting a temporary environment; 2) using Pkg.offline(); 3) add-ing the packages I need there; 4) using those packages and, finally, doing what I need to do.
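
Those steps can be bundled into a small helper, for example in `startup.jl`. This is only a sketch: the name `start_draft` is made up, while `Pkg.activate(temp = true)`, `Pkg.offline` and `Pkg.add` are the existing Pkg functions. In offline mode `Pkg.add` can only use package versions that are already installed in a depot.

```julia
using Pkg

# Hypothetical convenience wrapper around the manual steps:
# temporary environment, offline mode, then add the requested packages.
function start_draft(pkgs::Vector{String} = String[])
    Pkg.activate(; temp = true)   # same as `pkg> activate --temp`
    Pkg.offline(true)             # resolve using already-installed versions only
    isempty(pkgs) || Pkg.add(pkgs)
    return Base.active_project()  # path of the temporary Project.toml
end
```

After `start_draft(["StaticArrays", "Plots"])`, a plain `using StaticArrays, Plots` works as in the default environment, but nothing is written to `@v1.7`.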

That has worked very well, except for the steps needed, instead of just using the packages installed in the @v1.7 and coding, which was what I used to do before facing the problems with the bloated shared environment.

So, instead of having new Julia users (not hard-core developers) facing those problems, it seemed to me that a default temporary environment could be safer and less error-prone. With that they could follow tutorials, test examples, etc., without worrying about any long-term consequences for their Julia installation.

Because my workflow of writing exploratory code in a temporary and offline environment has worked mostly issue free, I thought that having that as a more standard option (of course pushing it to be the standard is way out of my scope) could be nice.

In my dreams, having a

julia --draft

startup with more or less those properties (and with the possibility of saving the history and environment at some point!) would be a great usability feature.

But no more than that, I don’t think that Julia will vanish without that :slight_smile:
