Speculations about the default environment (or a new draft environment)

no

C:\Users\lenovo>julia -q --project=$(mktemp -d)
ERROR: unknown option `-d`

C:\Users\lenovo>julia -q --project=$(mktemp)
($(mktemp)) pkg>

There is kind of a mechanism in PowerShell,

but after reading a couple of pages of the PowerShell documentation one simply installs WSL.

Although the simplest way is to install Git Bash:

lenovo@DESKTOP-T0CN7A8 MINGW64 /
$ julia -q --project=$(mktemp -d)
(tmp.0H0euRWAq2) pkg>

You know the world is ending when your recommendation is “use BASH”

1 Like

Is the syntax for command substitution even the same? Genuine question, I have no idea

Describe a shell to someone who has never used a shell before using Google translate English-Mandarin-English and that’s how Powershell was created

6 Likes

I still don’t really get this. It sounds like you should be installing CUDA packages, or whatever’s causing problems, in project-local environments. I’ve occasionally accumulated a bunch of stuff in my global environment, and the solution is… to delete some stuff. Problem solved.

If installing something new triggers updates of existing things then that means it wasn’t possible to install the new thing without newer versions of the other packages, which suggests that you’d have to install those other new things no matter what. Unless I’m missing something, the fact that you’re in a global environment with other things doesn’t make this worse.

1 Like

We know! But that is not the workflow indicated by any package user guide…

Let me try to exemplify: I have a package whose purpose is to do some analysis of molecular dynamics simulations. The users of that package are not expected to know Julia. They will just follow a tutorial, changing the trajectory and configuration files on their own. Thus, the workflow consists of: 1) install my package; 2) install Plots and some other stuff; 3) follow the tutorial.

All very nice, no issues there. But as this user starts to use other packages, and each of these suggests a similar workflow, soon enough he/she will have a bloated default environment and errors start to happen.

Then this person comes to the forum (maybe) and gets the answer: “You shouldn’t be installing all this stuff in your global environment!”

And that is the first time in his/her life that he/she has heard of environments. Cleaning things up is not that simple (which packages to clean? which packages are installed? which are dependencies of others?), leading to a frustrating choice between: 1) deleting everything and reinstalling Julia from scratch to start over; or 2) learning to use environments, temporary environments, etc.
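(For reference, the mechanics are not the hard part; something roughly like the following works, with the package name below being just a placeholder. The new user still has to decide what to remove:)

(@v1.7) pkg> status              # lists only the packages added directly to the default environment
(@v1.7) pkg> rm SomeOldPackage   # removes it; indirect dependencies it alone pulled in drop out of the manifest
julia> import Pkg; Pkg.gc()      # frees disk space used by versions no longer referenced by any environment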

I have gone through this, and I’m still here because, as a package developer, the pros of Julia in general (and of environments in particular) by far outweigh this not so clean experience. But less involved people may not.

This is why I insist that the user should be exposed to a temporary environment by default. The only problem there is the usability cost of having to reinstall packages every time.

4 Likes

I can see this but… every programming language package system ever until quite recently just installed everything into a single shared global location and it was fine and it’s still the default in most of them. Not ideal, but totally usable. So I have a hard time believing that it’s so horrible as a new user experience. Sure, after doing a few tutorials they may have a few dozen things installed, at which point, it seems like they know enough Julia that someone can tell them about environments without blowing their minds.

I do think that adding an easier, cross-platform way to start Julia in a temporary environment would be good though. Spelling it as julia --project=@temp would be preferable to me. That way you could also use the special name @temp in the JULIA_LOAD_PATH variable and get temporary environments there.
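None of that exists yet, to be clear; the closest things today are the --temp flag in the Pkg REPL or the keyword form:

pkg> activate --temp
julia> import Pkg; Pkg.activate(temp=true)

With the hypothetical @temp name you could additionally write something like JULIA_LOAD_PATH="@:@temp:@stdlib", which Julia does not currently understand.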

10 Likes

No, not so horrible, we are talking about a refinement :slight_smile:

To tell the truth, I have little experience with other programming languages’ package managers. My brief experience with Python’s (there are several) is horrible (much worse than Julia’s, for sure). Thus, I cannot claim there are much better alternatives out there. One thing that is different is that in other languages the installation and the usage of packages are more decoupled operations. I can start a Python session and do exploratory coding, plots, etc., with the packages already installed, and I won’t be surprised by an update of every package I have installed just because I tried to install another one. These “general updates” are less frequent in other languages, because there is less code reuse and, of course, packages are many times monolithic blocks.

Edit: @StefanKarpinski as I mentioned somewhere: my goal when speculating about this is not to “demand” anything, or to claim that my “solution” is good enough, or to dismiss criticism just for the sake of it. I don’t even necessarily expect this level of attention to a post like this (the fact that Julia has this level of community engagement, though, is one of its greatest virtues and is almost unbelievable). My hope when writing this is just to maybe bring a new point of view about a usability issue I think can perhaps be improved. And, maybe, by doing this, someone with the proper skills can find a way (similar or not to what I originally proposed) to improve the state of things a bit further. Maybe, and even likely, the complications and tradeoffs are of another order which I don’t experience in my everyday workflow. And that is fine as well.

5 Likes

I may be missing something monumental here, but isn’t this complaint solved by starting a tutorial in its own environment and telling users there that that’s the way to go in Julia? It’s no less arbitrary than “just put some file somewhere and do some conda incantation”. It’s a sort of ritual, if you wish. If they’re not expected to know Julia, it’s IMO on the tutorial writer to explain a workflow that will lead to the fewest problems down the road, and saying “Step 1: enter ] activate @my_tutorial into the REPL” seems to me to be the way with the least friction down the road.
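Concretely, the tutorial’s opening block could read roughly like this (package names are placeholders for whatever the tutorial actually teaches):

pkg> activate @my_tutorial        # creates/uses a named shared environment under ~/.julia/environments/my_tutorial
pkg> add MyAnalysisPackage Plots  # installs there instead of into (@v1.7)
julia> using MyAnalysisPackage, Plots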

3 Likes

That is certainly one alternative, if we all agree to start our user guides like that. But for the moment, nobody does it, despite the fact that most people seem to agree that that is the correct workflow.

Concerning my everyday life, here I learned that with a juliadraft alias that runs:

julia -i -t auto -e "import Pkg; Pkg.activate(temp=true); Pkg.offline()"

or with the Draft.jl package that exports the handy @reuse macro:

julia -i -t auto -e "using Draft"
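For reference, in a bash-style startup file the aliases would be roughly the following (the second name is just an example, and Draft.jl is assumed to be installed in the default environment):

alias juliadraft='julia -i -t auto -e "import Pkg; Pkg.activate(temp=true); Pkg.offline()"'
alias juliadraft2='julia -i -t auto -e "using Draft"'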

I get quickly in a state where I can play around more or less in the way I like to. That, for me, is useful already, and all the feedback and discussion was instructive.

I don’t think I have anything new to add to the discussion. Thank you all!

2 Likes

It was/is not fine! For example, Python/Conda is pure hell because of this.

The only difference relative to Julia is that, since the packages are sometimes more mature, they cause slightly fewer regressions when deliberately or inadvertently upgrading packages in the shared dependency graph. But the chance that you write a Python script on your computer that works at some point in time, come back to it a few months later, and find it no longer does is still very high.

The only sane way to use python/conda is a virtual environment (i.e., the same thing as a local project file) - but they make it especially difficult to do that. Even then it is a miserable experience relative to Julia.

Or just make it natural for them to have an environment from the beginning, so they don’t even realize it. I.e., make julia --project do the equivalent of ] activate . if it can’t find one. They do a using Plots and it installs it and adds it to the project file easily. The mental model for users is simple: “when I start up Julia in a folder for the first time I add the packages I need, and if I previously used them on this computer it will generally be fast”.

You don’t need to explain environments to them at all; just say that, when starting out (and probably forever…), they should have a project file in their folder or in one of its parents, which will list the packages they use.

The alternative (e.g., explaining how to nuke a big shared environment when it gets cluttered, or explaining to users why a script they wrote last week suddenly doesn’t work because a package used in another script had a downstream dependency they have never heard of, or explaining why sometimes it is in the shared (v1.7) vs. a project environment, etc.) is much more difficult than just saying “always have a project file until you know what you are doing”.
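(For reference, the closest thing today is passing an explicit path, which activates the folder even before a Project.toml exists; the file gets created on the first add. The folder name below is just an example:)

cd my_analysis
julia --project=.      # activates this folder's environment, even if no Project.toml exists yet
pkg> add Plots         # creates Project.toml and Manifest.toml here and installs into them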

To make sure I understand, this is for the use case where someone is just tinkering in the REPL and not working local to a particular file or notebook. In the latter case, the temporary environment should just be a normal project file in their folder, right?

If so, then I can see this as potentially useful, but it seems like an advanced use case and not for introductory usage.

Bingo. And if they start it with julia --project in a folder where it doesn’t find one, then it should create one automatically, because they wanted a “project”. Jupyter could do that as well. In practice, new users will always be working with notebook or .jl files local to some folder. And people writing notebooks for a tutorial should probably ship them with a Project.toml (and even a Manifest.toml if you are kind) where things actually work and are tested, so the user can learn to do ] instantiate after opening them the first time, e.g. https://github.com/QuantEcon/lecture-julia.notebooks/blob/main/Manifest.toml

1 Like

As an example: GitHub - QuantEcon/lecture-julia.notebooks: Download Notebooks for julia.quantecon.org .

Users are told to clone the repo, then run julia --project to start it in that folder (or, better yet, just open it in VS Code or Jupyter, which does that automatically), and then the entire setup is to call ] instantiate. That is it, and it is 100% reproducible because of the Manifest. Without the manifest it would still be easy enough.
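In other words, the whole setup the lecture notes ask for boils down to roughly:

git clone https://github.com/QuantEcon/lecture-julia.notebooks
cd lecture-julia.notebooks
julia --project        # picks up the Project.toml/Manifest.toml shipped in the repo
pkg> instantiate       # installs exactly the versions recorded in the manifest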

Note that this is what Binder does as well, so if it works for Binder it works locally too.

5 Likes

This is what Pluto does by default. In my dreams, where I have the time and the skills, I would start the UnicodePluto project.

2 Likes

That is the argument I have been waiting for: sounds like a reasonable default.

Edit: I’d even go one step further: a julia --file option that associates a file.toml with the script (only halfway joking, having a local environment with 80 packages like a newbie ;)

1 Like

The only problem right now is that this only works if the user consciously knows whether a Project.toml file is there, because of the behavior of julia --project. If it is there (or in a parent), then everything is intuitive. Otherwise they need to know to do ] activate . to start creating one… and because they don’t know that, they start working off of the (v1.7) by default. Easy to fix, though. I think 90% of new-user problems would be fixed by encouraging them to always use a project file, and making the default --project behavior create one would make it very seamless.

I gave this a lot of thought and tried to follow all the arguments in their various directions. If I’m reading this right, the problem is some of us find the current setup unsatisfactory and the solution is to use environments. Separately, the speculation is that new users will get caught using the default environment, and a speculation based on that speculation is that they will be perturbed by issues stemming from a single environment.

I ran a single environment for years with no issues, except once when I pinned a package at a specific version. Beyond that, there is no real consensus (beyond this thread perhaps) that the current setup is a real “problem” in the sense that it needs to be dealt with at the language level. And I feel like environments are explained well enough that if new users just read the docs (like all good coders do, right?) they will know enough to start using environments just like we all did. (Problem avoided?)

I would not want to mess with the simplicity of the status quo for an issue that not everyone may see as a problem.

3 Likes

I don’t think this is true in practice.

Consider two different scripts, written at different times and using different packages, but sharing a single big environment. If I install a new package for my new project it might break my old one. Not just direct dependencies either… the typical issue is downstream dependencies between the two projects overlapping (e.g., CSV.jl gets upgraded as a dependency of my new package even though I don’t actually use it directly there, but a bug in the new CSV release breaks my other script, which directly used CSV.jl… Not that this happened recently :slight_smile: )

If you have two project/manifest setups for the two different scripts, because they are in different folders, then upgrading one never silently breaks the other, because package operations in one never have side effects on the other.
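A rough sketch of that isolation (folder and package names are just examples):

cd ~/old_script && julia --project=. -e 'using Pkg; Pkg.status()'               # old manifest, untouched
cd ~/new_script && julia --project=. -e 'using Pkg; Pkg.add("SomeNewPackage")'  # only this folder's manifest changes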

And as the shared environment gets bigger and bigger, the DAG of dependencies gets bigger and bigger as well, which makes it increasingly likely that any package operation will have to change a lot of things to maintain [compat] bounds. This is inherent to the version-resolution problem and nothing Julia is doing wrong, but it makes shared environments with lots of packages inherently fragile (whether in Conda or Julia is irrelevant, and also whether it is the (v1.7) or a new (v1.7.global) or whatever).

1 Like

I believe that at least Pkg should ask us if we really want to pollute the global environment (for a couple of tools I’d answer yes) and remind us that the better alternative is to use some other kind of environment (to be decided).

Well, I do: Examples_1p · KiteModels.jl

9 Likes

How about whenever you Pkg.add something to the default v#.# environment and it can’t be installed at the latest version, a little hint gets printed suggesting a different project, with a link to the relevant bit of the docs?

This would on-board people to the idea of projects exactly at the point where it becomes relevant.

9 Likes

IMO, the best model for a package manager is Nix.

In Nix, packages are defined completely by their inputs (content, dependencies, name, version number, code). They are indexed by a hash of all their inputs, not merely their semantic version number.

Nix keeps multiple versions of packages installed, has every package explicitly declare its dependencies, and garbage-collects unused packages. This prevents upgrades from breaking other packages while still maintaining a nice, clean global package environment.

Updates and installs either fail and do nothing or succeed completely; there are no partial installs. It is also very easy to roll back to a previous configuration.

At this point, a full redesign of Pkg would be a waste of time. Still, perhaps we could borrow some ideas from Nix, like indexing packages by a hash so they depend on all inputs, installing multiple versions of a package while garbage-collecting old ones, etc.

2 Likes