JULIA_DEPOT_PATH and JULIA_PROJECT

I am trying to automatically pick and choose which environment (and ultimately the packages therein) to use. For the purpose of testing only, I have ~/.julia and ~/.julia_dev directories with packages and environments defined in them. Now, I want to seamlessly switch between the two. My understanding was that if I set Julia’s JULIA_PROJECT environment variable, I could switch to either one of the environments and use the corresponding packages defined in that environment. (I understand that the flag --project=/path/to/some/folder takes precedence over this env variable, but since I am not using the --project flag anywhere, this does not concern me.) To test this, I did the following:

$ export JULIA_PROJECT=~/.julia_dev/environments/v1.6
$ julia
julia> using somePackage
julia> println(pathof(somePackage))

This printed the path of the package in ~/.julia rather than in ~/.julia_dev, even though I am explicitly telling julia to use the project environment from ~/.julia_dev.

I also tried:

using Pkg
Pkg.activate(expanduser("~/.julia_dev/environments/v1.6"))

using somePackage
pathof(somePackage)

This returned the same output as above.

Next, I tried:
$ export JULIA_DEPOT_PATH=~/.julia_dev
$ export JULIA_PROJECT=~/.julia_dev/environments/v1.6
Running the commands above again finally showed me the correct path of the package I was using (that is, ~/.julia_dev/path/to/some/package).

Does this mean that even when one defines JULIA_PROJECT, JULIA_DEPOT_PATH will ultimately dictate where the package is loaded from? (I know that the first entry of DEPOT_PATH is by default the ~/.julia folder when I type it in the julia REPL. If I export JULIA_DEPOT_PATH, obviously DEPOT_PATH will be set to whatever I’ve defined.)

Obviously, I find this whole episode/experiment extremely confusing coming from C++ and Python, where all one needs to set are PATH (for binaries), LD_LIBRARY_PATH (for dynamic libraries at run time) and PYTHONPATH (for python related dependencies) to use the correct libraries/packages.

I don’t quite understand the point of JULIA_PROJECT, if JULIA_DEPOT_PATH ultimately dictates where packages are loaded/used from.
Do we really need to set these two variables if we want to use the correct packages for the selected environment? Thanks for any help.

[PS I use julia within vs code (using julia extension). If someone knows how to set the julia’s environment correctly in vscode, I’d appreciate that very much. Thank you!]

Hey apologies for possibly the obvious question, but why are you doing package management this way?
Are you trying to have two “global” julia environments or whenever you switch to a directory via a REPL session, the appropriate environment gets activated?
Or is this more of a VSCode scenario?

This was just an experiment to understand how Julia uses environments and which packages get loaded when certain environments are activated. It looks like simply activating an environment does not really load the correct package, which is a bit confusing. I am sure I am missing something here. While this was an experiment, it should be quite possible to have julia packages live wherever the user wants and to seamlessly switch between packages and environments as they see fit.

I just wanted to have a full and complete control over all my packages, how they are loaded and how we can select environment directly from the command line since I don’t see how one can pick and choose environment and packages other than using REPL(which is something I want to avoid). Using command line to pick and choose environment and packages makes things configurable and hopefully ease things a bit in vscode (or any good IDE for that matter), where one can define these environment variables in launch configuration.

I think you are misunderstanding the way julia packages work.

JULIA_DEPOT_PATH is where the package manager stores downloaded packages. Except for the “scratch spaces”, the packages are just a replica of the source code. So the user only needs to specify which packages, and which versions of them, they want to use. This is exactly what is recorded in a julia project, namely in the Manifest.toml. Also, to allow updates to these packages in a controllable manner, the Project.toml records the version constraints. The exact versions and the constraints together make up a reproducible julia project. These two files are managed by the package manager and selected with the Pkg.activate command; a default pair of them in JULIA_DEPOT_PATH is used if you don’t activate one explicitly or with JULIA_PROJECT.
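
A minimal sketch for inspecting both settings from inside a running session. It only reads state and assumes nothing beyond a stock Julia install; `Base.active_project` and `Base.load_path` are unexported Base functions, so treat their exact output format as something to check on your own version.

```julia
# Where package sources are *stored*: the depot stack (set via JULIA_DEPOT_PATH).
println("Depots:         ", DEPOT_PATH)

# Which packages and versions are *selected*: the active project
# (set via JULIA_PROJECT, --project or Pkg.activate).
println("Active project: ", Base.active_project())

# Environments consulted by `using`/`import`, in search order.
println("Load path:      ", Base.load_path())
```

Running this once with only JULIA_PROJECT changed and once with only JULIA_DEPOT_PATH changed makes it visible that the two variables answer different questions.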

I see you are replicating the package storage (pathof shows where the package source is located on your computer). This is usually not necessary and might waste hard drive space.

As for your confusion: the C linker does version management by using the exact version if one is specified, and the default version otherwise. With this mechanism, switching the package repo can influence the choice of libraries by changing the default one. However, julia always uses the exact versions in Manifest.toml, and resolves one set of versions from the constraints specified in Project.toml if Manifest.toml is not available. As a result, switching the package storage shouldn’t change anything.


Thanks for your note and clarification.

As for the replication, yes it is a replication but you can’t avoid replication when you intend to have working/production module/code separate from development versions of the same package. Think of julia and julia_dev as two folders with custom modules (tiny modules, memory is not an issue).

What I am still not clear about from your explanation is why, despite selecting the environment from the development code, pathof(package) indicated the production code path located in the .julia folder.

If I modify the source code of the package in development, how can I be sure that it will be the one used when pathof(package) indicates otherwise?

I am particularly interested in the case when one modifies/edits the source code in development folder. ( in the example I presented above). While project.toml and manifest.toml might have the version number held fixed, but if the source code of that module is modified then simply activating the environment is not enough because environment simply looks for version number defined in toml files. In such a case, does one also need to export JULIA_DEPOT_PATH variables so that any modifications in the source code/package is correctly propagated?

The answer is that you clone the code you would like to edit, include it as a “direct” dependency out of version control with Pkg.develop. The package storage is left immutable.

The package manager has one recommended way to add packages, by specifying the package name (UUID) and version, and a copy of the package will be downloaded into JULIA_DEPOT_PATH. But you can also add packages from GitHub or a local git repository, or add a package from a local directory. Adding from git will track the package version by which commit you are using, but will not add its dependency. Adding a package from a local directory only adds it to the “search path” and is ideal for development.

Unless you intend to modify a lot of packages, you can just add a specific location to Pkg as in-development source and start testing.
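
A minimal sketch of tracking a package from a local directory, assuming a throwaway environment and a freshly generated skeleton package. The name `HypotheticalLocalPkg` and both temporary directories are made up for illustration; in practice the path would point at your own clone.

```julia
using Pkg

# Throwaway environment standing in for a development project.
env_dir = mktempdir()
Pkg.activate(env_dir)

# Generate a skeleton package outside of any depot (hypothetical name).
pkg_dir = joinpath(mktempdir(), "HypotheticalLocalPkg")
Pkg.generate(pkg_dir)

# Track the package by path; nothing is copied into the depot.
Pkg.develop(path=pkg_dir)

# Entry file that `using HypotheticalLocalPkg` would load in this environment.
println(Base.find_package("HypotheticalLocalPkg"))
```

The printed path should point into the generated directory rather than into `~/.julia/packages`, so edits to the files there are what gets loaded.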


Three questions about the same topic:

and here.

The two others necroed old topics.
Please, OP, this is not the way to ask, it wastes our time, because e.g. I started in the first topic I saw, and afterwards I found the other questions with good answers.

I’m not sure where to start trying to untangle this but I think the most important points are:

  1. You don’t need multiple depots for your use case.
  2. Never modify the code loaded by a package you have Pkg.added.
  3. If you want to develop a package, use Pkg.develop instead of Pkg.add. If you give a path to Pkg.develop it can be located wherever you want, unrelated to the path of your depot.

To expand the answer, consider the following session:

First make a new environment and Pkg.add the Example package.

$ mkdir /tmp/production
$ export JULIA_PROJECT=/tmp/production
$ julia -e 'using Pkg; Pkg.add("Example")'          
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `/tmp/production/Project.toml`
  [7876af07] + Example v0.5.3
    Updating `/tmp/production/Manifest.toml`
  [7876af07] + Example v0.5.3

This creates two files in the environment.

$ ls /tmp/production
Manifest.toml  Project.toml
$ cat /tmp/production/Project.toml                            
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
$ cat /tmp/production/Manifest.toml 
# This file is machine-generated - editing it directly is not advised

julia_version = "1.7.0"
manifest_format = "2.0"

[[deps.Example]]
git-tree-sha1 = "46e44e869b4d90b96bd8ed1fdcf32244fddfb6cc"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.3"

Project.toml only records the identity of the Example package. (Potentially there can exist lots of Example packages but the uuid is unique.)
Manifest.toml records a bit more information, specifically the version number and, importantly, the git-tree-sha1. The latter is a content hash of the package and is what really identifies the specific version of the package used in this environment.
We can check with pathof where this package is loaded from.

$ julia -e 'using Example; println(pathof(Example))'
/home/gunnar/.julia/packages/Example/aqsx3/src/Example.jl

Yes, indeed, it’s stored in your depot in ~/.julia. That’s not important. The aqsx3 slug is computed from the content hash (and some other information, I’m no expert on the full details) and you are not going to modify anything in this location. You can use a variety of versions of this package from different environments. They are all going to be stored in your ~/.julia depot but they are not going to clash. Just don’t modify anything in ~/.julia/packages.
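
A small sketch for listing which versions of a package currently sit side by side in the depots. It assumes the `packages/<Name>/<slug>` layout shown in the path above; the helper name `installed_slugs` is made up for this example.

```julia
# Collect every installed copy of a package across all depots.
# Each copy lives in its own slug directory, so versions do not clash.
function installed_slugs(pkgname::AbstractString)
    found = String[]
    for depot in DEPOT_PATH
        dir = joinpath(depot, "packages", pkgname)
        isdir(dir) || continue            # this depot has no copy of the package
        append!(found, readdir(dir; join=true))
    end
    return found
end

foreach(println, installed_slugs("Example"))
```

This only reads directory names; it is meant for looking, not for editing anything under `packages`.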

Let’s now consider the development scenario. Make a new environment and Pkg.develop the Example package.

$ mkdir /tmp/development
$ export JULIA_PROJECT=/tmp/development/
$ julia -e 'using Pkg; Pkg.develop("Example", shared=false)'
     Cloning git-repo `https://github.com/JuliaLang/Example.jl.git`
   Resolving package versions...
    Updating `/tmp/development/Project.toml`
  [7876af07] + Example v0.5.4 `dev/Example`
    Updating `/tmp/development/Manifest.toml`
  [7876af07] + Example v0.5.4 `dev/Example`

Here I used the shared=false option to Pkg.develop, causing it to be cloned within the environment.

$ tree /tmp/development/
/tmp/development/
├── dev
│   └── Example
│       ├── docs
│       │   ├── make.jl
│       │   ├── Project.toml
│       │   └── src
│       │       └── index.md
│       ├── LICENSE.md
│       ├── Project.toml
│       ├── README.md
│       ├── src
│       │   └── Example.jl
│       └── test
│           └── runtests.jl
├── Manifest.toml
└── Project.toml

Without shared=false it would have ended up in ~/.julia/dev. I could also have cloned it manually wherever I liked to have it and given that path as argument to Pkg.develop.

Now the Manifest looks a bit different.

$ cat /tmp/development/Manifest.toml 
# This file is machine-generated - editing it directly is not advised

julia_version = "1.7.0"
manifest_format = "2.0"

[[deps.Example]]
path = "dev/Example"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.4"

Instead of specifying a content hash it shows the path where the package is developed. And indeed, that is where it’s loaded from.

$ julia -e 'using Example; println(pathof(Example))'
/tmp/development/dev/Example/src/Example.jl

Feel free to modify the files in this path.
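
The difference between the two manifests above can also be checked programmatically. The following is a rough sketch using the TOML standard library (available from Julia 1.6); the function name `manifest_sources` and its labels are made up for this example, and the manifest texts are the ones printed earlier in this post.

```julia
using TOML

# Report, for each manifest entry, whether it is tracked by path (Pkg.develop)
# or pinned by content hash (Pkg.add).
function manifest_sources(manifest_text::AbstractString)
    manifest = TOML.parse(manifest_text)
    # Manifest format 2.0 nests packages under "deps"; older manifests list them at top level.
    deps = get(manifest, "deps", manifest)
    sources = Dict{String,String}()
    for (name, entries) in deps
        entries isa Vector || continue    # skip scalar keys such as julia_version
        for entry in entries
            if haskey(entry, "path")
                sources[name] = "developed at " * entry["path"]
            elseif haskey(entry, "git-tree-sha1")
                sources[name] = "registered, tree hash " * entry["git-tree-sha1"]
            else
                sources[name] = "no path or hash (typically a standard library)"
            end
        end
    end
    return sources
end

production_manifest = """
julia_version = "1.7.0"
manifest_format = "2.0"

[[deps.Example]]
git-tree-sha1 = "46e44e869b4d90b96bd8ed1fdcf32244fddfb6cc"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.3"
"""

development_manifest = """
julia_version = "1.7.0"
manifest_format = "2.0"

[[deps.Example]]
path = "dev/Example"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.4"
"""

println(manifest_sources(production_manifest))
println(manifest_sources(development_manifest))
```

To inspect a real environment, pass `read("/tmp/development/Manifest.toml", String)` instead of the inline strings.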


@oheil, my questions in those threads are related to vscode julia extension. It doesn’t have to do with developing packages in julia( unrelated to vscode).

Thanks for detailed responses @GunnarFarneback and @melonedo . Appreciate it.

In the example, can the dev version of the Example package continue to rely on its other dependencies from production? (The Example package can have many dependencies, either custom packages or general julia packages, which one may not want to modify/edit.)

Also, how do I now use this dev version of Example package in the julia main file (say test.jl file had using Example; using somePackage; where somePackage is production version, meaning unmodified, but Example is the modified version) along with other production versions of its dependencies? How will julia know to pick dev version of Example and continue using production version of somePackage? I am sure we need to modify some toml files, but which ones (even small developments need to be assigned a new version number and this version number must be propagated all over to correctly use this modified version?) .

It looks like julia automatically uses the dev version when I do ‘using Example’ after Pkg.develop("Example"). How can I make sure that it uses the production version or the development version based on my command (not automatically)?

EDIT: In addition to my questions above, how does one edit a custom registry? One can have julia general registry and custom registry for production packages. If one of the packages in the custom registry (or one could also dev a package from general registry) needs development, one can do pkg> dev somePackage and pkg> free somePackage to switch in/out of that dev package. However, this leaves out the custom (or general, if package from general is being developed) registry untouched so far. How does then one edit/modify the custom (or general) registry in the dev mode so that one can check custom (or general) registry is working fine in the dev mode before touching anything in the production mode?


Yes.

It’s all controlled by the Project.toml and Manifest.toml files in the environment you have activated. These are modified by using the package manager (i.e. the Pkg package).

By activating the appropriate environment.

I recommend that you use the LocalRegistry package.


Thanks for the pointers. I will checkout the LocalRegistry package.

“By activating the appropriate environment.”

I am not sure if I follow this. Say I have two packages: Example from the dev and somePackage from the production.

test.jl

using Example
using somePackage

Example package is in the dev dir with all toml files and stuff. somePackage is in the production dir with its own toml files.

How do I activate the environment so that it uses Example from the dev and somePackage from the production?

If I activate the environment using Pkg.activate("/path/to/toml/files/of/dev/example/package"), then this will activate the environment for the Example package only, missing somePackage.

Do I need to type in multiple activate commands?
julia> Pkg.activate("/path/to/toml/files/of/dev/example")
julia> Pkg.activate("/path/to/toml/files/of/production/some_package")

EDIT: I looked into the LocalRegistry package, which is quite handy. One thing I am still not clear about is how this package deals with a development version of the registry itself. I have a custom/private registry on GitHub, which is a production grade registry with production grade packages registered. Obviously I don’t want to modify this production registry. Just as Pkg.develop("somePackage") creates a clone of somePackage under ~/.julia/dev, is there a way to clone the custom registry from ~/.julia/registries to ~/.julia/dev/registries and then start editing the content of the custom registry under dev? I am not sure if we ever need to do this at all.

One reasonable source of confusion here is that environments, with Project.toml and Manifest.toml files, are used in two rather different roles.

The first is package environments. In this case Project.toml is the important file, which declares the dependencies of this specific package and preferably also compatibility requirements for the package. The corresponding Manifest.toml mostly matters for running package tests. You usually don’t activate a package environment unless you want to modify the package dependencies or run its tests, i.e. when you’re actively developing the package.

The second is application environments, which are not related to a specific package. This is what you want for defining your production and development environments. The environments I created in my first reply in this thread were of this kind. The general idea is that you create some script that launches your application and use Pkg to add the direct dependencies of this script; indirect dependencies are handled by the package manager. If it is a development environment you may want to develop rather than add certain dependencies, including indirect dependencies if you need to develop some of those. For application environments the Manifest.toml is very important since it’s the key to reproducibility.

One important point here is that when running your application you will activate an application environment, and the Manifest.toml in this environment decides which version of every package will be loaded. Your application environment will depend on a variety of packages, each of which has its own package environment, possibly including a Manifest.toml file. None of those Manifest.toml files matter when you have activated your application environment. In fact it is a common recommendation not to commit Manifest.toml for packages to version control.

To be clear on this you can’t activate multiple environments. You should only activate your application environment, which will contain pointers to all dependent packages.

No, there is no concept of development registries, neither in LocalRegistry, nor in Pkg itself. Potentially you could play around with git branches of your registry but you would need to handle all tooling around that yourself.

The basic way to deal with development versions is to just Pkg.develop them locally and register new versions after the development is finished. If you want to share your development with others who don’t modify the code themselves, or deploy a test version of your system, the normal approach is to Pkg.add a specific branch or a specific commit such as Example#my_cool_features (no, there is no such branch in the Example repository so it won’t work to run that). You could also just register development snapshots whenever you want to share your progress and possibly use some version number convention to indicate what is stable or not. An observation here is that your production environment will keep pointing to whatever version is in its Manifest.toml even if you add more versions in the registry, and with appropriate compat requirements in its Project.toml you can avoid mistakenly updating the environment to versions it shouldn’t use.

On a final note, when I talk about activating an environment this means using either of the following mechanisms:

  • Pkg.activate.
  • Starting julia with the --project flag.
  • Setting the environment variable JULIA_PROJECT before starting Julia.
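
A minimal sketch of the first mechanism, with the other two noted in comments. The temporary directory only stands in for a real project folder.

```julia
using Pkg

env_dir = mktempdir()              # throwaway directory standing in for a project folder

# In-session activation. Roughly the same end state as starting Julia with
#   julia --project=<env_dir>
# or with JULIA_PROJECT=<env_dir> set in the shell before launch.
Pkg.activate(env_dir)

# The Project.toml that Pkg operations and `using` now consult.
println(Base.active_project())
```

Only one environment is active at a time; a second `Pkg.activate` call replaces the first rather than adding to it.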

@GunnarFarneback Thanks a lot for this detailed answer!

@GunnarFarneback I am still confused about the production and development environments when I have to deal with actual scenario (which is a bit different from “Example” package above).

Real world scenario often encountered:

I have a production grade application (“App”) which relies on 100’s of custom packages (production), along with (many) custom and (one) general registries. I want to develop/augment new features, say for example, in only one of the files (some.jl) of one of the packages named “somePackage”, keeping the rest of the packages intact. This package has many dependencies (both custom and generic), which I may or may not want to develop (interested in both cases). Each project has its own toml files (so called environments) and the base environment for “App” is whatever is in the ~/.julia/environments/v<#> since this is the base environment that gets activated and production packages were added one-by-one after this environment was activated.

Obviously, I want a full and complete control over how I build my packages, and seamlessly switch between production and development modes. Following your suggestions above, I decided to proceed with the following steps:

While the production environment was still active (that is, ~/.julia/environments/v<#>), I ran in pkg mode:
pkg> dev somePackage

This command immediately modified the toml files (or at least did something there) so that the toml files in ~/.julia/environments/v<#> now would point to the development version of the “somePackage” and not the production version.

Before proceeding further, I found this a bit bothersome: it ended up modifying my production environment (~/.julia/environments/v<#>). I understand that I could run pkg> free somePackage to get back to the same environment as before, but the fact remains that my first dev step led to a modification of the production environment, which I never want to modify until I am done with my development.

Instead of doing it this way, I decided to run julia from the terminal with the JULIA_PROJECT variable set, which cloned the development version of the package into a dedicated dir ~/.julia_dev (or whatever I want). While I am adding a new feature in one particular file of this package, ultimately I will need to test this new feature in the broader context of the “App” I mentioned, which depends on many other custom and generic packages that I have not modified.

I am confused about setting the environment for the “App” during this development process (not the environment for the “somePackage” which is all set). Since “App” requires many other packages, this is what I did:

$ cd ~/

$ mkdir .julia_dev

$ cd .julia_dev

$ export JULIA_PROJECT=.

$ julia

julia> using Pkg; Pkg.develop("somePackage", shared=false)

This created an “environment” (so-called toml files) in ~/.julia_dev, which right now has only “somePackage” under the [deps] section of the toml files.

Then I activated this dev environment:

pkg> activate .

Then I added (pkg> add otherPackage) all the custom packages that are required for the “App” (which I am not planning to develop) one-by-one in this development environment (under ~/.julia_dev). For any package I intend to develop, I did Pkg.develop("somePackageToDev", shared=false), which was then rightly cloned into the ~/.julia_dev folder.

Is this really how we are supposed to deal with testing “App” when one of the packages that this “App” uses is being developed? Specifically, is it correct to do Pkg.add("somePackageNotDev") and Pkg.develop("somePackageToDev") when the goal is to test the final “App”, which relies on both dev and undev packages? Please note that I am not interested in simply activating the environment for the one package that I am developing, but want to test that package in the broader context of “App”, which depends on many custom and generic packages that I don’t plan on developing yet.

If there is a better way, please let me know. Thanks a lot for your time.

Another real world scenario I often encounter is when I am working on two contrasting projects (say project1 and project2, RocketScience vs InsectScience where there is absolutely no overlap between the projects). Both projects have their own custom registry and say use same julia version for simplicity. When I add these registries using :

pkg> registry add <>

They both get added into ~/.julia/registries, which I guess is fine.

Next, when I work on project1, I will begin adding packages for it and all of which gets dumped into ~/.julia/packages and compiled code under ~/.julia/compiled.

Upon switching to project2 and following the same steps as above, I will have custom packages that are relevant for project2 also dumped into ~/.julia/packages and ~/.julia/compiled, which I think is quite confusing.

A situation like this never occurs with Python. One can do ‘poetry install’ (of course I am using poetry, which is not a built-in python feature, but that is not the point) from the respective project/package folder with toml files to have all the dependencies (custom or generic/public) dumped into the .venv folder of the respective project/package by default. That is, by design, projects/packages are separated, which is not the case here with Julia, where by default everything gets dumped into one place. This creates confusion down the road if I want to get rid of some unused packages, custom or generic/public, of one of the projects that I no longer work on. In Python, I can simply remove the project/package folder, which clears the .venv folder for that project/package from PYTHONPATH, avoiding any confusion.

I understand that I can work on project1 by activating its environment and all and switch to project2 and activate its environment, but my point is for both projects, all the packages are being dumped into one location, namely, ~/.julia/packages. If I want to wipe clean project1 and its dependencies, will I have to do that manually, searching for those project1 specific packages under ~/.julia/packages and ~/.julia/compiled dirs?

I am sure it is possible to do the Python-like separate installation of packages and its dependencies in Julia, but I am not sure how. If anyone knows please share.

Thanks.

Since you are posting a lot of code, note that you can use some basic markdown syntax (namely, “`” to quote inline code and “```” to quote multiline code) to format your code and make it easier to read.

Just like in Python we dump requirements.txt/.venv in the project home, in Julia we store Project.toml and Manifest.toml in the project home. The base environment in ~/.julia/environments/v<#> is usually for quick experiments that do not deserve a new folder, or for utility packages like Pluto or PackageCompiler.

I am not sure about your confusion, do you find the process tedious? If so, you can just copy and paste Project.toml and Manifest.toml to copy a project and make your modification based on that.

Since ~/.julia/packages is just a cache of the remote server, and ~/.julia/compiled a cache of package precompilation, I do not see any problem in sharing them. As the word cache implies, you can delete these folders completely if you like (]gc will also do some clean-up), and the instantiate and precompile commands will recover both. You are not meant to modify them.

I admit that some packages do modify themselves in the build step, but since the build directory is not shared among different package versions, I believe package authors do not want the build step to do more than downloading and compiling some resource.

Maybe you are unclear about julia’s package management model. Python and C somehow did not take “local” packages seriously and install everything in the global environment, so the only approach to making isolated environments is to fake a global environment. In Julia, we take reproducible environments very seriously. As a result, an environment is modeled as a set of strictly versioned packages which can be found by using or import, with exceptions for mutable state and in-development code. With this model, there is no global environment, just a default local environment v#.# which is only for convenience.


Thanks for this information.

Just wanted to clarify that I do not want to constantly wipe clean the ~/.julia/packages and ~/.julia/compiled folders. I want to clean the packages and dependencies of, say, project1, which I am no longer working on, but keep everything intact for project2. How do I wipe clean project1, without touching project2, automatically? If I can’t cleanly remove project1 and its dependencies, then it leads to unnecessary bloating of the ~/.julia/packages folder (regardless of whether it is a cache or not).

Yes, I do find it a bit tedious to constantly add packages (that “App” requires) that I do not intend to develop and am developing only a couple packages that this “App” depends on.

Bloating is indeed a problem, I do not find good solutions other than buying a larger hard drive either :rofl:

A simple workflow is like this.
First, you make your “production” app its own environment, with all its dependencies in its Project.toml & Manifest.toml files. This is effectively the only reasonable way forward anyway.
So, you have your app with its “production” dependencies up and running, and it doesn’t depend on global ~/.julia/environments/v#.
When you need to develop some package PkgA.jl in the context of your app, copy the complete app directory with Project & Manifest files. Do ]dev --local PkgA with this new (copied) environment activated - this puts PkgA in ./dev/PkgA directory. Editing files in that dir only affects your app in the “development” environment, and not the original one.
This is reliable and easier than anything I tried in Python (several years ago).
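
The copy-then-dev workflow above could be scripted roughly as follows. This is a sketch under assumptions: the function names are made up, the paths and the package name are placeholders, and `Pkg.develop(pkg; shared=false)` is used as the API counterpart of `]dev --local`.

```julia
using Pkg

# Copy the whole application directory (Project.toml, Manifest.toml, scripts).
function copy_app_environment(app_dir::AbstractString, dev_dir::AbstractString)
    cp(app_dir, dev_dir)              # throws if dev_dir already exists, so nothing is overwritten
    return dev_dir
end

# Activate the copy and switch one dependency to a local checkout under dev_dir/dev/<pkg>.
function develop_in_copy(app_dir::AbstractString, dev_dir::AbstractString, pkg::AbstractString)
    copy_app_environment(app_dir, dev_dir)
    Pkg.activate(dev_dir)
    Pkg.develop(pkg; shared=false)    # needs registry/network access to clone the package
    return dev_dir
end

# Hypothetical usage:
# develop_in_copy("/path/to/app", "/path/to/app-dev", "PkgA")
```

The original directory is never activated by these functions, so its Project.toml and Manifest.toml stay untouched.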


The pkg> gc command cleans up any package versions (and artifacts) that are no longer accessible via any manifest file that you’ve used. If you want to remove all the versions for a project, you can delete its manifest and then garbage collect. This will, by default, still keep things around for a week, but you can force it to collect unreachable packages and artifacts immediately with pkg> gc --all. In current versions of Julia, the gc command is run automatically every now and then, so unless you’re actually running out of disk space, you don’t have to do anything—once something hasn’t been used for a while, it will get cleaned up automatically.
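
To see which manifests the garbage collector still considers in use, one can peek at the usage log in the depot. This is a rough sketch that assumes the log lives at `logs/manifest_usage.toml` inside the depot and is keyed by manifest path; that on-disk layout is an implementation detail and may differ between Julia versions. The function name is made up.

```julia
using TOML

# List the manifest files recorded in a depot's usage log.
function tracked_manifests(depot::AbstractString = first(DEPOT_PATH))
    usage_file = joinpath(depot, "logs", "manifest_usage.toml")
    isfile(usage_file) || return String[]     # nothing recorded in this depot
    return sort!(collect(keys(TOML.parsefile(usage_file))))
end

foreach(println, tracked_manifests())

# After deleting a project's Manifest.toml, the collection described above
# can be triggered from the API as well (not run here):
#   using Pkg
#   Pkg.gc()
```

Packages reachable from any manifest still listed here are kept, which is why removing project1's manifest before collecting leaves project2 intact.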

By only having one copy of each package version and artifact version (especially important when artifacts are large and the exact same version is used by many versions of a package or even multiple different packages), Pkg saves space since these are shared across all environments that use them. It seems like poetry must install multiple copies of each package version even if the same versions of packages are used by many environments. The ~/.julia directory does tend to become quite large, but that’s mostly because, unlike Python which tries to use whatever system libraries happen to be installed, we install isolated pre-built (immutable, content-addressed) copies of most binary dependencies. This fact is why Pkg is so reliable and reproducible. You can override this so that packages use system libraries instead, but it’s not recommended unless you really have to.
