How does one set up a centralized Julia installation?

I would like to start a thread about best practices for setting up a centralized Julia installation in a cluster environment or a local institution network. Hopefully, this will lead to a tutorial-like write-up in the end. I want to focus on Julia 1.0 and, thus, Pkg(3).

How does one set up a system-wide Julia 1.0 installation, that is, Julia itself as well as commonly used packages?

Requirements:

  • Users should still be able to install packages not present in the central directories.
  • As packages might be out of date, users should also be able to install newer versions of packages locally.

From my (very limited) point of view, these could be potential issues:

  • While the system admin updates and precompiles packages of the central installation, running user jobs might break because they try to access the old precompiled resources.

Let me note that this question came up before Julia 1.0 and Pkg3 and has been (sort of) postponed. I thought it might deserve a clean new thread.

I would also add these few constraints:

  1. how to configure per user where packages will be:
    1.a. downloaded
    1.b. built
    1.c. precompiled
  2. how to configure where custom Julia environments will be created:
    2.a. per project
    2.b. per user
  3. are those paths stacked on top of the system-wide ones, or can they optionally replace (shadow) each other?
  4. how to marry Julia with external conda environments (so that Julia does not download its own conda internally and the external conda from PATH is used instead)
  5. how to marry Julia environments with Jupyter notebooks or JupyterHub?

To be honest - it would be nice to have a tutorial covering these use cases.

Has there been any update on this? My IT department has very strict home directory quotas for students, which makes it very difficult to teach Julia in the classroom via virtual machines. I have tried to create a project on a shared drive, but that still seems to install packages to local directories.

I have also tried editing the JULIA_DEPOT_PATH to a directory writable by the instructor, but can’t seem to make that work either (referencing: Where or How to edit DEPOT_PATH)

I’d also like to see some helpful feedback here. Currently we also have just a group installation and every package is installed in the user’s directory, which – as drwx pointed out – is a bit annoying due to quota restrictions.

This also reminds me of a recent discussion about a CLI for installing packages (missing reference) here on discourse.

I’d like to wrap up a workflow in Python which works quite well for us:

  • one centralised Python installation which is occasionally updated (with an official announcement, so everyone is notified and can be prepared)
  • a set of commonly used packages is already installed (NumPy, SciPy and other stuff)
  • users can install packages on their own, using pip install --user PACKAGENAME
  • users can create their own isolated environments to play with or for reproducibility of scientific results etc.

This is more or less done by manipulating $PATH and $PYTHONPATH and at least in my experience it works quite nicely.

What I wish for Julia is indeed something which can be set up using command line tools, so that you can easily provide those setenv.sh scripts for group installations (we use them a lot), or even with the Modules system.
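
For what it's worth, here is a minimal sketch of what such a setenv.sh could look like for Julia, by analogy with the $PATH/$PYTHONPATH approach. The paths /opt/julia and /opt/julia/depot and the JULIA_ROOT, JULIA_SHARED_DEPOT and JULIA_USER_DEPOT variable names are my own assumptions, not anything Julia prescribes; only JULIA_DEPOT_PATH is an actual Julia environment variable.

```shell
#!/bin/sh
# setenv.sh (hypothetical): meant to be sourced, e.g. `. /opt/julia/setenv.sh`.
# All paths below are placeholder assumptions; adjust them to your site.

# Assumed location of the central Julia installation.
JULIA_ROOT="${JULIA_ROOT:-/opt/julia}"
# Assumed location of the shared, admin-maintained depot.
JULIA_SHARED_DEPOT="${JULIA_SHARED_DEPOT:-/opt/julia/depot}"
# Per-user depot; ~/.julia is Julia's default location.
JULIA_USER_DEPOT="${JULIA_USER_DEPOT:-$HOME/.julia}"

# Put the central julia binary on PATH, but only once.
case ":$PATH:" in
    *":$JULIA_ROOT/bin:"*) ;;
    *) PATH="$JULIA_ROOT/bin:$PATH" ;;
esac

# User depot first (writable), shared depot second (read-only for users).
JULIA_DEPOT_PATH="$JULIA_USER_DEPOT:$JULIA_SHARED_DEPOT"

export PATH JULIA_DEPOT_PATH
```

Whether Pkg then treats the second depot as truly read-only is a separate question.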

Something like juliapkg install ... to install a package, juliapkg install --user ... to install it in the user’s own JULIA_PKG_DIR or whatever (where it is then found when doing using ... in Julia code), and juliaenv create ... and juliaenv activate ... to create and activate an environment, which is then of course used when the user installs new packages via juliapkg install ... etc.

I think this would be nice :wink:

I need that as well… Hope someone comes up with a solution soon

I’ve been bashing my head off a wall with this for many hours now. My use case is trying to get JupyterHub working as a web interface for many users to access Julia. This requires installing the IJulia package in such a way that it is accessible to all users without them first having to SSH in to install it.

I used to be able to do this on v0.6-ish by using the JULIA_PKGDIR environment variable to point to a central directory before installing as root, then adding the same path to the LOAD_PATH julia variable via juliarc.jl – basically what is now referred to as “Old Post” here: server - Install just one package globally on Julia - Stack Overflow

But, this workflow is broken in v1.0 …

I’ve tried replacing JULIA_PKGDIR with JULIA_DEPOT_PATH as suggested in that updated Stack Overflow post, but this creates a lot more than just a package directory in that location (it seems to set up a whole project environment for the root user in the central location). LOAD_PATH then no longer seems to pick that up, so I have to add the directory to DEPOT_PATH via the startup.jl script and also chmod the directory structure created in that location to be readable by all. These are the steps:

sudo mkdir -p /opt/julia-packages/
sudo sh -c "echo 'push!(DEPOT_PATH, \"/opt/julia-packages/\")' >> /usr/etc/julia/startup.jl"
sudo JULIA_DEPOT_PATH=/opt/julia-packages/ julia -e 'using Pkg; Pkg.add("IJulia")'
sudo chmod -R uog+r /opt/julia-packages

This then works … sort of. It is basically enough to be able to load the Julia 1.0 kernel in Jupyterhub, but if you try to install a package as a user, then contrary to the documentation (which states that all depots except the first are read-only) it tries to install the package in both ~/.julia and in the central depot where IJulia was installed and obviously fails as non-root can’t write there.

Any insights greatly appreciated, because I’m on the point of giving up and removing this feature from my community project.

I tried to

ln -s /opt/julia/packages /home/user1/.julia/packages
ln -s /opt/julia/compiled /home/user1/.julia/compiled

but then I ran into a permission issue, even when using chmod a+rw -R /opt/julia.

The permission issue happens in this way:

  1. use root to add some packages
  2. use user1 to add the same packages and build → ideally Julia will find the compiled package so that no recompilation happens, but it raises a stat: permission denied (EACCES) error

I gave up and have not yet figured out how to solve this.

That’s likely because of two issues:

  1. Once root adds some packages, they’ll be owned by root. Use POSIX ACLs (with setfacl) to fix this.
  2. Precompile (.ji) files encode paths in them that are absolute. Each user will thus need their own directory for .ji files, as they can’t be shared.

Edit: If you want help with bullet 1, please send me a DM here. I’m pretty good with POSIX ACLs for simple tasks.
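
As a rough illustration of bullet 1 only, here is a sketch of granting a group read access with setfacl. The function name grant_depot_read and the example group julia-users are made up for illustration, and the chmod fallback is my own addition for filesystems without ACL support. It does nothing about bullet 2: users still need their own directory for .ji files.

```shell
#!/bin/sh
# Give a group read access to a shared depot, now and for files created later.
# Usage (hypothetical names): grant_depot_read /opt/julia/depot julia-users
# Prints "acl" or "chmod" depending on which method was applied.
grant_depot_read() {
    depot=$1
    group=$2
    if command -v setfacl >/dev/null 2>&1 &&
       setfacl -R -m "g:$group:rX" "$depot" 2>/dev/null &&
       find "$depot" -type d -exec setfacl -d -m "g:$group:rX" {} + 2>/dev/null
    then
        # The access ACL covers existing files; the default ACL set on each
        # directory is inherited by files that root creates there later.
        echo "acl"
    else
        # Fallback without ACL support: make the tree world-readable.
        # Capital X adds execute only to directories and to files that are
        # already executable for someone.
        chmod -R a+rX "$depot"
        echo "chmod"
    fi
}
```

Note that the fallback is coarser than the ACL variant: it opens the tree to everyone rather than to one group, and it has to be rerun after every admin update.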

I think I kind of succeeded in manually setting up a centralized Julia installation with some pre-installed packages. I don’t know if this is best practice (probably not), but it is working for now. In general, if one has the resources, it is probably worth looking into JuliaTeam. However, here is my setup in random order:

  • For each user, set the JULIA_DEPOT_PATH environment variable to a stack of paths where the first points to a user-writable directory and the second is read-only, e.g. export JULIA_DEPOT_PATH=/home/$USER/julia_depot:/global/depot/path
  • When working as admin to install and update packages, reverse the order of the paths, so that the central installation comes first.
  • Registries: delete the General registry from the global depot, e.g. delete the folder /global/depot/path/registries. Otherwise, whenever a user runs Pkg operations, the package manager will try to update all registries but has no write access to the global one; it will also keep asking which registry to install a package from. If you want to maintain an internal registry for internal packages, this is the place to put it.
  • Now, as admin, create a shared environment, e.g. Pkg.activate("globalenv",shared=true), and install into it all the packages you want to make available to users. The environment will be placed in /global/depot/path/environments/globalenv.
  • To make the global packages loadable by users, add this environment to each user’s environment stack. This is done by modifying LOAD_PATH: the user must either run export JULIA_LOAD_PATH="@:@v#.#:@stdlib:/global/depot/path/environments/globalenv" from their bashrc or push the environment path to LOAD_PATH from within Julia.
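
The two roles described in the list above could be captured in a small shell helper along these lines. This is only a sketch: the function name julia_role_env and the GLOBAL_DEPOT and USER_DEPOT override variables are invented for illustration, while the default paths and the globalenv name are the example values from the list.

```shell
#!/bin/sh
# Print the environment for a given role as `export` lines, so it can be
# reviewed first or applied with: eval "$(julia_role_env user)"
# Assumes the paths contain no spaces or double quotes.
julia_role_env() {
    role=$1
    global_depot="${GLOBAL_DEPOT:-/global/depot/path}"
    user_depot="${USER_DEPOT:-/home/$USER/julia_depot}"
    global_env="$global_depot/environments/globalenv"
    case $role in
        user)
            # Writable user depot first; shared depot second.
            echo "export JULIA_DEPOT_PATH=\"$user_depot:$global_depot\""
            # Append the shared environment to the load path from the list.
            echo "export JULIA_LOAD_PATH=\"@:@v#.#:@stdlib:$global_env\""
            ;;
        admin)
            # Reversed order, so that installs land in the shared depot.
            echo "export JULIA_DEPOT_PATH=\"$global_depot:$user_depot\""
            ;;
        *)
            echo "usage: julia_role_env user|admin" >&2
            return 1
            ;;
    esac
}
```

A user would put eval "$(julia_role_env user)" in their bashrc; the admin would run eval "$(julia_role_env admin)" in the shell used for package maintenance.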

Hope this helps and good luck setting this up.

And I forgot to mention: Reading this page: Code Loading · The Julia Language really helped a lot in understanding what is going on…

So I just had some time to play around with this. Thanks @fabiangans for your attempt! However, it’s not working properly (and also feels like fighting the package manager :D). I reproduced what you described and observed a couple of issues. Before I get to them, let me add that one should probably export JULIA_PROJECT=/global/depot/path/environments/globalenv as administrator as well, as otherwise a v1.0 environment will be created by default, which conflicts with same-named user environments.

Some of the issues that I observed with the above setup:

  • Every time the admin adds or updates packages, they have to remove the General registry folder again. (Maybe you didn’t see this because you said “reverse the order”, in which case the user’s registry will likely be used.)
  • Packages get shared but not their precompilation. The admin can add packages but mustn’t precompile them. Otherwise the user’s Julia will try to use the precompiled *.ji files, which fails with a permission error.

Even worse, there are inconveniences on the user side:

  • If a user runs gc and doesn’t have all the globally installed packages in their current environment (which is basically always the case), they will see warnings:
(v1.0) pkg> gc
    Active manifests at:
        `/home/carsten/.julia/environments/v1.0/Manifest.toml`
┌ Warning: Failed to delete /opt/julia/depot/packages/BenchmarkTools/dtwnm
└ @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/API.jl:370
   Deleted /opt/julia/depot/packages/BenchmarkTools/dtwnm: 143.768 KiB
┌ Warning: Failed to delete /opt/julia/depot/packages/JSON/Hs3Dj
└ @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/API.jl:370
   Deleted /opt/julia/depot/packages/JSON/Hs3Dj: 79.503 KiB
   Deleted 2 package installations : 223.271 KiB

And these are just the issues that appeared after 2 minutes of playing around. So this really isn’t a solid setup. Thank you anyway for sharing it with us! It seems to me that the “administrator depot” currently actually isn’t one but is much more “just another depot”.

It would be really useful if one of the Pkg developers could comment on either how to do it properly or what the future plan is. Let me (hopefully not rudely) ping @fredrikekre and @StefanKarpinski on this.

Is it/will it be possible to set up a proper administrator depot to provide packages (and maybe their precompiled .ji files) to local users?

(On a personal note, my local system admin is increasingly complaining about all the julia users who all have thousands of redundant files in their home directories.)

Seems like a lot of the issues stem from the fact that we modify all depots. Pkg should probably only modify what’s in DEPOT_PATH[1], and only update registries in first depot path by KristofferC · Pull Request #733 · JuliaLang/Pkg.jl · GitHub is one example of that. I guess we should not gc from other depots either, etc.

As far as I can tell, Pkg trying to update depots that it doesn’t have write access to seems like the only thing preventing this from working currently. The default DEPOT_PATH is something like this:

3-element Array{String,1}:
 "<home>/.julia"
 "<system>/julia/usr/local/share/julia"
 "<system>/julia/usr/share/julia"

The first is the user depot, the second is the shared arch-specific depot and the third is the shared arch-independent depot. There’s not much tooling to help you set this up but it shouldn’t be too hard to do. If you’ve been trying to do this and encountering issues with the basic arrangement then honestly, you’re probably in the best position to help make it work. There are way too many different ways to use the package manager to have actually tried them all out and worked out the rough edges and tooling at this point. We’ve still got our hands full with switching the pkg dev tooling over from METADATA to new registries.
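
To see at a glance which depots in a given stack the current user could actually write to (and which Pkg might therefore trip over), something like the following sketch may help. The function name depot_report is made up; it only inspects the colon-separated list it is given and does not ask Julia what DEPOT_PATH really is.

```shell
#!/bin/sh
# List each entry of a colon-separated depot path and whether the current
# user can write to it. Defaults to $JULIA_DEPOT_PATH when no argument is given.
# The body runs in a subshell, so the IFS and globbing changes stay local.
depot_report() (
    list=${1:-$JULIA_DEPOT_PATH}
    IFS=:
    set -f    # no filename globbing while splitting on ':'
    for depot in $list; do
        if [ -z "$depot" ]; then
            continue    # skip empty entries
        elif [ ! -d "$depot" ]; then
            echo "missing    $depot"
        elif [ -w "$depot" ]; then
            echo "writable   $depot"
        else
            echo "read-only  $depot"
        fi
    done
)
```

With the arrangement discussed here, one would hope to see exactly one writable entry, the first one, for ordinary users. Keep in mind that root sees everything as writable.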

Now that Julia 1.0 has been released, I started exploring Julia today to see whether it is a programming language that can replace Python. Like many others, my first choice is to set it up in JupyterHub so that I can learn more efficiently with notebooks. This issue is really blocking me right now. I hope it can be fixed soon.

The way I currently do things is that I put Julia in /opt/julia and each user sets JULIA_BINDIR appropriately. To be honest, the reason this has always been sufficient is Docker: the majority of work gets done within user directories, and anything else gets put in a Docker image anyway.

I suppose the way you set up a central installation depends on what you are trying to achieve. A quick and dirty solution might be adding packages that you want in a centralized location using LOAD_PATH (or simply doing Pkg.dev) though this would leave your version control to git itself, so it’s certainly not ideal.

I ran a setup similar to the one mentioned by @fabiangans, but below is what I did, in hopefully clearer terms for another user. I still agree with @carstenbauer’s points, but this will work for my use case, for now (until it doesn’t).

It feels to me like PackageSpec should be extendable for an administrator or shared library case, maybe that’s the way to move forward (warning, Julia noob here).


As administrator
Spin up a Julia session and run (this can go into an admin’s startup.jl):

empty!(DEPOT_PATH)
push!(DEPOT_PATH,"/path/to/shared/packages/") 

Now install a package via the package manager (v1+) then exit, and initialize the package:

(v1.0) pkg> add Gadfly
julia> using Gadfly
julia> exit()

For the User
I have created a startup.jl file that I store for each user in ~/.julia/config/startup.jl. This file looks like:

empty!(DEPOT_PATH)
push!(DEPOT_PATH,string(ENV["HOME"],"/",".julia"),"/path/to/shared/packages/")
push!(LOAD_PATH,"/path/to/shared/packages/")

Then, I just made sure all of the files were rx by all users on the system (not ideal).
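
If there are many users, the per-user file could be generated by a small script instead of being copied by hand. A sketch, with the caveat that the function name write_user_startup is invented here and that it overwrites any existing startup.jl; the three Julia lines it writes are the ones shown above, with the shared path substituted.

```shell
#!/bin/sh
# Write a per-user startup.jl that puts the user depot first and the shared
# depot second. Overwrites an existing startup.jl in that location.
# Usage: write_user_startup [home_dir] [shared_depot]
write_user_startup() {
    home_dir=${1:-$HOME}
    shared_depot=${2:-/path/to/shared/packages/}
    config_dir="$home_dir/.julia/config"
    mkdir -p "$config_dir"
    # Unquoted heredoc delimiter: $shared_depot is expanded by the shell,
    # everything else is written to the file literally.
    cat > "$config_dir/startup.jl" <<EOF
empty!(DEPOT_PATH)
push!(DEPOT_PATH,string(ENV["HOME"],"/",".julia"),"$shared_depot")
push!(LOAD_PATH,"$shared_depot")
EOF
}
```

An admin could then loop over home directories and call write_user_startup for each one, fixing ownership of the created files afterwards.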

Trying out what @drwx suggested.

Minor trouble with the story for developing local packages. The special circumstance here is that users will have restricted internet access: no access to GitHub or curl (or any command-line tools that use the internet).

TL;DR how can I prevent adding a shared package to a user developed package from going to github for the registry?

Full story:

Also added a <dev_path> to end of LOAD_PATH for user

I added StaticArrays as administrator. Which is good. Users can “using StaticArrays” just fine.

As user I generated two packages in <dev_path> Foo and Bar.
Foo uses StaticArrays and Bar

from <dev_path> I load julia:

shell> cd Foo
<dev_path>/Foo

(v1.0) pkg> activate .

I added Bar to Foo like so

(Foo) pkg> dev ../Bar
 Resolving package versions...
  Updating `Project.toml`     
  [61032482] + Bar v0.1.0 [`../Bar`]
  Updating `Manifest.toml`          
  [61032482] + Bar v0.1.0 [`../Bar`]

Now I want to add StaticArrays to Foo.

(Foo) pkg> add StaticArrays
  Updating registry at `/path/to/shared/packages//registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
┌ Warning: Some registries failed to update:
│     — /path/to/shared/packages/registries/General — failed to fetch from repo
└ @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/API.jl:144
 Resolving package versions...
  Updating `Project.toml`
  [90137ffa] + StaticArrays v0.9.2
  Updating `Manifest.toml`
  [90137ffa] + StaticArrays v0.9.2
  [2a0f44e3] + Base64
  ...

(Foo) pkg>

Not a disaster because the Manifest.toml and Project.toml for Foo are good. I can edit Bar and Foo will recompile as a result. So dependencies are working. I can use both Foo and Bar. Yay.

Only real problem here is that the user has to wait ages until the internet check fails when it goes to github or the General Registry. Anything I can do to avoid going to github?

Rereading the above posts and looking at the merged PR, it seems this will be resolved in the next 1.0.x release if I keep my user DEPOT first in that list?

@Orbots I had the same problem.
I have a cluster system that should not connect to the Internet. So first I download the packages I need and paste them into the .julia/packages/ path. Then, to prevent Julia from connecting to the Internet, I simply change the value of repo in the Registry.toml file from https://github.com/JuliaRegistries/General.git to /home/username/.julia/registries/General. It works for me, but how? I don’t know.
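
For anyone who wants to script that Registry.toml change, here is a sketch. The function name localize_registry is made up, and it assumes the file has a single top-level line starting with repo = and that the registry path contains no | or & characters. It keeps a copy of the original file as Registry.toml.orig so the change can be undone.

```shell
#!/bin/sh
# Point a registry's `repo` entry at the registry's own local path so that
# registry updates do not try to reach GitHub.
# Usage: localize_registry /home/username/.julia/registries/General
localize_registry() {
    registry_dir=$1
    toml="$registry_dir/Registry.toml"
    [ -f "$toml" ] || { echo "no Registry.toml in $registry_dir" >&2; return 1; }
    # Keep the original so the change can be reverted.
    cp "$toml" "$toml.orig"
    # Replace the whole `repo = ...` line; `|` is the sed delimiter because
    # the replacement path contains slashes.
    sed "s|^repo *=.*|repo = \"$registry_dir\"|" "$toml.orig" > "$toml"
}
```

To switch back, copy Registry.toml.orig over Registry.toml again.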

@alirezamecheng Nice. That’s a simple solution. Much easier than creating a registry from scratch.
You need to switch repo back if you want to install packages though, eh? You can probably have two Registry.toml files, one for “user” and one for “admin”. But then you need to sync them. Still not ideal, but it would get the job done.

I have a ticket in the Pkg.jl project for this, with a potential fix (I haven’t tested it). So hopefully this will “just work” soon enough by setting the JULIA_DEPOT_PATH and JULIA_LOAD_PATH environment variables appropriately for the two roles.

https://github.com/JuliaLang/Pkg.jl/issues/892
https://github.com/JuliaLang/Pkg.jl/pull/906