Best practice for packages on shared drives

Hi

We have julia running on both linux and mac, and on intel and M1, and the julia versions might not be the same. We use synced folders which contain all the code for our in-house packages, the applications that use the packages, and data and output. The .julia folder is specific to the system. This code runs unattended, and we might not log onto some computers for months, and hopefully eventually years.

And it all works so long as we run a package on one system only. But not if we run it on many: apparently our in-house package dependencies all get messed up. So we end up having to install the julia packages again and resolve, which somehow ends up working after some time, but then it messes up the other setup that used to run fine.

My team members tell me that it's the fault of the julia package system and all would be good if we stopped using packages, adding "if only julia’s package system for in-house packages was like that in R". I can see that would work, but to stop using in-house packages does not seem like best practice.

So I come to you to ask what is the best practice to allow our in-house packages to run on many systems on shared drives.

I am sure there is information somewhere in the package documentation on this, but I haven't been able to either find it or understand it.

all the best, Jack

I would use a different JULIA_DEPOT_PATH for each platform:
https://docs.julialang.org/en/v1/manual/environment-variables/#JULIA_DEPOT_PATH
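
A minimal sketch of how a per-platform depot name could be derived, in case it helps. The helper name `platform_depot` and the base directory `~/.julia-depots` are assumptions for illustration, not anything Julia defines. The variable has to be set before Julia starts, so the snippet only prints a line that a shell profile or cron wrapper could use.

```julia
# Hypothetical helper: name a depot directory after the OS, the CPU
# architecture, and the Julia minor version, so that no two platforms
# share the same depot.
function platform_depot(base::AbstractString)
    tag = string(lowercase(string(Sys.KERNEL)), "-", Sys.ARCH,
                 "-v", VERSION.major, ".", VERSION.minor)
    return joinpath(base, tag)
end

# Assumed base directory; ideally local to each machine rather than synced.
depot = platform_depot(joinpath(homedir(), ".julia-depots"))

# JULIA_DEPOT_PATH is read when Julia starts, so emit a line for the
# launching shell instead of setting it from inside this session.
println("export JULIA_DEPOT_PATH=", depot)
```

Running this once on each machine gives a distinct directory per OS, architecture, and Julia minor version.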


Thanks @mkitti

We will look at JULIA_DEPOT_PATH

But I am coming to the conclusion that packages are meant to solve something more complicated than a small team of users who are scientists, not professional programmers, managing code on heterogeneous platforms. So if JULIA_DEPOT_PATH does not help, we will move off packages.

best, jack

If your shared folders contain Manifest.toml files, they are fundamentally incompatible between julia versions. They might sometimes work, but officially Manifests are julia version-specific.
This cannot really be any different in general. Suppose that there is dependency PkgA with PkgA@0.1 only compatible with Julia 1.6, and PkgA@0.2 only compatible with Julia 1.7. Then there would be no possible configuration that works with both Julia 1.6 and 1.7.
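
To see whether a given Manifest.toml was written by a different Julia version, something like the following sketch may help. It assumes the newer manifest format (written by Julia 1.7 and later), which records a top-level `julia_version` entry; the helper name `manifest_matches_julia` and the demo file are made up for illustration.

```julia
using TOML

# Hypothetical helper: compare the Julia version recorded in a manifest
# with the running one (major.minor only). Returns `nothing` when the
# manifest records no version, as in the older manifest format.
function manifest_matches_julia(path::AbstractString)
    manifest = TOML.parsefile(path)
    recorded = get(manifest, "julia_version", nothing)
    recorded === nothing && return nothing
    v = VersionNumber(recorded)
    return (v.major, v.minor) == (VERSION.major, VERSION.minor)
end

# Demo on a throwaway manifest that claims to come from Julia 1.7.3.
demo = joinpath(mktempdir(), "Manifest.toml")
write(demo, """
julia_version = "1.7.3"
manifest_format = "2.0"
""")
println(manifest_matches_julia(demo))
```

A cron job could run such a check before loading code and log a warning when the versions differ.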

Why do you need to use a shared folder at all for in-house packages? It makes sense to treat them similar to other packages and just specify required versions in your project/application dependencies.


Thanks @aplavin

That is indeed what we have come to understand. And it seems very sensible indeed.

This came about because we did not understand julia packages. It's very simple if on the same system, and we didn’t think it through. I am now looking at custom registries, but that seems yet another layer of complexity.

If Julia offered the same as R, it would be fantastic: keep package source on a shared drive, then inside it run library(devtools); document(); install(); every time code changes, and then library(mylibrary) where the code is used. Maybe it's very simple indeed to achieve that, but I haven’t discovered how.

best, jack

I’m not familiar with R's package story, but from this description it seems like packages are installed globally, for the whole R installation on a computer. Julia Pkg allows independent environments, with completely independent package sets and versions. This is crucial for reproducibility, e.g. when after a year you want to rerun some analysis and get the exact same results.

A custom registry (using LocalRegistry.jl) is easy to set up and, most importantly, to use: just using LocalRegistry; register() to register a new package version. Unlike the General registry, it’s instant: the new version is immediately available to all users of the registry.


Thanks @aplavin

we will try LocalRegistry.jl, thanks for the tip

100% agree on reproducibility. We use docker images for the milestone analysis for that reason. I can see why this is an even better way.

best, jack

I think you should strongly consider using packages. The reason is that a significant part of the Julia precompilation cache is dependent on packages and their UUIDs. Effectively the package is the unit of precompilation for Julia. This compilation cache lives within .julia/compiled. Other configuration items live within .julia as well. Some platform specific code lives within .julia/artifacts.

Package source does not need to live within the Julia depot or .julia/dev specifically. A proper Project.toml and Manifest.toml can refer to packages anywhere on the system by using Pkg.add. Given a Project.toml and Manifest.toml you should be able to reproduce a Julia environment. A local registry is also a good idea, but I think it would be mainly a convenience.

The general documentation on the Julia package manager can be found in the Pkg.jl manual.

https://pkgdocs.julialang.org/v1/

A description of the contents of the Julia depot can be found with the documentation for the Julia variable DEPOT_PATH.

https://docs.julialang.org/en/v1/base/constants/#Base.DEPOT_PATH

The relationship between code loading, packages, and environments is described below.

https://docs.julialang.org/en/v1/manual/code-loading/#code-loading
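
When the same code behaves differently on two machines, it can help to print what each session actually sees. A small diagnostic sketch; the function name `describe_session` is just an illustrative choice, and everything it prints comes from standard Julia variables and functions.

```julia
# Print the facts that decide which package code and which caches a
# Julia session will use.
function describe_session(io::IO = stdout)
    println(io, "Julia version : ", VERSION)
    println(io, "OS / arch     : ", Sys.KERNEL, " / ", Sys.ARCH)
    println(io, "Active project: ", Base.active_project())
    println(io, "Depots        : ", join(DEPOT_PATH, ", "))
    println(io, "Load path     : ", join(Base.load_path(), ", "))
end

describe_session()
```

Comparing this output between a working and a failing machine usually shows quickly whether the project, the depot, or the Julia version differs.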


Thanks @mkitti

I fully agree with all you said, but lost the battle with my colleagues on this yesterday. I hope it's OK if I tell you why.

It has to do with our lack of understanding of Julia packages, but also with how we do versions and backup of code. All code (and data and analysis) lives on several ZFS with frequent snapshots to our remote disk and rsync.net. We have one git repo for each part of our code. No need to use GitHub or GitLab or some git server, all is local. This also means we never push/pull. All we do is commit. We are fully aware of the pros and cons of this setup but it works for us. And a big advantage is simplicity.

Also, we could not care less about precompilation speed. Our code runs from crontab and can take hours or days. Startup speed is irrelevant.

The relevant page of the Pkg.jl manual is "5. Creating Packages". We had made packages using what it calls "the minimal pkg> generate functionality", which is simple and works well. But its recommendation, “We recommend that you use PkgTemplates”, seems to require GitHub/GitLab or some repository with push/pull; it does not work with a local git repository that does not need push/pull.

LocalRegistry.jl is not compatible with our setup either, since it needs git push/pull.

So, the reason my team rejected using packages is because of how they depend on git being used in a way that is both complicated and incompatible with our very simple setup.

It is quite possible we are wrong, but I was told in no uncertain terms by my colleagues yesterday that they had no interest in spending hours understanding the package system to solve our problem when using only modules works.

best, Jack


I’m not sure what else to say other than good luck. My concern is not about your colleagues' understanding of proper code organization. My concern is that there may be a fundamental misunderstanding of how Julia works.

How are you loading modules? Are you invoking using SomePackage or import AnotherPackage? or… are you just doing include("somepackage.jl")?

OK, so since you used pkg> generate, you are actually using packages. That at least adds uuid and version fields to your Project.toml for each package.

PkgTemplates

The claim that PkgTemplates requires GitHub, GitLab, or a repository with push/pull is false. The configuration is entirely customizable. You can create a template exactly once and reuse it.

julia> using PkgTemplates

julia> template = Template(interactive=true)
Template keywords to customize:
[press: d=done, a=all, n=none]
   [ ] user
   [ ] authors
   [ ] dir
   [ ] host
   [ ] julia
 > [X] plugins

You can activate or deactivate git, GitHub, or GitLab.

Select plugins:
[press: d=done, a=all, n=none]
   [X] CompatHelper
   [X] ProjectFile
   [X] SrcDir
   [ ] Git
   [X] License
   [X] Readme
   [X] Tests
 > [X] TagBot
   [ ] AppVeyor
   [ ] BlueStyleBadge
   [ ] CirrusCI
   [ ] Citation
   [ ] Codecov
   [ ] ColPracBadge
   [ ] Coveralls
   [ ] Develop
   [ ] Documenter
   [ ] DroneCI
   [ ] GitHubActions
   [ ] GitLabCI
   [ ] PkgEvalBadge
   [ ] RegisterAction
   [ ] TravisCI
CompatHelper keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] cron
   [ ] destination
   [ ] file
ProjectFile keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] version
   [ ] None
SrcDir keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] destination
   [ ] file
License keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] destination
   [ ] name
   [ ] path
Readme keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] badge_off
   [ ] badge_order
   [ ] destination
   [ ] file
   [ ] inline_badges
Tests keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] file
   [ ] project
TagBot keywords to customize:
[press: d=done, a=all, n=none]
 > [ ] branches
   [ ] changelog
   [ ] changelog_ignore
   [ ] destination
   [ ] dispatch
   [ ] dispatch_delay
   [ ] file
   [ ] gpg
   [ ] gpg_password
   [ ] registry
   [ ] ssh
   [ ] ssh_password
   [ ] token
   [ ] trigger

Pkg.develop

It seems that you are using git at some point, so that should make things easier. However, you can use packages without using git at all by using Pkg.develop or ]dev with a local path.

(SomeJuliaPkg) pkg> generate YetAnotherJuliaPackage
  Generating  project YetAnotherJuliaPackage:
    YetAnotherJuliaPackage/Project.toml
    YetAnotherJuliaPackage/src/YetAnotherJuliaPackage.jl

(SomeJuliaPkg) pkg> dev ./YetAnotherJuliaPackage
   Resolving package versions...
    Updating `~/SomeJuliaPkg/Project.toml`
  [ff33dcb1] + YetAnotherJuliaPackage v0.1.0 `YetAnotherJuliaPackage`
    Updating `~/SomeJuliaPkg/Manifest.toml`
  [ff33dcb1] + YetAnotherJuliaPackage v0.1.0 `YetAnotherJuliaPackage`

julia> using YetAnotherJuliaPackage
[ Info: Precompiling YetAnotherJuliaPackage [ff33dcb1-3271-4990-a66c-63e68f98716f]

In fact, I recommend using develop mode rather than add mode in your case, so that the state of the code on disk is what will be loaded when invoking using or import.

In a subsequent julia session, notice that no precompilation occurs:

julia> using YetAnotherJuliaPackage

If a source file has changed before the package is loaded, precompilation will occur again, provided you are using Pkg.develop or ]dev:

julia> Base.Filesystem.touch("YetAnotherJuliaPackage/src/YetAnotherJuliaPackage.jl")
"YetAnotherJuliaPackage/src/YetAnotherJuliaPackage.jl"

julia> using YetAnotherJuliaPackage
[ Info: Precompiling YetAnotherJuliaPackage [ff33dcb1-3271-4990-a66c-63e68f98716f]

After this precompilation has been done, there will be a .ji file in .julia/compiled/v1.7/YetAnotherJuliaPackage containing the type-inferred code cached during the precompilation step. I believe the Julia devs were smart enough to make the .ji files platform independent. Still, there are enough configuration differences between platforms that I would recommend using a distinct JULIA_DEPOT_PATH environment variable for each platform.
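
The "v1.7" part of that path follows the running Julia version, so on a machine with a different version the cache sits in a different subdirectory. A minimal sketch for locating it without hard-coding the version; the helper names `compiled_cache_dir` and `cache_files` are made up for illustration. As far as I know, from Julia 1.9 onward native-code cache files are stored next to the .ji files, and those are platform specific, which is one more reason to keep depots separate.

```julia
# Directory that holds precompile caches for the running Julia minor
# version inside a depot (the first entry of DEPOT_PATH by default).
function compiled_cache_dir(depot::AbstractString = first(DEPOT_PATH))
    return joinpath(depot, "compiled", "v$(VERSION.major).$(VERSION.minor)")
end

# Cache files for one package; an empty list when nothing is cached yet.
function cache_files(pkg::AbstractString; depot::AbstractString = first(DEPOT_PATH))
    dir = joinpath(compiled_cache_dir(depot), pkg)
    return isdir(dir) ? readdir(dir) : String[]
end

println(compiled_cache_dir())
println(cache_files("YetAnotherJuliaPackage"))
```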

julia> readdir(joinpath(DEPOT_PATH[1], "compiled", "v1.7", "YetAnotherJuliaPackage"))
1-element Vector{String}:
 "92CAK_9xKaS.ji"

Thanks @mkitti for the patience, much appreciated.


You are spot on correct here, we are all scientists not programmers and I would not be surprised if we had a fundamental misunderstanding of how Julia works. While Julia is a much better language than what we used before, we found the package documentation hard to navigate, or at least hard to see the benefit.

For packages we used (with Pkg.develop):
Pkg.activate("code/package/")
using package

This works well on a single machine. The problem we had here was how to deal with packages and code being on a shared folder across heterogeneous systems. My question at the start of this discussion was how to solve that problem.

And with modules:
push!(LOAD_PATH, "code/package/")
using package

And the last just works. Zero issues, dead simple.

best, jack


Try ]dev code/package. This will persist the path in the Manifest.toml so you can skip the LOAD_PATH manipulation every time.
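
Since the code runs from crontab, the same step can also be done non-interactively through the Pkg API. A minimal sketch, assuming a hypothetical code/app directory that holds the application's environment next to code/package; the function name `link_inhouse_package` is made up for illustration.

```julia
using Pkg

# Hypothetical helper: activate the application's environment and record
# the in-house package by its path, which is what `]dev <path>` does.
function link_inhouse_package(app_dir::AbstractString, pkg_dir::AbstractString)
    Pkg.activate(app_dir)          # select the application's environment
    Pkg.develop(path = pkg_dir)    # track the package source on disk by path
    return Base.active_project()   # path of the Project.toml now in use
end

# Only run when the assumed directory layout actually exists.
if isdir("code/app") && isdir("code/package")
    println(link_inhouse_package("code/app", "code/package"))
end
```

After this has run once, a script started with julia --project=code/app should be able to call using package directly, without touching LOAD_PATH.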


Thanks @mkitti

Will do.

Thanks for the patience. Having read through all the fantastic comments above, I’ve learned a lot, and will make a determined effort to understand the details of packages and hopefully put them to use in our project.

best, jack