Hi. I’m learning Julia for the first time. I’m looking for a way to say in my program “If you don’t already have this package, then get it, but if you do, then just use it”. The issue is that if I just put Pkg.add("DataStructures"), then it seems to re-load that package from the git repo on every execution (which is time-consuming). I just want it to load the package from cache or something, but if it isn’t there, then I want it to load it from the git repo. I assume this is a common issue. How do people get the “Only add if we don’t already have the package” functionality in an easy way?
If this is for your script and you just want to see whether a package exists or not, see: How to test if package is installed - #2 by fengyang.wang
If you’re making a module with an optional dependency, maybe you can try Requires.jl
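For the script case, one possible check is sketched below. It relies on Base.find_package, an unexported function in Base that returns the path of a package loadable from the current load path, or nothing. The helper name is_installed is made up for this example, and the sketch only sees packages reachable from the active environment, not everything stored in the depot.

```julia
# Hypothetical helper: true if `using name` could find the package from the
# current load path (active project, default environment, standard library).
# Base.find_package is unexported, so treat it as an implementation detail.
is_installed(name::AbstractString) = Base.find_package(name) !== nothing

is_installed("LinearAlgebra")   # a standard library, normally found
```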
Define a project and let the resolver deal with it. See
https://julialang.github.io/Pkg.jl/dev/creating-packages/
but the whole manual is worth reading.
Even with Projects, for distributing programs it would be nice if there was a way of checking whether the project has been instantiated yet (I’m actually thinking about the eglot-julia package I’ve been semi-working on). Do you know of a good way of doing this?
Just running Pkg.instantiate() every time isn’t a good solution since it makes network calls to update the registry.
I am not aware of a recommended way (a lot of nuts and bolts for distributing Julia programs are WIP), but I would either
- just make it part of an install script,
- check for the depot being populated (since I would presumably have the program in its own depot), or
- check for the existence of a marker file that a simple touch creates when instantiating.
All of these are hacks though.
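The marker-file idea could look roughly like this. It is only a sketch: the helper name run_once and the marker file name .instantiated are invented for illustration, and the setup step is passed in as a function so the helper itself never touches the network.

```julia
# Hypothetical helper: run `setup` only if the marker file is not yet in `dir`.
function run_once(setup, dir::AbstractString; marker::AbstractString = ".instantiated")
    marker_path = joinpath(dir, marker)
    isfile(marker_path) && return false   # already set up, nothing to do
    setup()                               # e.g. Pkg.activate(dir); Pkg.instantiate()
    touch(marker_path)                    # only reached if setup() did not throw
    return true
end
```

A script would call it as run_once(() -> (Pkg.activate(dir); Pkg.instantiate()), dir). The marker says nothing about whether the Manifest changed afterwards, so it remains as much of a hack as described above.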
Could you elaborate on how and why you use depots other than the default? The only reason I can come up with is as a hack to avoid precompiling when switching environments. I haven’t used anything but the default depots yet and hadn’t planned on doing so for the eglot-julia integration package.
How:
Setting the JULIA_DEPOT_PATH environment variable is a convenient way for me (Linux). There might be others.
Why:
When I do it, it’s usually to test whether something (a global thing, like a workflow; not just a given project) works in a pristine environment.
I once tried to maintain a specific depot, dedicated to providing the very specific versions of packages needed for LSP-related stuff to work. That experiment ended relatively quickly (mostly because of LSP stuff not being stable enough for my taste; not because it was a bad idea to have a depot for this).
There might very well be other good reasons to have specific depots.
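As an illustration of the pristine-environment use case, the environment variable can also be set from within Julia for a child process only. This is a sketch; the variable name pristine_depot is arbitrary, and Base.julia_cmd() is used to start the same Julia binary as the current session.

```julia
# Start a child Julia process that uses an empty, throwaway depot.
pristine_depot = mktempdir()

reported = withenv("JULIA_DEPOT_PATH" => pristine_depot) do
    # The child inherits the modified environment; the variable is restored
    # for the parent session when the block ends.
    read(`$(Base.julia_cmd()) --startup-file=no -e 'print(first(DEPOT_PATH))'`, String)
end

println("child process used depot: ", reported)
```

Anything the child installs or precompiles ends up in the temporary directory, so the default ~/.julia depot stays untouched.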
Not sure if it’s useful / clean, but I have this function in some scripts that I use to setup new environments:
tryusing(pkgsym) = try
    @eval using $pkgsym
    return true
catch e
    return e
end

julia> tryusing(:LinearAlgebra)
true

julia> if tryusing(:Foo) !== true
           println("Foo is not installed")
       end
Foo is not installed
This is more or less what I ended up doing.
Yes, I just saw that. Sorry for the noise
If I understand your question correctly, Pkg.add("DataStructures") already does what you request automatically.
The packages used by any of your projects are shared in the depots (e.g. the “~/.julia” folder), so no unnecessary duplication occurs. If someone runs Pkg.add("DataStructures") and the package version already exists in your filesystem, Pkg will just use the existing package.
All that is true, but this primarily aims at avoiding unnecessary use of filesystem space, and I don’t think that is the main concern here: when Pkg.add is issued, there is a lot more going on:
- the registry is updated
- if a new version of the package is available, this new version is downloaded and installed [EDIT: this is not true; see below]
- manifest / project files are updated [EDIT: if needed]
All this takes time, especially when network access is needed. A more minimalistic way of doing things would be:
- if the package is already listed in Project.toml → do nothing
- otherwise, if the package is already installed somewhere in the system → just update Project.toml + Manifest.toml
- otherwise, update the registry, find the newest version available and install it
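The first and last steps of this list can be sketched with public Pkg functions, assuming a Julia version that provides Pkg.project() (1.4 or later, as far as I know). The helper name add_unless_listed and its add keyword are made up for this example; the middle step is not covered.

```julia
using Pkg

# Hypothetical helper. `add` is a keyword only so the logic can be tried out
# without touching the network; by default it is the real Pkg.add.
function add_unless_listed(name::AbstractString; add = Pkg.add)
    # Direct dependencies of the active project, as recorded in Project.toml.
    deps = Pkg.project().dependencies
    haskey(deps, name) && return false   # already listed: do nothing
    add(name)                            # not listed: regular (slow) path
    return true
end
```

Being listed in Project.toml does not guarantee that the package is actually downloaded, so this check is only as reliable as the environment it runs in.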
I might read it wrong, but I guess that is more the spirit of this thread. Doing things this way is probably not advisable in general, especially since you may end up with a very old version of your dependency. But there are cases where this would be useful. The situation described by non-jedi is a good example of that IMO. And the solution used in eglot in the post linked above implements point 1 in my list.
Since the second bullet is wrong, the last bullet is also wrong. What we could change is perhaps to move the updating of registries later.
See also Offline mode by fredrikekre · Pull Request #1265 · JuliaLang/Pkg.jl · GitHub which basically implements your suggestion.
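On Julia versions that ship this offline mode (exposed as Pkg.offline, to my knowledge from Julia 1.5 on), an "offline first" add could be sketched as follows. The helper name add_offline_first and the add keyword are invented for illustration, and retrying online after any error is a deliberately crude assumption.

```julia
using Pkg

# Hypothetical helper: try to add using only what is already on disk,
# and fall back to a normal, network-enabled add if that fails.
function add_offline_first(name::AbstractString; add = Pkg.add)
    Pkg.offline(true)            # resolve against already downloaded versions
    try
        add(name)
        return :offline
    catch
        Pkg.offline(false)       # offline attempt failed: allow network access
        add(name)
        return :online
    finally
        Pkg.offline(false)       # always leave the session in online mode
    end
end
```

The return value only reports which path succeeded; a real script would probably want to inspect the caught error instead of retrying unconditionally.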
Ah, sorry, I didn’t know. Why is it needed for Pkg to update the registries, if not in order to find whether a new version of the package is available?
I might not have been clear: what I meant to say in this bullet is that, in general, Pkg.add updates Project.toml in order to list the new dependency, and also updates Manifest.toml. That’s the whole point of Pkg.add, right?
But yes, of course, if the package was already listed as a dependency and no new version was installed, no update is done to these files.
It is not needed in the case where the package is already installed, that’s why I suggested this
Well sure, but how else should we record that we updated the dependencies?
By “later”, do you mean:
- “later within the series of operations performed by Pkg.add”, or
- “later when it will be needed by another Pkg.add or Pkg.update command”?
Again, sorry if I’m not clear: of course it’s the whole point of Pkg.add to manage dependencies and record what’s done in Project.toml and Manifest.toml. And I’m very happy with it (and I venture to say everybody is).
The only issues I had were with my first two bullet points. To be clear: I happily use Pkg.add in order to get the benefit of the third bullet point (update Project.toml), and I would be even happier if this operation involved as few side effects as possible.
Out of the two side effects I had issues with, it turns out that:
- updating the registry is not really needed and there is already work under way to avoid it when possible → thanks !
- there is actually no package update triggered and I didn’t have the problem I thought I had.
So everything is fine. Thanks for your work on Pkg, and sorry if I contributed to spreading misconceptions about it.
I mean that we can check whether the package exists before updating the registry.
and not update the registry if the package exists?
This would be great! Is that implemented in the PR you mentioned above? If not, do you think it could be implemented in a new PR, by someone (like me) who does not know (yet!) the internals of Pkg?
Yea.
No, it’s not, and yea sure, it should not be that hard.
OK, I’ll add it to my TODO list, then.
Pkg.add("DataStructures") takes way too long to be run every time the program starts. Here’s an example:
[andromodon@yogie ~]$ time julia -e 'using Pkg; Pkg.add("Luxor")'
Updating registry at ~/.julia/registries/General
Updating git-repo https://github.com/JuliaRegistries/General.git
Resolving package versions…
Updating ~/.julia/environments/v1.2/Project.toml
[no changes]
Updating ~/.julia/environments/v1.2/Manifest.toml
[no changes]
real 0m33.726s
user 0m2.626s
sys 0m0.387s
33 seconds!
What I want is a use_and_add_first_if_needed (I welcome better names). Ideally, “using” would just add automatically, or maybe there would be a flag that we could apply to julia to auto-add packages that are missing. Or, if there was a way to make Pkg.add take a few milliseconds, that would work too. I feel like it should just return right away (Oops, nothing to do… we already have that), and wait for a Pkg.update to actually do network calls and update repos, etc.
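A rough sketch of such a helper, under the assumption that the unexported Base.find_package is good enough to decide whether the package is already available. Nothing like this is provided by Pkg itself as far as this thread shows; the function below is purely illustrative.

```julia
using Pkg

# Illustrative only: add the package if it cannot be found, then load it.
function use_and_add_first_if_needed(pkg::Symbol)
    name = String(pkg)
    added = false
    if Base.find_package(name) === nothing
        Pkg.add(name)      # slow path, only when the package cannot be found
        added = true
    end
    @eval Main using $pkg  # load into Main, as a top-level `using` would
    return added
end

use_and_add_first_if_needed(:LinearAlgebra)
```

Because the `using` happens at run time, the call should sit at the top level of the script, before any code that refers to the package's names.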