Using Julia on campus computers with cloud file storage

How do people who teach Julia in the classroom reconcile Julia’s caching of packages in user-specific $JULIA_PKGDIR directories (e.g. ~/.julia) with the lack of user-specific local storage on typical classroom computers?

I’ll be teaching an undergrad numerical methods course with Julia this fall. I’d like the students to run Julia on the classroom computers (Windows machines). But local disk space on these machines is nonuniform, unreliable, and unprotected, and the university strongly encourages students to store all their files on Box (the university’s cloud file-storage solution). I suppose I can tell students to sit at the same machines every day, find some local disk space, point their JULIA_PKGDIR there, and rebuild their packages if they get wiped out. But this is not an appealing solution.

I’d have them use JuliaBox except for the 500 MB disk space limit. I couldn’t even complete Pkg.add("Plots") on JuliaBox a few weeks ago. It reached the disk quota and crashed.

It looks like you can mount Box storage as a network drive like this and have it automatically reconnect at login. If you have all your students do that and then set JULIA_PKGDIR to somewhere on the mapped drive, that could work (I assume student accounts can set their own environment variables on Windows, although I don’t know for sure).
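Before pointing JULIA_PKGDIR at a mapped drive, it may be worth checking that the location is actually writable from the student account. Here is a minimal sketch of such a check; `choose_pkgdir` is a hypothetical helper (not part of Julia), and the candidate path below is only a stand-in for wherever the mapped Box drive shows up on your machines.

```julia
# Hypothetical helper: return `dir` if a file can be created inside it,
# otherwise return `fallback`.
function choose_pkgdir(dir::AbstractString, fallback::AbstractString)
    try
        mkpath(dir)                            # create the directory if needed
        probe = joinpath(dir, ".write_probe")
        touch(probe)                           # throws if the location is not writable
        rm(probe)
        return dir
    catch
        return fallback
    end
end

# Stand-in path for this demo; on a classroom machine this would be a folder
# on the mapped drive (the drive letter is an assumption, so it is not shown).
candidate = joinpath(tempdir(), "julia_pkgs_demo")
pkgdir = choose_pkgdir(candidate, joinpath(homedir(), ".julia"))
println("Package directory to use: ", pkgdir)
```

The printed path is what you would then put into JULIA_PKGDIR in the Windows environment settings before starting Julia.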

If this works, could you do a quick measurement of package load times? I’m curious if the network will be a bottleneck.
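One rough way to take that measurement is to time a fresh Julia process that does nothing but load the package, once with the package directory on local disk and once with it on the mapped drive. This is only a sketch: `cold_load_time` is a hypothetical name, and the number it returns includes Julia's own startup time, so only the difference between the two runs is meaningful.

```julia
# Time a fresh Julia process that only runs `using <pkg>`.
# Returns wall-clock seconds, including Julia startup.
function cold_load_time(pkg::AbstractString)
    cmd = `$(Base.julia_cmd()) --startup-file=no -e "using $pkg"`
    return @elapsed run(cmd)
end

# Dates ships with recent Julia versions, so this runs without installing
# anything; swap in "Plots" once it is installed in the directory under test.
println("cold load time (s): ", cold_load_time("Dates"))
```

Running it a few times and comparing the first run against later ones should also show how much of the cost is one-time precompilation versus repeated file access.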

JuliaBox quota should be increased to 1GB soon, and we do larger quotas for custom environments for specific classes or universities.

The size of the Plots package is an unfortunate artefact… there are no easy solutions, but I do wish it could be fixed.

Regards

Avik


When using cloud storage to host the Julia package directory, you might run into similar problems as recently discussed in this thread:

I guess cloud storage has similar (or worse) latency compared to NFS, which makes operating on many small files, as git does, painfully slow. One solution that was suggested in that thread is to use this package

for speeding up the Pkg operations. Though personally I haven’t tried that out.


It will be fixed by Pkg3 since we’ll have to re-tag everything. Essentially, the problem which affects Plots.jl and DifferentialEquations.jl in a big way is that they used to have their documentation (which included animations) as part of the repository, along with IJulia notebooks. This was found to bloat the size of the repo, so these were subsequently moved to separate docs-only repos within the orgs. However, deleting these things from the code doesn’t actually delete them from the repository because they stay in the Git history (which is what really causes the bloat in the first place), so now these repos are 100+ MB almost entirely due to the .git history. You can clean these histories by using a repo cleaner, but that will break every tag in METADATA. So instead we have to wait until Pkg3 makes a new METADATA, and then these repos (and any other repo which had the same issues) drop in size by 100 MB.
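If you want to see how much of an installed package is Git history rather than code, a small sketch like the following can report it. `dirsize` and `git_history_share` are hypothetical helpers written for this post, and the result depends on how the repository is packed on disk, so treat it as a rough indication only.

```julia
# Total size in bytes of all files under `path`.
function dirsize(path::AbstractString)
    total = 0
    for (root, dirs, files) in walkdir(path)
        for f in files
            total += filesize(joinpath(root, f))
        end
    end
    return total
end

# Fraction of a package checkout that lives in its `.git` directory.
function git_history_share(repo::AbstractString)
    gitdir = joinpath(repo, ".git")
    isdir(gitdir) || return 0.0
    total = dirsize(repo)
    total == 0 && return 0.0
    return dirsize(gitdir) / total
end
```

Calling `git_history_share` on the directory of an installed package returns a number between 0 and 1; a value close to 1 means nearly all of the disk usage is history.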

Moral of the story is that if you plan on continued development, put the docs in a separate repo, and when the new METADATA comes repo cleaners will be applied to these big libraries and this will fix the size issues for good.

But sorry @John_Gibson that won’t help for the fall. But it does suggest that another solution would be to copy (but not fork! you don’t want the git history) the Plots.jl src, deps, and test to a new repo and have students clone that. In that case, you’d get the “same repo” but it will be only 3MB. You wouldn’t get any updates unless you manually pull them in, but maybe it’s helpful to just solidify a version for teaching purposes anyways. If you do this, make sure you acknowledge the original repo in the license.
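The copy step could be sketched roughly like this. `copy_without_history` is a hypothetical helper, and both paths you pass in are up to you; it copies only the named subdirectories, so the `.git` directory (and with it the history) is left behind.

```julia
# Copy selected subdirectories of a package checkout into a fresh directory,
# leaving `.git` (the history) behind. Returns the names that were copied.
function copy_without_history(repo::AbstractString, dest::AbstractString;
                              parts = ["src", "deps", "test"])
    mkpath(dest)
    copied = String[]
    for part in parts
        from = joinpath(repo, part)
        if isdir(from)                      # skip parts the package does not have
            cp(from, joinpath(dest, part))  # errors if the target already exists
            push!(copied, part)
        end
    end
    return copied
end
```

After that you would run `git init` in the new directory and commit, which gives a repo with a single small commit. I assume you would also need to bring over top-level files such as REQUIRE and the license by hand, since the sketch only handles directories.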
