Julia Hangs After Attempt at Pkg.add("") in Linux Cluster

Hi all,

I’m not new to Julia, but I’m certainly new to running it on a Linux cluster. I’m a neuroscientist and recently needed to run some computation-heavy simulations written in Julia on my school’s Linux cluster. The cluster didn’t have Julia available as a module, so I downloaded the 64-bit Linux binary to my personal environment on the cluster.

From Julia’s command line interface in the cluster, I attempted to run

Pkg.add("Plots")

To my surprise, there was no print output from Julia as there usually is; I left Julia running for about an hour, and when I returned, there was still no print output and I was prevented from executing any further commands, indicating that Pkg.add was still running.

I thought perhaps this was a fluke, so I logged out of the cluster and logged back in and attempted to run

Pkg.add("Juno")

and got the same result as before. I have made several more attempts, each with the same result: Julia appears to just hang, not actually carrying out the procedure (or at least not carrying it out in an appropriate time span).

Is there something I’m missing? Are there special requirements for downloading packages on clusters or with binary distributions of Julia? Searches on Google, this forum, and GitHub haven’t turned up any troubleshooting tips. Any help is much appreciated.

This is a fairly well-known problem. Hopefully it will be fixed in a few days when Pkg3 is released, but in the meantime, this has helped me:
https://github.com/KristofferC/Pkg25.jl

I think the problem is that, when the package system works out how to satisfy the dependencies of a new package, it has to read many small files in the METADATA repo. Normally this is fine, but on clusters (or other systems with remote file storage) each file access takes a relatively long time, so Pkg appears to hang.


To add to this: if you’re using NFS, Pkg (really, METADATA access) is horribly slow. Multiple accesses to small files over NFS mounts are the source of lots of delays.
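
If you want a rough idea of whether small-file access is the bottleneck on your system, you can time it directly. The following is only a sketch: `small_file_probe` is a made-up helper name, the file count is arbitrary, and the timing has one-second resolution, so it only shows large differences. It creates and reads back many tiny files in a directory you choose and prints which filesystem that directory lives on.

```shell
#!/bin/sh
# Rough probe: how long does it take to create and read back n tiny files
# in a given directory? (Hypothetical helper, not part of Julia or Pkg.)
small_file_probe() {
    dir=$1
    n=${2:-500}
    probe="$dir/small_file_probe.$$"
    mkdir -p "$probe" || return 1

    # Show which filesystem/mount the directory is on
    # (an NFS mount typically appears as server:/path in the first column).
    df -P "$dir" | tail -n 1

    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "entry $i" > "$probe/f$i"
        i=$((i + 1))
    done
    # Freshly written files may be served from the client cache, so the
    # read step can look faster than a cold read would be.
    cat "$probe"/f* > /dev/null
    end=$(date +%s)

    rm -rf "$probe"
    echo "$n files in $dir: $((end - start)) s (create + read)"
}
```

Running `small_file_probe "$HOME" 2000` and then `small_file_probe /tmp 2000` should show a clear gap if your home directory is on slow network storage, assuming `/tmp` is node-local on your cluster.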


Also check whether you can actually download from the site (e.g. using wget). Sometimes clusters, or individual nodes on them, do not have full internet access. I have seen programs hang indefinitely on a cluster while attempting to download files, probably because of the network settings. I solved it by doing the downloads on a machine with full access and then copying the files to the cluster.
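
A quick way to run that check without risking another hang is to give wget a timeout. This is a minimal sketch: `check_url` is a hypothetical helper name, the 10-second default is arbitrary, and it assumes a wget that supports `--spider` (GNU wget does).

```shell
#!/bin/sh
# Probe whether this node can reach a URL, giving up after a short timeout
# instead of hanging. (Hypothetical helper; nothing here is Julia-specific.)
check_url() {
    url=$1
    timeout_s=${2:-10}

    if ! command -v wget >/dev/null 2>&1; then
        echo "wget not found on this node" >&2
        return 2
    fi

    # --spider: only check that the resource is there, do not save it
    # --tries=1: no retries, so a blocked connection fails quickly
    if wget --spider --quiet --tries=1 --timeout="$timeout_s" "$url"; then
        echo "reachable: $url"
        return 0
    fi

    echo "NOT reachable within ${timeout_s}s: $url"
    return 1
}
```

Try something like `check_url https://github.com` on both the login node and a compute node; if it fails on one of them, the hang is probably a network restriction rather than the filesystem.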
