How to set up Julia & Flux with CUDA & cuDNN on a computing cluster?

Hello,

I recently switched to Julia from Python for both deep learning and scientific computing. I use a national cluster for my calculations, where Julia 1.0.3 is installed as well as CUDA 10.1.105. I’d like to set up Flux to use GPUs. I already installed cuDNN locally.

The problem is that it is not possible to install packages from Julia’s package manager on the cluster. It seems that it cannot download the packages from GitHub (a permissions problem?). I managed to install some packages by downloading the tarball from GitHub on my local machine and then uploading it to the right place, but this does not always work.

I wonder if there is a way to make the package manager work on a cluster. Does anybody have experience with this?

Thanks.

Check whether there is a proxy on the cluster. If there is, you will have to configure git to download through it, see:

Does the cluster have access to the internet? If not, the solution could be more difficult.
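A rough way to look for a proxy from a shell on the cluster is to print the conventional proxy environment variables. This is only a sketch: the function name `show_proxy_env` is made up for illustration, and an empty result does not prove there is no proxy, only that none is configured through these variables.

```shell
#!/usr/bin/env bash
# Print any conventional proxy environment variables set in this session.
# show_proxy_env is a hypothetical helper name, not part of any tool.
show_proxy_env() {
    local name value found=0
    for name in http_proxy https_proxy HTTP_PROXY HTTPS_PROXY no_proxy NO_PROXY; do
        value="${!name}"          # indirect expansion: value of the variable named by $name
        if [ -n "$value" ]; then
            echo "$name=$value"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no proxy variables set"
    fi
}

show_proxy_env
```

If nothing is printed beyond the fallback message, the system admins are still the ones who can say whether a proxy exists and what its address is.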

Thank you for your answer!
I am not very familiar with proxies. How can I check if there is a proxy on the cluster? I believe it has access to the internet though.

Configuring git to use a proxy seems to be easy, but which proxy should I make it use?

Thank you again, and sorry if I am not very familiar with all this.

If you succeed I would like to know your process… we tried it at my institution and the IT guys didn’t manage to install CuArrays properly… so no GPUs…

Do you have any log files from your attempt on the cluster?

Nothing really helpful in .julia/logs. Here is the error message:

ERROR: failed to clone from https://github.com/JuliaRegistries/General.git, error: GitError(Code:ERROR, Class:Net, curl error: Failed to connect to github.com port 443: Connection timed out)

Then, as others suggested, it’s probably proxy-related (the port is blocked, assuming there is access to the internet). The easiest approach is to simply contact the system admins and configure your .gitconfig file accordingly.
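Once the admins provide a proxy address, the configuration might look like the sketch below. The helper names `set_git_proxy` and `export_proxy_env` and the address `http://proxy.example.org:3128` are placeholders invented for illustration; `http.proxy` is the standard git setting. Whether Julia's package manager picks up the git setting may depend on the Julia version, so exporting the environment variables as well is a common precaution.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the proxy URL must come from your system admins.

# Store the proxy in a git config file (defaults to ~/.gitconfig).
# Usage: set_git_proxy PROXY_URL [CONFIG_FILE]
set_git_proxy() {
    local proxy_url="$1"
    local config_file="${2:-$HOME/.gitconfig}"
    git config --file "$config_file" http.proxy "$proxy_url"
}

# Export the conventional proxy variables for the current shell session.
# Usage: export_proxy_env PROXY_URL
export_proxy_env() {
    local proxy_url="$1"
    export http_proxy="$proxy_url" https_proxy="$proxy_url"
    export HTTP_PROXY="$proxy_url" HTTPS_PROXY="$proxy_url"
}

# Example (placeholder address, not a real proxy):
#   set_git_proxy "http://proxy.example.org:3128"
#   export_proxy_env "http://proxy.example.org:3128"
```

After that, retrying the package installation from a fresh Julia session in the same shell shows whether the timeout on port 443 is gone.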

Hello @Massimiliano_Comin, I agree that there may be a proxy configured on your cluster. As @balinus says, just ask your friendly system admins.

Also, please ask their advice on which filesystem is suitable for your project. Often on an HPC cluster the HOME filesystem has a small storage space per user and is slow. HPC clusters have parallel filesystems which are much more suitable to the task. You should be told where to put your executables/libraries and where to create your data.

Thank you, I will try to contact the system admins to set up the git proxy properly.
As for the file systems, we do indeed have a workspace where we can put execs, libs, etc., so it will not be a problem.

The cluster is not connected to the internet, meaning I can only use SCP. How can I install packages then?

Thank you for your help.
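For a cluster with no internet access, one approach that may work is to build a Julia depot on a connected machine and copy it over with SCP. The sketch below assumes the connected machine runs the same Julia version and the same OS/architecture as the cluster; the helper names `pack_depot` and `unpack_depot`, the host `cluster.example.org`, and the paths are placeholders. `JULIA_DEPOT_PATH` is the environment variable Julia reads to locate its depot. Packages with build steps that detect local libraries (CuArrays and CUDA, for instance) may still need `Pkg.build` to be run on the cluster afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for moving a Julia depot to an offline machine.

# Pack a depot directory into a gzipped tarball.
# Usage: pack_depot DEPOT_DIR TARBALL
pack_depot() {
    local depot_dir="$1" tarball="$2"
    tar -czf "$tarball" -C "$(dirname "$depot_dir")" "$(basename "$depot_dir")"
}

# Unpack the tarball under TARGET_DIR (created if missing).
# Usage: unpack_depot TARBALL TARGET_DIR
unpack_depot() {
    local tarball="$1" target_dir="$2"
    mkdir -p "$target_dir"
    tar -xzf "$tarball" -C "$target_dir"
}

# On the connected machine (placeholder paths and host):
#   export JULIA_DEPOT_PATH="$PWD/offline_depot"
#   julia -e 'using Pkg; Pkg.add("Flux")'
#   pack_depot "$PWD/offline_depot" offline_depot.tar.gz
#   scp offline_depot.tar.gz user@cluster.example.org:/path/to/workspace/
#
# On the cluster:
#   unpack_depot offline_depot.tar.gz /path/to/workspace
#   export JULIA_DEPOT_PATH=/path/to/workspace/offline_depot
```

Putting the depot in the cluster workspace rather than HOME also keeps it off the small, slow home filesystem mentioned earlier in the thread.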