Installing packages via an SSH SOCKS proxy on a compute cluster

I spent the better part of a day figuring this out, so I thought it might be worth sharing with others:

In my case, I would like to run Julia on a university cluster with very restricted internet access from inside. That is, most websites and services are not directly reachable from the login nodes (I am not even talking about the compute nodes). This makes it impossible to, e.g., just run Julia and install packages using Pkg, since the package servers and other locations are not available.

If you are using a Linux client with a fairly recent OpenSSH version (>= 7.6, i.e., post-2017) and have SSH access to the remote machine, here is one possible solution:

First, connect to the cluster via SSH and enable reverse dynamic forwarding with

ssh -R 12345 mycluster

This will create a SOCKS4/5 proxy on the remote machine mycluster and have it listen on port 12345. All connections originating on the remote machine that connect to this proxy will then be forwarded to the internet via your local machine.
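To check that the forward is actually up before involving Julia, something like the following can be run on the cluster. This is only a rough sketch: check_proxy is a made-up helper name, https://pkg.julialang.org is just an assumed test URL, and it assumes curl is installed on the cluster.

```shell
# Hypothetical helper (not part of ssh or Julia): succeeds if an HTTPS
# request sent through the SOCKS proxy on localhost:<port> gets an answer.
check_proxy() {
    local port="${1:-12345}"                      # port given to `ssh -R`
    local url="${2:-https://pkg.julialang.org}"   # assumed test URL
    curl --silent --head --max-time 10 --output /dev/null \
         --proxy "socks5://localhost:${port}" "$url"
}
```

For example, `check_proxy 12345 && echo "proxy is up"` should print the message only while the SSH session with the forward is still open.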

To make Julia use the proxy, you need to set the environment variables HTTP_PROXY and HTTPS_PROXY to use the SOCKS proxy, e.g., by starting Julia as

HTTP_PROXY=socks5://localhost:12345 HTTPS_PROXY=socks5://localhost:12345 julia

Now, all normal package operations should work as usual.

In case you want to make this setup permanent, you can add an entry to your local client's ~/.ssh/config file:

Host mycluster
  RemoteForward 12345

This will save you from typing the -R 12345 each time.

You can instruct Julia to always use the SOCKS proxy by adding the following lines to your ~/.julia/config/startup.jl file on the remote machine, saving you from setting the environment variables at each start:

ENV["HTTP_PROXY"] = "socks5://localhost:12345"
ENV["HTTPS_PROXY"] = "socks5://localhost:12345"

If another user on the cluster already occupies the hardcoded port, you need to pick a different one and override both settings again by providing the respective arguments on the command line.
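One way to make the port easy to swap on the cluster side is a small wrapper like the one below. It is a sketch only: with_proxy is a hypothetical name, and 23456 in the usage note is an arbitrary example port.

```shell
# Hypothetical wrapper: run a command with both proxy variables pointing at
# the SOCKS proxy listening on the given port.
with_proxy() {
    local port="$1"   # port that was passed to `ssh -R`
    shift
    env HTTP_PROXY="socks5://localhost:${port}" \
        HTTPS_PROXY="socks5://localhost:${port}" \
        "$@"
}
```

After connecting with ssh -R 23456 mycluster, Julia could then be started as with_proxy 23456 julia --startup-file=no. The --startup-file=no part matters if startup.jl assigns the ENV entries unconditionally as shown above, since those assignments would otherwise replace the values coming from the command line.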

Thanks to the participants of the thread "Install packages behind the proxy" for their discussions on how to set and use proxy servers with Julia.
Thanks to @vchuravy @giordano @simonbyrne for their helpful suggestions and discussions.

Hi!
I tried doing the same on 1.8.0-beta3 and I’m getting the following error:

ERROR: failed to clone from https://github.com/JuliaRegistries/General.git, error: GitError(Code:ERROR, Class:HTTP, unknown http scheme 'socks5')

I assume that the Git library used by Julia does not support socks5, although the system git actually can do socks5 (tested manually).

Is there any way to work around that?

Then you can try setting the environment variable JULIA_PKG_USE_CLI_GIT=true, which makes Pkg use the system git.
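Put together with the proxy settings from the first post, the shell setup on the cluster might look roughly like this (a sketch, assuming the forward from ssh -R 12345 is active and a system git is installed):

```shell
# Use the system `git` for Pkg's Git operations.
export JULIA_PKG_USE_CLI_GIT=true
# Proxy settings as in the original post; 12345 is the port given to `ssh -R`.
export HTTP_PROXY=socks5://localhost:12345
export HTTPS_PROXY=socks5://localhost:12345
```

Julia started from this shell inherits all three variables. To test the system git on its own first, git ls-remote https://github.com/JuliaRegistries/General.git HEAD should list a commit hash if the proxy is reachable.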

@giordano thanks a lot, using the system git works perfectly!

This worked for me too, thanks for sharing!

You’re welcome! In the meantime, I found an even better approach for specifying the HTTP/HTTPS proxies: instead of using the environment variables HTTP_PROXY and HTTPS_PROXY, I recommend setting the lowercase variants http_proxy and https_proxy, e.g., in your ~/.bashrc file:

export http_proxy=socks5://localhost:12345
export https_proxy=socks5://localhost:12345

This works with Julia and Pkg, but as an added bonus it also works with curl! For example, to download the latest stable Julia version, you can now just execute

curl -OJL https://julialang-s3.julialang.org/bin/linux/x64/1.8/julia-1.8.1-linux-x86_64.tar.gz

Unfortunately, I cannot edit the original post anymore, but I thought I’d share this here anyway.

Note that on some machines that lack even DNS access, you will need

export http_proxy=socks5h://localhost:12345
export https_proxy=socks5h://localhost:12345

which makes the proxy resolve domain names, so that DNS lookups also go through the SOCKS tunnel.
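If the same ~/.bashrc is shared between machines with and without DNS access, the scheme could be picked automatically. The following is a sketch under a few assumptions: can_resolve and proxy_url are hypothetical helper names, getent is available (as on typical glibc-based Linux systems), and github.com is just an example host to probe.

```shell
# Hypothetical helper: succeeds if the machine itself can resolve the host.
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Hypothetical helper: print a proxy URL for the given port, using socks5h
# (name resolution done by the proxy) when local DNS lookups fail.
proxy_url() {
    local port="$1"
    local host="${2:-github.com}"   # example host used for the DNS probe
    if can_resolve "$host"; then
        echo "socks5://localhost:${port}"
    else
        echo "socks5h://localhost:${port}"
    fi
}
```

In ~/.bashrc this could be used as export http_proxy="$(proxy_url 12345)" followed by export https_proxy="$http_proxy". Simply using socks5h everywhere is probably the simpler choice if nothing depends on local resolution.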
