Pkg, private registries and corporate firewalls?

Hi,

At work we’re behind a firewall, we build Julia code into docker images, and we now have a private registry and some packages on Bitbucket (thanks to LocalRegistry.jl!)

This combination, along with some documented (but, as far as I can tell, not yet fixed?) issues with Libgit2, makes for almost more fun than I can cope with, and it would be nice to get a review of my approach.

I have a very shaky working solution, which relies on installing ssh into the base Julia image and using it to get an authorized key for bitbucket and to run ssh-agent.

I have a couple of questions related to this particular way of doing things below, but mainly I’d like to check that I’m not missing better ways of achieving the same thing?

For the way I’m doing it now, it would be nice to tell Libgit2 the equivalent of

Host altssh.bitbucket.org
    StrictHostKeyChecking no

But it doesn’t read ~/.ssh/config. Is there a way of passing this kind of configuration information in or a better way to do this?

It would also be nice to point Libgit2 to the key file rather than start ssh-agent, but I can’t get

env SSH_KEY_PATH=/root/.ssh/id_rsa julia "using Pkg; Pkg..."

to work at all. Any pointers on that?

Thanks!
Geoff

If it’s helpful, an only-just-working docker file is something like this:

FROM julia:latest
COPY id_rsa /root/.ssh/id_rsa
RUN chmod go-rwx /root/.ssh/id_rsa \
  && echo "Host altssh.bitbucket.org\n\tStrictHostKeyChecking no\n" >> /root/.ssh/config \
  && apt-get update && apt-get install -y ssh

WORKDIR /home
COPY Project.toml Project.toml
COPY src/ src/

# Run ssh just to get an authorized key
# I believe I could also `ssh-keyscan altssh.bitbucket.org:443 >> /root/.ssh/known_hosts` but that might be interactive?
RUN ssh -T -p 443 git@altssh.bitbucket.org \
# Start ssh-agent because libgit2 doesn't find the key by itself
  && eval "$(ssh-agent -s)" \
  && ssh-add /root/.ssh/id_rsa \
# `dev` our package before `instantiate` by providing the URL, because the registry URLs aren't ssh URLs
  && julia --project=@. -e"using Pkg; Pkg.develop(PackageSpec(url=\"ssh://git@altssh.bitbucket.org:443/package/Url.jl.git\")); Pkg.instantiate(); try; Pkg.precompile(); catch e; end"

A couple of things to note:

  • that’s not a great way of passing the key in, but that’s not my main issue at the moment.
  • our firewall blocks 22 so we have to connect to Bitbucket through altssh on 443.
  • I’m doing the Pkg.develop because currently our private registry contains https URLs. I haven’t managed to get the ssh-agent key-finding trick for Libgit2 working on Windows, so outside of docker we have to use https URLs for Pkg to work. Hopefully a later version will be:
julia --project=@. -e"using Pkg; Pkg.Registry.add(RegistrySpec(url=\"registry_url\")); Pkg.instantiate(); ...

It sounds like this item from the Julia 1.7 News file may be of interest to you:

  • It is now possible to use an external git executable instead of the default libgit2 library for the downloads that happen via the Git protocol by setting the environment variable JULIA_PKG_USE_CLI_GIT=true.

Julia 1.7 has not been released yet but you might want to try out the functionality with a Julia nightly build.
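
As a rough sketch of what that could look like, assuming Julia 1.7 or later with git and an ssh client installed: once Pkg shells out to command-line git, the regular ssh client is used, and that one does read ~/.ssh/config. The helper name write_altssh_config and the key path in the usage line are made up for illustration.

```shell
# Assumption: Julia 1.7+ with git and an ssh client available on PATH.
# With this set, Pkg shells out to git, and git in turn uses the regular ssh client.
export JULIA_PKG_USE_CLI_GIT=true

# Hypothetical helper: append an entry for Bitbucket's port-443 endpoint
# to an ssh config file, pointing ssh at a specific private key.
write_altssh_config() {
    config_file="$1"   # e.g. "$HOME/.ssh/config"
    key_file="$2"      # e.g. "$HOME/.ssh/id_rsa"
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF
Host altssh.bitbucket.org
    Port 443
    User git
    IdentityFile $key_file
EOF
    chmod 600 "$config_file"
}
```

Usage would be something like write_altssh_config "$HOME/.ssh/config" "$HOME/.ssh/id_rsa" before running julia -e 'using Pkg; Pkg.instantiate()'. The host key still has to be in known_hosts, for example via ssh-keyscan -p 443 altssh.bitbucket.org.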

However, what I would really recommend is to run an internal package server inside your firewall. I wrote the LocalPackageServer package specifically to simplify the access to packages in a private registry. We’re running it with great success at work, and although we don’t have a firewall that blocks anything on the outside it helps a lot with reducing the need for ssh keys to our bitbucket server.

Oh, LocalPackageServer looks handy.

Does the package server sit between the git repos and the “user”, or does it forward the user to the git repos? If it does the former, it could fix most of the problems we’re having?

Can I run LocalPackageServer on a container inside our firewall, and have our users, containers and CI jobs connect to it, and then save them (particularly the containers and CI) from connecting to private repos?

Presumably, whatever LocalPackageServer is running on needs keys for the repos, and ssh-agent running? But that would be much nicer than having those keys in every container.

Thanks!

The former and yes.

Can I run LocalPackageServer on a container inside our firewall, and have our users, containers and CI jobs connect to it, and then save them (particularly the containers and CI) from connecting to private repos?

Yes, that’s very much the idea.

Presumably, whatever LocalPackageServer is running on needs keys for the repos, and ssh-agent running? But that would be much nicer than having those keys in every container.

Yes, LocalPackageServer itself needs keys for the repos. It only needs read-only access though, so bitbucket’s Access keys are ideal for this. I’ve never used ssh-agent but LocalPackageServer uses an external git client so the normal conveniences of a command line git are available. Libgit2 is not used at all for this.

By the way there’s not a whole lot of code involved in getting the local packages. You can check it out for yourself here.

Thanks Gunnar, this sounds perfect!

Hi,

Just a follow up on this. I haven’t got LocalPackageServer.jl deployed yet, but I have it working in docker locally (and behind the firewall).

It all went really smoothly, thank you, except that Project.toml selects particular versions of packages, and, for whatever reason, our proxy server doesn’t co-operate with HTTP.jl 0.8, but does with HTTP.jl 0.9.
So I’ve stripped that out, and, so far, everything is working fine.

Here’s the docker file. Some notes on that:

  • I’m using julia:alpine rather than julia:latest. Latest is Debian Buster which by default uses git 2.20 (vs 2.32) and 2.20 had trouble with the proxy
  • I clone LocalPackageServer.jl into /home rather than add it. I’m not sure if you can reliably work out where a package will be in order to run bin/run_server.jl. (Something like ~/.julia/packages/LocalPackageServer/<hash??>/ but I don’t know the hash)

FROM julia:alpine
WORKDIR /home

# Install git and ssh (which LocalPackageServer needs all the time, not just for install)
RUN apk add --no-cache openssh git

# Get bitbucket into known_hosts (that's where our registry is)
RUN mkdir /root/.ssh && ssh-keyscan -p 443 altssh.bitbucket.org >> /root/.ssh/known_hosts

# Install LocalPackageServer, stripping out the "compat" section of Project.toml
RUN git clone --depth 1 https://github.com/GunnarFarneback/LocalPackageServer.jl.git \
  && cd LocalPackageServer.jl \
  && mv Project.toml Project.old.toml \
  # Strip out the compat section of Project.toml (there must be a better way!)
  && cat Project.old.toml | grep -v -E "[[:alnum:]]+ = \"[[:digit:]]\.[[:digit:]]\"" > Project.toml \
  && julia --project -e"using Pkg; Pkg.update(); Pkg.instantiate(); try; Pkg.precompile(); catch e; end"

COPY deploy_key_ed25519 /root/.ssh/id_ed25519
RUN chmod go-rwx /root/.ssh/id_ed25519

# Get our config
COPY config.toml config.toml

CMD cd LocalPackageServer.jl && julia --project bin/run_server.jl /home/config.toml
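
On the "there must be a better way" comment: one possible alternative to the grep, sketched here with plain awk, is to drop the whole [compat] table rather than matching individual version lines. The function name strip_compat is made up, and the sketch assumes the table header appears on a line of its own as [compat].

```shell
# Hypothetical helper: print a Project.toml with its [compat] table removed.
strip_compat() {
    awk '
        /^\[compat\]/ { skip = 1; next }   # start of the compat table: drop the header
        /^\[/         { skip = 0 }         # any other table header ends the skipped region
        !skip         { print }            # print everything outside [compat]
    ' "$1"
}
```

In the docker file this would be used as strip_compat Project.old.toml > Project.toml. Unlike the grep pattern, it also removes compat entries with more than one version, such as "0.8, 0.9".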

Project.toml doesn’t select particular versions but states what it’s compatible with, and HTTP 0.9 is a breaking upgrade from HTTP 0.8. I haven’t managed to find what breaking changes happened in HTTP 0.9, but it passes the tests and apparently it works for you, so I’ve added 0.9 to the HTTP compat in the pull request “Update compat of HTTP and LocalRegistry” (Pull Request #4, GunnarFarneback/LocalPackageServer.jl on GitHub).

I’m using julia:alpine rather than julia:latest. Latest is Debian Buster which by default uses git 2.20 (vs 2.32) and 2.20 had trouble with the proxy

At some point I’ll probably switch to using the Git package instead of relying on an external git but considering your need for a cutting edge version it seems potentially limiting not having an external git option.

I clone LocalPackageServer.jl into /home rather than add it. I’m not sure if you can reliably work out where a package will be in order to run bin/run_server.jl. (Something like ~/.julia/packages/LocalPackageServer/<hash??>/ but I don’t know the hash)

You can find the path from inside Julia with pathof but that’s not all that useful here. The deployment story can be improved but just cloning the package is fine.
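
For what it’s worth, a small sketch of the pathof route. It assumes LocalPackageServer is installed in the active environment and that the installed copy contains bin/run_server.jl just like the git checkout does. The function name run_server_path is made up.

```shell
# pathof(LocalPackageServer) points at <package root>/src/LocalPackageServer.jl,
# so going up two directory levels gives the package root.
run_server_path() {
    src_file="$1"
    pkg_root="$(dirname "$(dirname "$src_file")")"
    printf '%s\n' "$pkg_root/bin/run_server.jl"
}

# Example call (needs julia on PATH, so it is left commented out here):
# run_server_path "$(julia -e 'using LocalPackageServer; print(pathof(LocalPackageServer))')"
```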

Thanks again Gunnar,

Sorry to keep bugging you, but one more question: what’s your LocalRegistry/LocalPackageServer workflow?

As far as I can tell (and I might have got this wrong), if I am using JULIA_PKG_SERVER=http://my.pkg.server then I can’t register new versions of packages to the LocalRegistry backing the server, so I’ve had to switch between that and registry add local.registry.git.url.

I’m probably doing something wrong, right?

Cheers,
Geoff

To use register you indeed need to have a git clone of your registry.

The simplest way to achieve this in combination with using a local package server is to just remove your registry and add it back with a git url. Packages will still be fetched from the package server but the registry will be fetched from git.

An alternative approach is to clone your registry somewhere else than in ~/.julia/registries and use the registry keyword argument to register to point to it.

It’s in my plans to simplify the latter approach; see the first item of “Towards 1.0 - breaking changes” (Issue #30, GunnarFarneback/LocalRegistry.jl on GitHub).
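
The two approaches could be sketched roughly as below. Both helper names are made up, the registry name, URL, package name and clone path are whatever applies on your side, and the second helper assumes LocalRegistry is installed and the package is available in the active environment.

```shell
# Hypothetical helper: swap the server-provided registry for a git clone.
# $1 = registry name, $2 = git URL of the registry.
readd_registry_from_git() {
    julia -e '
        using Pkg
        Pkg.Registry.rm(ARGS[1])
        Pkg.Registry.add(Pkg.RegistrySpec(url = ARGS[2]))
    ' "$1" "$2"
}

# Hypothetical helper: register a package against a registry clone kept
# outside ~/.julia/registries. $1 = package name, $2 = path to the clone.
register_with_clone() {
    julia -e '
        using LocalRegistry
        register(ARGS[1]; registry = ARGS[2])
    ' "$1" "$2"
}
```

With the first helper, JULIA_PKG_SERVER can stay set, since packages are still fetched from the package server and only the registry comes from git.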
