Privately hosting BinaryBuilder products with authentication

I’m trying to host some binaries I created with BinaryBuilder on my work’s internal web server. Accessing the server requires authentication.

I can use curl at the command line to download an artifact. By providing my username, I can get curl to prompt me for my password.

curl -O -u dmatz -L SOME_ARTIFACT_URL
Enter host password for user 'dmatz':
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  243k  100  243k    0     0   338k      0 --:--:-- --:--:-- --:--:--  346k

However, when I try to use the JLL package created by BinaryBuilder, with the same artifact URL, I get a 401 response without any opportunity to enter my password.

Is there a way to get Pkg to prompt me for my password? Is anyone else hosting BinaryBuilder products privately?

If you have a netrc file, Pkg (and any other usage of Downloads) will respect it.
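As a minimal sketch of what such an entry could look like (the host name artifacts.example.com and the password placeholder are hypothetical, and the script writes to a scratch file unless NETRC_FILE is set, so it is safe to try):

```shell
#!/bin/sh
# Sketch: write a netrc entry for a password-protected artifact host.
# NETRC_FILE defaults to a scratch file here; set it to "$HOME/.netrc" for real use.
# The host, login, and password below are placeholders, not real values.
NETRC_FILE="${NETRC_FILE:-$(mktemp)}"

cat >> "$NETRC_FILE" <<'EOF'
machine artifacts.example.com
login dmatz
password REPLACE_WITH_PASSWORD
EOF

# The file holds a plain-text password, so restrict it to the owner.
chmod 600 "$NETRC_FILE"
echo "wrote netrc entry to $NETRC_FILE"
```

With the entry in the conventional ~/.netrc location, curl can be told to read it with its --netrc option instead of -u.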

Thank you! That did indeed work for me!

Does Pkg support any other methods of authentication besides a password? I’m learning from our sys admins that they will soon have to disable password authentication for all of our servers…

I realized I was suffering from some tunnel vision by only looking at HTTP options. Since I want to host these for internal use, I can just put them on our compute cluster and use SCP or SFTP to download them. That way we can use SSH keys to authenticate.

The issue now is that it is a bit tricky to get curl or Downloads.download to download a file from our server. I can easily download a file using scp:

$ scp my_server:.bashrc test.bashrc
.bashrc                                                          100%  401    10.8KB/s   00:00

I can’t do it with curl like this:

$ /usr/local/Cellar/curl/7.85.0/bin/curl --verbose scp://my_server//home/dmatz/.bashrc
# cutting out some details
* SSH authentication methods available: publickey
* Using SSH private key file '/Users/dmatz/.ssh/id_rsa'
* SSH public key authentication failed: Unable to extract public key from private key file: Wrong passphrase or invalid/unrecognized private key file format
* No identity would match
* Authentication failure
* Closing connection 0
curl: (67) Authentication failure

It seems I have to specify my username to get it to work:

$ /usr/local/Cellar/curl/7.85.0/bin/curl --verbose scp://my_server//home/dmatz/.bashrc --user dmatz:

If I then give Downloads.download on Julia 1.8.2 a try, I find that this doesn’t work:

julia> import Downloads
julia> Downloads.download("scp://my_server//home/dmatz/.bashrc", verbose = true)
# cutting out some details
* SSH authentication methods available: publickey
* Using SSH public key file '/Users/dmatz/.ssh/id_rsa.pub'
* Using SSH private key file '/Users/dmatz/.ssh/id_rsa'
* SSH public key authentication failed: Username/PublicKey combination invalid
* No identity would match
* Authentication failure
* Closing connection 0

Again, it seems it needs my username:

julia> Downloads.download("scp://dmatz@my_server//home/dmatz/.bashrc", verbose = true)

Is there some other option we can add to the libcurl config to get Downloads.download to work a bit better for SCP and SFTP?

How else would you like it to work? Specify the username as a keyword?

It looks like CURLOPT_USERNAME can be used with scp for downloads.

https://curl.se/libcurl/c/CURLOPT_USERNAME.html

You could try defining the host in your SSH config with your username set there for the host. I’m not 100% sure, but I think this should be picked up.

I was thinking the same thing at first. I even have a modified version of Downloads.jl that adds a keyword argument for the username and sets CURLOPT_USERNAME, which does indeed allow me to authenticate with SSH keys. However, I don’t see how this would be useful with artifacts. How would we allow users to set their own username? And for multiple servers?

What I was really trying to show is that SSH authentication in libcurl for some reason requires you to set the username. Yet SSH authentication works fine for ssh, scp, and other command line utilities.

This is also what I’ve been thinking. It seems like libcurl should honor my SSH config, but it doesn’t. Even the curl command line utility isn’t honoring it, so it’s not just something we are configuring incorrectly…

Digging into the libcurl change log, I see that there are quite a few bug fixes related to their interaction with libssh2 in versions that are newer than the one we are currently using. I also see some open issues on Yggdrasil to upgrade to a newer version of libcurl. Perhaps this will work better with a newer version?

It seems like curl doesn’t honor the SSH config file after all. See curl/curl issue #9285 on GitHub, “curl with sftp does not honor Port in ssh config file”.

We compile curl against LibSSH2.

We probably need to refer to the LibSSH2 documentation for an external config file.

Ah, I see. So curl can be compiled with one of several ssh backends. It looks like libssh does support the SSH config file, but libssh2 does not. And, unfortunately, I don’t see any discussion in the libssh2 documentation of an alternate config file.

So, if we want to support SSH key authentication, I guess I’m back to figuring out a way to set CURLOPT_USERNAME in an automated fashion. Should we try to parse the SSH config file ourselves, ignoring most keys, but grabbing things like usernames and SSH key paths?
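To make the parsing idea concrete, here is a rough sketch in shell rather than a proposal for how Downloads.jl would do it. It assumes a simplified SSH config: only plain "Host alias" blocks with space-separated keys, no wildcards, Match blocks, or Include directives. The function name ssh_user_for_host and the demo config are made up for illustration.

```shell
#!/bin/sh
# Sketch: look up the User for a host alias in an SSH config file, then build an
# scp:// URL that carries the username explicitly.

ssh_user_for_host() {
    # $1 = host alias, $2 = path to an SSH config file
    awk -v host="$1" '
        tolower($1) == "host" {
            # A new Host block starts; check whether it names our alias.
            in_block = 0
            for (i = 2; i <= NF; i++) if ($i == host) in_block = 1
            next
        }
        in_block && tolower($1) == "user" { print $2; exit }
    ' "$2"
}

# Demo on a scratch config; the aliases and users are hypothetical.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
Host other_server
    User someone_else

Host my_server
    HostName my_server.example.com
    User dmatz
    IdentityFile ~/.ssh/id_rsa
EOF

user="$(ssh_user_for_host my_server "$demo_config")"
url="scp://${user}@my_server//home/dmatz/.bashrc"
echo "$url"
```

The same lookup could grab IdentityFile for the key paths. A real implementation would have to decide how much of the SSH config grammar to support.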

My overall goal here was to explore how I can host artifacts with authentication. It seems like there’s really only one option so far, which is to use HTTP with a username and password in the .netrc file. Are there any other options I’m missing?

Any news about this? I’m trying to host artifacts on a self-hosted GitLab instance with only SSH access, but so far I have not found a way to retrieve them…

I’ve been continuing to think about it, and I was hoping to put together a proposal and a PR soon, but I’ve been on paternity leave and just haven’t gotten around to it yet.

I think we need to add a way for users to specify a configuration for Downloads.jl. We need this because libssh2 doesn’t honor the .ssh/config file, and libcurl doesn’t have a config file of its own (well, there is the --config option and the curlrc files, but they don’t let you specify options based on the host). I found a discussion about SSH config files and the need for a better curl config file on the curl mailing list last year. It would be wonderful if that happened someday, but it doesn’t help us in the near term.

I think we could keep it pretty simple. For a specified host and protocol, we could support setting the username, password, and paths to the SSH public and private keys. I’m not sure whether it would be with a config file or just a global config object that people can modify in their startup.jl files.

In the meantime, @BambOoxX, I’ve been downloading the artifacts manually and creating a depot that folks can point to using JULIA_DEPOT_PATH. Since they don’t need to download anything, they don’t need to authenticate! This is on our compute cluster with a shared file system.
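A minimal sketch of the environment setup this relies on, assuming a hypothetical shared depot at /shared/julia_depot that already contains the downloaded artifacts:

```shell
#!/bin/sh
# Sketch: make Julia search a shared, pre-populated depot in addition to the
# user's own depot. SHARED_DEPOT is a hypothetical path on the shared file system.
SHARED_DEPOT="/shared/julia_depot"

# The first entry is the depot Julia writes to; later entries are also searched
# for packages and artifacts that are already installed.
JULIA_DEPOT_PATH="$HOME/.julia:$SHARED_DEPOT"
export JULIA_DEPOT_PATH

echo "$JULIA_DEPOT_PATH"
```

Keeping the user depot first means new installs still land in the user’s own directory, while artifacts already present in the shared depot are found without any download.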

Correct me if I’m wrong, but in my case, I host the package on our private GitLab, as well as some dependencies. Therefore, both pulling a package and pulling dependencies need the same type of authentication to reach the data on the server.
I would have expected both to succeed or both to fail, but that is not the case. Is it because pulling the package is done via git (which honors the SSH configuration), and pulling the dependencies is done via curl?

Right. For git repos, we use libgit2, and for artifacts, we use libcurl.

I generally use an SSH agent to get authentication to work for libgit2. I see that libcurl also supports using an SSH agent, but in practice I haven’t been able to get it to work. I think it’s related to setting the username, which I discussed above. If we could get it to work, perhaps we wouldn’t even need to add a way to configure Downloads.jl.

FYI, it seems that the GitLab API does not support .netrc authentication, which of course you cannot do anything about, but is it possible to use other authentication mechanisms in Julia? See also “How to configure an artifact to use a header during package download?” (post #2 by BambOoxX).

Whatever GitLab is doing is not a standard HTTP authentication mechanism, so I’m not sure what can be done here. They’re using a bespoke header to pass a secret token, which Downloads does allow you to do. What else would you want?

I do not want to digress too much from the original post, but I think what I am missing in my case is simply how to configure an artifact during package development so that Downloads uses this header for the artifact download during package installation. We can discuss this further in my own post if you prefer.

You don’t configure the artifact, you would add a download hook that looks at the URL for a download and if the host name is gitlab.com, then it inserts the given header. I think that should be possible.
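The host-matching logic can be illustrated at the command line. This is only a shell analogue of the idea, not the Downloads hook itself. It assumes a hypothetical self-hosted instance at gitlab.example.com, a token in the GITLAB_TOKEN environment variable, and GitLab’s PRIVATE-TOKEN header; the function name header_for_url and the URL path are made up.

```shell
#!/bin/sh
# Sketch: pick an extra request header based on the host name in a URL.
GITLAB_TOKEN="${GITLAB_TOKEN:-REPLACE_WITH_TOKEN}"   # placeholder token

header_for_url() {
    # Strip the scheme, any userinfo, and everything after the host name.
    host="$(printf '%s\n' "$1" | sed -E 's#^[a-z]+://([^/@]*@)?([^/:]+).*#\2#')"
    case "$host" in
        gitlab.example.com) printf 'PRIVATE-TOKEN: %s\n' "$GITLAB_TOKEN" ;;
    esac
}

url="https://gitlab.example.com/hypothetical/path/artifact.tar.gz"
header="$(header_for_url "$url")"
if [ -n "$header" ]; then
    # Print the curl command that would carry the header (not executed here).
    echo curl -L -O --header "$header" "$url"
fi
```

URLs for any other host produce no header, so downloads from other servers are left untouched.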

Please correct me if I’m wrong, but from what you say, it sounds like something has to be configured on the package user’s side for this solution to work. Is that so?
Also, I have no problem downloading something through Downloads; what I need is to be able to use the same approach with artifacts. Basically, whatever lets me download an artifact manually should also work when the artifact is installed with the package. I just do not know if that’s possible.