I recently started having issues downloading some files with Downloads.download(). The failures are not deterministic, but once they appear they seem to persist for a while. For example, right now, with this file:
julia> using Downloads
julia> for line in eachline(Downloads.download("https://files.rcsb.org/view/1LBD.pdb"))
           println(line)
       end
# a lot of lines ending with something broken like:
AT2O
A2.23 6.38 0.00 10.01 O
ATOM 346 CB THR A 45 31.826 -3.087 9.438 1.00 9.68 15.567 1.00 11.03 X179231.8 1.00 11.03 X1792319.438A 2 A 42.90 1.00 17.46 C
AT1.00 CB 5417 CB 5417 CB 5417 CB 5417
where the output should end with something like:
ATOM 1869 CG2 THR A 462 -27.063 71.965 49.222 1.00 78.62 C
ATOM 1870 OXT THR A 462 -25.379 71.816 51.613 1.00 84.35 O
TER 1871 THR A 462
MASTER 369 0 0 12 2 0 0 6 1870 1 0 22
END
These issues come and go for specific files, so I’m unsure how to debug them, or whether the problem lies in Downloads, in the file server, or in a combination of both. Any ideas on what to do?
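One way to narrow this down might be to download the same URL several times and compare checksums, which would at least show whether the content differs between attempts. This is a minimal sketch, assuming the SHA standard library is available; `file_hash`, `download_hashes`, and `count_distinct` are hypothetical helper names, not part of Downloads:

```julia
using Downloads
using SHA

# Hex-encoded SHA-256 of a file on disk.
file_hash(path) = bytes2hex(open(sha256, path))

# Download `url` `n` times and return the hash of each copy.
function download_hashes(url; n=5)
    return [file_hash(Downloads.download(url)) for _ in 1:n]
end

# Number of distinct hashes; 1 means every copy was byte-identical.
count_distinct(hashes) = length(unique(hashes))

# Example (requires network access):
# hashes = download_hashes("https://files.rcsb.org/view/1LBD.pdb")
# count_distinct(hashes) > 1 && @warn "copies differ" hashes
```

Identical hashes would not prove the file is intact, since a cache in front of the server could be handing out the same broken copy each time, but differing hashes would point at the transfer rather than the stored file.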
julia> versioninfo()
Julia Version 1.11.3
Commit d63adeda50d (2025-01-21 19:42 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 20 × 12th Gen Intel(R) Core(TM) i7-12700F
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
Threads: 20 default, 0 interactive, 10 GC (on 20 virtual cores)
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS = 20
JULIA_PKG_PRESERVE_TIERED_INSTALLED = true
Right now I can reproduce it with every Julia version I tried (from 1.6.7 to 1.11.3), and on two completely different machines (although both are at the same university).
This is what I get with verbose=true, in case any of it means something to someone:
julia> for line in eachline(Downloads.download("https://files.rcsb.org/view/1LBD.pdb"; verbose=true))
           println(line)
       end
* WARNING: failed to open cookie file ""
* Couldn't find host files.rcsb.org in the .netrc file; using defaults
* Found bundle for host: 0x20cab790 [can multiplex]
* Re-using existing connection with host files.rcsb.org
* [HTTP/2] [7] OPENED stream for https://files.rcsb.org/view/1LBD.pdb
* [HTTP/2] [7] [:method: GET]
* [HTTP/2] [7] [:scheme: https]
* [HTTP/2] [7] [:authority: files.rcsb.org]
* [HTTP/2] [7] [:path: /view/1LBD.pdb]
* [HTTP/2] [7] [accept: */*]
* [HTTP/2] [7] [user-agent: curl/8.4.0 julia/1.10]
> GET /view/1LBD.pdb HTTP/2
Host: files.rcsb.org
Accept: */*
User-Agent: curl/8.4.0 julia/1.10
< HTTP/2 200
< content-type: text/plain;charset=UTF-8
< date: Tue, 18 Feb 2025 18:33:27 GMT
< server: Apache
< strict-transport-security: max-age=16000000; includeSubDomains; preload;
< vary: Accept-Encoding
< x-cache: Hit from cloudfront
< via: 1.1 24d78a59d537e88caa95647eaadfe050.cloudfront.net (CloudFront)
< x-amz-cf-pop: GRU1-P4
< x-amz-cf-id: Rn-_T-kkeDTmPc4Idw-tqTNhUi27upDO2q3fLPXnxwdIQH3RHkXA6Q==
< age: 1188
< vary: Origin
<
* Connection #0 to host files.rcsb.org left intact
HEADER HYDROLASE 27-JUL-04 1W4P
TITLE BINDING OF NONNATURAL 3'-NUCLEOTIDES TO RIBONUCLEASE A
# etc... at the end the file is broken
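The log shows the request re-using an existing connection. To check whether connection reuse plays any role, it might be worth repeating the download with a fresh `Downloads.Downloader()`, since connections are shared within a single downloader object. This is only a sketch of a diagnostic, not a known fix, and `fresh_download` is a hypothetical helper name:

```julia
using Downloads

# Each Downloader keeps its own connections, so a new one starts
# without any connection left over from earlier requests.
function fresh_download(url)
    return Downloads.download(url; downloader=Downloads.Downloader())
end

# Example (requires network access):
# path = fresh_download("https://files.rcsb.org/view/1LBD.pdb")
# for line in eachline(path)
#     println(line)
# end
```

If the file is still broken with a fresh downloader, connection reuse on the client side is probably not the cause.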
Edit: at the same time, the unit tests of a package that depends on that exact download just passed on CI. But I have been seeing random failures like this in recent CI runs as well, so I do not think it is something specific to my network.
I just confirmed (again) that I do get these errors in occasional CI runs of tests that depend on that download. Rerunning the test may make it pass.
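Since a rerun often passes, a test could validate the download and retry before failing. Complete PDB files end with an END record, as in the expected output above, so a minimal sketch might look like the following. `ends_with_END` and `download_checked` are hypothetical helpers, and the check applies only to the .pdb format, not to .cif:

```julia
using Downloads

# True if the last non-blank line of the file is an END record.
function ends_with_END(path)
    last_line = ""
    for line in eachline(path)
        isempty(strip(line)) || (last_line = line)
    end
    return strip(last_line) == "END"
end

# Download `url`, retrying up to `attempts` times until the file
# passes the END check; throws an error if no attempt does.
function download_checked(url; attempts=3)
    for i in 1:attempts
        path = Downloads.download(url)
        ends_with_END(path) && return path
        @warn "download looks incomplete, retrying" url attempt=i
    end
    error("could not get a complete copy of $url after $attempts attempts")
end
```

This only works around the symptom; it does not explain why the file arrives broken in the first place.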
For future reference: I’m no longer getting the error here. The test below should fail if the problem occurs again; it tries to reproduce the problem with different files from the Protein Data Bank server, where the issues have occurred:
julia> using Downloads, Test
julia> function test_download()
           for pdb_id in ("1lbd", "1bsx")
               for format in ("pdb", "cif")
                   # line counts for the "view" and "download" endpoints
                   nlines = zeros(Int, 2)
                   for (isource, source) in enumerate(("view", "download"))
                       for line in eachline(Downloads.download("https://files.rcsb.org/$source/$pdb_id.$format"))
                           nlines[isource] += 1
                       end
                   end
                   # the two endpoints are expected to return the same number of lines
                   @test nlines[1] == nlines[2]
               end
           end
       end
test_download (generic function with 1 method)
julia> test_download()