Downloads.download: ERROR: Operation too slow. Less than 1 bytes/sec transferred

On Julia 1.6.6 I keep getting the following error at various stages (sometimes after ~500, sometimes after ~1500 downloaded files) when trying to download a few thousand files:

ERROR: Operation too slow. Less than 1 bytes/sec transferred the last 20 seconds while requesting https://address

when executing:

using Downloads

for i in j
    Downloads.download(url, output; timeout = 90)
end

I have never encountered anything like this before. It seems that the server with the source files is operating without any problems.

My questions are: How can I increase “the period of 20 seconds” to a larger value? Is it possible at all? Would you have any recommendations related to this problem?

Below I am providing more verbose info:

Downloading files: 83%|██████████████████████████▉ | ETA: 0:00:11

  • Found bundle for host server.name: 0xcab63d0 [serially]
  • Can not multiplex, even if we wanted to!
  • Re-using existing connection! (#15) with host server.name
  • Connected to server.address (server.ip) port 443 (#15)

GET /cgi-bin/file HTTP/1.1
Host: server.address
Accept: */*
User-Agent: curl/7.73.0 julia/1.6

  • Operation too slow. Less than 1 bytes/sec transferred the last 20 seconds
  • Closing connection 15
    ERROR: LoadError: Operation too slow. Less than 1 bytes/sec transferred the last 20 seconds while requesting https://server.address/cgi-bin/file
    Stacktrace:
    [1] (::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool})(easy::Downloads.Curl.Easy)
    @ Downloads /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:369
    [2] with_handle(f::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, handle::Downloads.Curl.Easy)
    @ Downloads.Curl /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Curl/Curl.jl:64
    [3] #8
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:311 [inlined]
    [4] arg_write(f::Downloads.var"#8#17"{Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, arg::IOStream)
    @ ArgTools /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/ArgTools/src/ArgTools.jl:112
    [5] #7
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:310 [inlined]
    [6] arg_read
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/ArgTools/src/ArgTools.jl:61 [inlined]
    [7] request(url::String; input::Nothing, output::IOStream, method::Nothing, headers::Vector{Pair{String, String}}, timeout::Int64, progress::Nothing, verbose::Bool, throw::Bool, downloader::Nothing)
    @ Downloads /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:309
    [8] (::Downloads.var"#3#4"{Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Nothing, String})(output::IOStream)
    @ Downloads /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:222
    [9] open(f::Downloads.var"#3#4"{Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Nothing, String}, args::String; kwargs::Base.Iterators.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:write,), Tuple{Bool}}})
    @ Base ./io.jl:330
    [10] arg_write(f::Function, arg::String)
    @ ArgTools /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/ArgTools/src/ArgTools.jl:86
    [11] download#2
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Downloads/src/Downloads.jl:221 [inlined]
    [12] macro expansion
    @ ~/data/destination/code.jl:386 [inlined]
    [13] macro expansion
    @ ~/.julia/packages/ProgressMeter/sN2xr/src/ProgressMeter.jl:938 [inlined]
    [14] top-level scope
    @ ~/data/destination/code.jl:385
    in expression starting at /home/user/data/nomads_julia/code.jl:377

I did some additional testing. I do not think that the fault is on Julia 1.6.6. I am also 100% sure that the server from which I am downloading data is ok.

I think that the problem is associated with my non-standard Julia setup. I am running Julia inside a privileged docker container that contains the systemd initialization system and runs entirely from a ramdisk on a cloud virtual machine with two ports exposed. Yeah, I know. However, I have to admit that I tested this setup for almost two months and it worked really well. The only problem I have found so far occurs when trying to download slightly more files than usual with Julia. This, together with the docker setup, must be causing some network problems.

My question would be - what can I do to find the exact cause of it? Is there maybe any software that I can use to diagnose the problem? I’d really appreciate even the slightest hint as I do not know where to start and I was really happy with this setup so far.
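One low-tech way to start narrowing this down, before touching any docker network settings, is to let the loop run to the end and record which requests fail and how long each one took before failing. The following is only a minimal sketch under stated assumptions: `survey_failures`, `fetch`, and `jobs` are hypothetical names that do not appear in my script, `fetch` stands for any function of `(url, output)` such as `Downloads.download`, and `jobs` stands for a collection of `(url, output)` pairs.

```julia
# Hypothetical helper (not part of Downloads.jl): try every job, never abort,
# and collect one record per failed request.
function survey_failures(fetch, jobs)
    failures = NamedTuple[]
    for (index, (url, output)) in enumerate(jobs)
        started = time()
        try
            fetch(url, output)
        catch err
            # Let Ctrl-C through; record everything else.
            err isa InterruptException && rethrow()
            push!(failures, (index = index,
                             url = url,
                             seconds = time() - started,
                             message = sprint(showerror, err)))
        end
    end
    return failures
end

# Possible use with the real downloader (requires `using Downloads`):
# failures = survey_failures((u, o) -> Downloads.download(u, o; timeout = 90), jobs)
```

If the recorded failures cluster in time or always hit the same URLs, that points towards the server or a network hop; if they are scattered randomly, the client side becomes the more likely suspect.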

Well, I updated the underlying system (VERSION=“20.04.4 LTS (Focal Fossa)”, Kernel: 5.11.0-1028-oracle on Host: KVM/QEMU (Standard PC (i440FX + PIIX, 1996) pc-i440fx-4.2), with Docker version 20.10.14, build a224086), made a new docker build (OS image: Debian GNU/Linux bookworm/sid x86_64), and rebooted the VM. The problem may be a little less frequent, but it is still there.

My current assumptions are:
a) local traffic is going through a proxy process, with a possible docker solution: --userland-proxy=false;
b) RSC (receive segment coalescing) packets are violating the MTU and WinNat;
or
c) the virtualization layer is incompatible with WinNat;
but this kind of network / docker stuff is a bit of black magic to me, and I am afraid that I might end up breaking the whole system.

Another option to pursue is getting the --net=host docker option to work, but it seems that with this option my remote desktop cannot obtain its license.

I will try to find out:

  • how to increase “the period of 20 seconds” to a larger value;
  • whether any other backend, such as aria2c, can be used instead of curl for Downloads.download;
    or
  • how to program Julia not to crash but to repeat the problematic file download until it is done correctly.

Should you have any suggestions, I would really appreciate them.

Edit: I would like to underline that the problem is occurring at various stages of the download process. I am downloading about 5000 small files. Sometimes it is happening at 15%, sometimes at 80% of the whole download process.

Edit: Topic edited (indicated that the problem is occurring inside a docker container)

I did additional testing on x86 and ARM Neoverse, on new VMs at a major cloud provider, with both paravirtualized and hardware-assisted networking and with no docker present at all. My initial assumption was that the problem might be related to my slightly non-standard docker setup. However, to the best of my current knowledge, Julia 1.7.2 is simply not stable when downloading files (at least in my case).

If anybody comes across this thread, here are some hints on the questions I was interested in:

i) how to increase “the period of 20 seconds” to a larger value? (based on issue #168, “Expose setting CURLOPT_LOW_SPEED_TIME”, in JuliaLang/Downloads.jl on GitHub)

using Downloads

downloader = Downloads.Downloader()
# Raise curl's low-speed window from the default 20 seconds to 60 seconds
downloader.easy_hook = (easy, info) -> Downloads.Curl.setopt(easy, Downloads.Curl.CURLOPT_LOW_SPEED_TIME, 60)
Downloads.download(url, "foo.json"; downloader=downloader)

ii) is any other backend like aria2c available instead of curl for Downloads.download?

I do not think so. AFAIK, there is no such option by default. Maybe there is a trick similar to the one with CURLOPT_LOW_SPEED_TIME, but I am not currently aware of one.

iii) is it possible to program Julia not to crash but to repeat the problematic file download?

Maybe (I don’t know), but in its default / current form, the script just stops with the error.
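The stack traces in this thread show the error propagating as an ordinary Julia exception, so one option is to catch it and try the same file again. The following is only a minimal sketch, not something I ran against the real server: `with_retries`, `attempts`, and `pause` are hypothetical names, and the helper simply calls whatever zero-argument function it is given.

```julia
# Hypothetical helper: call `fetch_one()` up to `attempts` times,
# sleeping `pause` seconds between failed tries.
function with_retries(fetch_one; attempts::Integer = 5, pause::Real = 2.0)
    attempts >= 1 || throw(ArgumentError("attempts must be at least 1"))
    for attempt in 1:attempts
        try
            return fetch_one()
        catch err
            # Give up after the last attempt, and never swallow Ctrl-C.
            (attempt == attempts || err isa InterruptException) && rethrow()
            @warn "Download failed, retrying" attempt err
            sleep(pause)
        end
    end
end

# Possible use inside the download loop (requires `using Downloads`):
# with_retries(() -> Downloads.download(url, output; timeout = 90))
```

A fixed pause keeps the sketch short; a growing delay between attempts may be kinder to the server if the stalls come in bursts.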

P.S.
I also did some testing outside Julia with aria2. With aria2 I am able to download all the files without any problems. I have not tested curl outside Julia (reason: I have never used it extensively for such purposes; most of my experience with large numbers of downloads relates to aria2 and Julia).

EXAMPLE CORRECT DOWNLOAD:

  • Couldn’t find host host.name in the .netrc file; using defaults

  • Trying ip:443…

  • Connected to host.name (ip) port 443 (#0)

  • mbedTLS: Connecting to host.name:443

  • mbedTLS: Set min SSL version to TLS 1.0

  • ALPN, offering h2

  • ALPN, offering http/1.1

  • Couldn’t find host host.name in the .netrc file; using defaults

  • Found bundle for host host.name: 0x12321c20 [serially]

  • Server doesn’t support multiplex (yet)

  • Hostname host.name was found in DNS cache

  • Trying ip:443…

  • Connected to host.name (ip) port 443 (#1)

  • mbedTLS: Connecting to host.name:443

  • mbedTLS: Set min SSL version to TLS 1.0

  • ALPN, offering h2

  • ALPN, offering http/1.1

  • mbedTLS: Handshake complete, cipher is TLS-ECDHE-RSA-WITH-AES-256-GCM-SHA384

  • Dumping cert info:

  • cert. version : 3

  • serial number : 68:AC:A0:36:8D:0E:46:63

  • issuer name : C=US, ST=Arizona, L=Scottsdale, O=GoDaddy.com, Inc., OU=Repository, CN=Go Daddy Secure Certificate Authority - G2

  • subject name : CN=*.domain

  • issued on : 2021-08-03 14:59:52

  • expires on : 2022-09-04 14:59:52

  • signed using : RSA with SHA-256

  • RSA key size : 2048 bits

  • basic constraints : CA=false

  • subject alt name :

  • dNSName : *.domain
    
  • dNSName : domain
    
  • key usage : Digital Signature, Key Encipherment

  • ext key usage : TLS Web Server Authentication, TLS Web Client Authentication

  • certificate policies : ???, ???

  • ALPN, server accepted to use http/1.1

  • SSL connected

GET /cgi-bin/database.path HTTP/1.1
Host: host.name
Accept: */*
User-Agent: curl/7.73.0 julia/1.7

  • Mark bundle as not supporting multiuse
    < HTTP/1.1 200 OK
    < Date: Mon, 02 May 2022 01:41:52 GMT
    < Server: Apache
    < X-Frame-Options: SAMEORIGIN
    < X-Content-Type-Options: nosniff
    < X-XSS-Protection: 1; mode=block
    < Content-Transfer-Encoding: binary
    < Content-Disposition: attachment; filename="file.name"
    < Content-Description: type file
    < X-Frame-Options: SAMEORIGIN
    < X-Content-Type-Options: nosniff
    < X-XSS-Protection: 1; mode=block
    < Content-Type: application/octet-stream
    < Via: 1.1 host.name-80
    < Cache-Control: max-age=14400
    < Expires: Mon, 02 May 2022 05:41:53 GMT
    < Transfer-Encoding: chunked
    < Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
  • Added cookie NSC_ESNS=“1376a6a7-36e0-126f-9678-00e0ed62d4ec_3524236521_1888331066_00000000004620540398” for domain host.name, path /, expire 1651455727
    < Set-Cookie: NSC_ESNS=1376a6a7-36e0-126f-9678-00e0ed62d4ec_3524236521_1888331066_00000000004620540398; Path=/; Expires=Mon, 02-May-2022 01:42:07 GMT

EXAMPLE ERROR:

  • Mark bundle as not supporting multiuse
    < HTTP/1.1 200 OK
    < Date: Mon, 02 May 2022 01:13:43 GMT
    < Server: Apache
    < X-Frame-Options: SAMEORIGIN
    < X-Content-Type-Options: nosniff
    < X-XSS-Protection: 1; mode=block
    < Content-Transfer-Encoding: binary
    < Content-Disposition: attachment; filename="file.name"
    < Content-Description: type file
    < X-Frame-Options: SAMEORIGIN
    < X-Content-Type-Options: nosniff
    < X-XSS-Protection: 1; mode=block
    < Content-Type: application/octet-stream
    < Via: 1.1 host.name-80
    < Cache-Control: max-age=14400
    < Expires: Mon, 02 May 2022 05:13:46 GMT
    < Transfer-Encoding: chunked
    < Strict-Transport-Security: max-age=31536000; includeSubdomains; preload

  • Added cookie NSC_ESNS=“1363f330-3047-126f-9678-00e0ed62d4ec_1976895326_3612821726_00000000038979603737” for domain host.name, path /, expire 1651454038
    < Set-Cookie: NSC_ESNS=1363f330-3047-126f-9678-00e0ed62d4ec_1976895326_3612821726_00000000038979603737; Path=/; Expires=Mon, 02-May-2022 01:13:58 GMT
    <

  • Connection #1 to host host.name left intact
    Downloads.download("https://host.name/cgi-bin/database.path", "output"; downloader = downloader, verbose = true, timeout = 90) = "output"

  • Operation too slow. Less than 1 bytes/sec transferred the last 60 seconds

  • Closing connection 0
    ERROR: LoadError: TaskFailedException

    nested task error: Operation too slow. Less than 1 bytes/sec transferred the last 60 seconds while requesting https://host.name/cgi-bin/database.path
    Stacktrace:
    [1] (::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool})(easy::Downloads.Curl.Easy)
    @ Downloads ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:369
    [2] with_handle(f::Downloads.var"#9#18"{IOStream, Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, handle::Downloads.Curl.Easy)
    @ Downloads.Curl ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Curl/Curl.jl:64
    [3] #8
    @ ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:311 [inlined]
    [4] arg_write(f::Downloads.var"#8#17"{Base.DevNull, Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Bool, String, Int64, Bool, Bool}, arg::IOStream)
    @ ArgTools ~/julia-1.7.2/share/julia/stdlib/v1.7/ArgTools/src/ArgTools.jl:112
    [5] #7
    @ ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:310 [inlined]
    [6] arg_read
    @ ~/julia-1.7.2/share/julia/stdlib/v1.7/ArgTools/src/ArgTools.jl:61 [inlined]
    [7] request(url::String; input::Nothing, output::IOStream, method::Nothing, headers::Vector{Pair{String, String}}, timeout::Int64, progress::Nothing, verbose::Bool, throw::Bool, downloader::Downloader)
    @ Downloads ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:309
    [8] #3
    @ ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:222 [inlined]
    [9] open(f::Downloads.var"#3#4"{Nothing, Vector{Pair{String, String}}, Int64, Nothing, Bool, Downloader, String}, args::String; kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:write,), Tuple{Bool}}})
    @ Base ./io.jl:330
    [10] arg_write(f::Function, arg::String)
    @ ArgTools ~/julia-1.7.2/share/julia/stdlib/v1.7/ArgTools/src/ArgTools.jl:86
    [11] download#2
    @ ~/julia-1.7.2/share/julia/stdlib/v1.7/Downloads/src/Downloads.jl:221 [inlined]
    [12] macro expansion
    @ ./show.jl:1047 [inlined]
    [13] (::var"#8#10"{String, Downloader, String})()
    @ Main ./task.jl:423
    Stacktrace:
    [1] sync_end(c::Channel{Any})
    @ Base ./task.jl:381
    [2] macro expansion
    @ task.jl:400 [inlined]
    [3] top-level scope
    @ ~/juliatest.jl:396
    in expression starting at /home/ubuntu/juliatest.jl:377

Edit: Removed “Docker” from the title.

Solved with HTTP.jl and UrlDownload.jl.