Introduction
I went through several steps of optimizing the Julia code. Essentially, we need to make sure we are running compiled code and that memory usage is optimized as much as possible.
I used bmon to check my network usage. The first thing I noticed was that the peak receive bandwidth was higher with asyncio, which made me wonder whether the connection limit was holding HTTP.jl back.
After optimization, the effective wall times are now comparable for me. With a higher-bandwidth connection, your situation may require further tuning.
Python / asyncio
Here’s what I see with Python’s asyncio.run:
In [11]: %time cdatas = asyncio.run(get_items(keys))
CPU times: user 18.7 s, sys: 3.81 s, total: 22.5 s
Wall time: 37.7 s
Initial Julia code
Here is what I see with your Julia code:
julia> const urls = map(i->"https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1/analysed_sst/$i.1.0", 0:99);
julia> @time asyncmap(url->HTTP.request("GET", url, status_exception=false).body, urls);
73.768142 seconds (12.42 M allocations: 5.409 GiB, 0.81% gc time, 15.31% compilation time)
Compiled function
Putting this into a function and making sure it gets compiled, I then get the following results via Julia:
julia> function f()
asyncmap(url->HTTP.request("GET", url, status_exception=false).body, urls);
end
julia> @time f()
51.969691 seconds (3.01 M allocations: 4.816 GiB, 0.42% gc time)
Julia optimization with preallocated buffers
Optimizations:
- Use a function so the code gets compiled
- Make all the globals const
- Increase the connection_limit
- Preallocate the buffers
- Use Threads.@spawn to allow tasks to run on multiple threads
julia> const urls = map(i->"https://mur-sst.s3.us-west-2.amazonaws.com/zarr-v1/analysed_sst/$i.1.0", 0:99);
julia> const buffers = [IOBuffer(; sizehint = 64*1024*1024, maxsize=64*1024*1024) for x in 1:100]
julia> function f()
seekstart.(buffers)
@sync map(urls, buffers) do url, buffer
Threads.@spawn HTTP.request("GET", url, status_exception=false, connection_limit=25, response_stream=buffer)
end
end
julia> @time f()
35.779649 seconds (5.80 M allocations: 176.242 MiB, 0.21% compilation time)
julia> seekstart.(buffers); read.(buffers)
100-element Vector{Vector{UInt8}}:
[0x02, 0x01, 0x21, 0x02, 0x60, 0x38, 0xdc, 0x03, 0x00, 0x00 … 0x0f, 0x02, 0x00, 0x08, 0x50, 0xb5, 0xb5, 0xb5, 0xb4, 0xb4]
...
Discussion
Part of the optimization above is general to Julia. Making your globals const, or at least binding them to a type, helps the compiler generate efficient code, as does putting the work into a function. Managing memory is also important, and I suspect this accounts for some of the difference.
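As a quick illustration (these variables are hypothetical, not part of the benchmark above), either form gives the compiler a concrete type to work with; the typed-global syntax requires Julia 1.8 or later:

julia> const endpoint = "https://mur-sst.s3.us-west-2.amazonaws.com"   # const global

julia> endpoint2::String = "https://mur-sst.s3.us-west-2.amazonaws.com"   # typed (non-const) global, Julia 1.8+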
Above we preallocated a lot of memory partially based on prior knowledge. This prior knowledge could be obtained via a single HTTP request to the following URL and then parsing the returned XML:
https://mur-sst.s3.us-west-2.amazonaws.com/?prefix=zarr-v1/analysed_sst
This request uses the Amazon S3 ListObjectsV2 API.
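As a rough sketch of that idea (the object_sizes helper is hypothetical, not the author's code), we could fetch the listing with HTTP.jl and pull the object sizes out of the returned XML with a regular expression. A real XML parser such as EzXML.jl would be more robust, the list-type=2 query parameter selects ListObjectsV2, and listings are paginated at 1,000 keys per response:

julia> using HTTP

julia> function object_sizes(prefix)
           resp = HTTP.request("GET",
               "https://mur-sst.s3.us-west-2.amazonaws.com/?list-type=2&prefix=$prefix")
           # Each object in the listing carries a <Size> element, in bytes.
           [parse(Int, m.captures[1]) for m in eachmatch(r"<Size>(\d+)</Size>", String(resp.body))]
       end

julia> object_sizes("zarr-v1/analysed_sst")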
We may not have to preallocate all of the memory; we just need enough buffers to handle the number of concurrent connections. We could then copy the memory out and reuse the IOBuffers.
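A minimal sketch of that approach (the fetch_all helper is hypothetical, reusing the same HTTP.jl keywords as above): keep a small pool of IOBuffers in a Channel, copy each response out with take!, and return the buffer to the pool for the next request:

julia> using HTTP

julia> function fetch_all(urls; nbuffers = 25)
           # Pool of reusable buffers, one per concurrent connection.
           pool = Channel{IOBuffer}(nbuffers)
           foreach(_ -> put!(pool, IOBuffer(; sizehint = 64*1024*1024)), 1:nbuffers)
           tasks = map(urls) do url
               Threads.@spawn begin
                   buffer = take!(pool)
                   try
                       HTTP.request("GET", url, status_exception=false,
                           connection_limit=nbuffers, response_stream=buffer)
                       take!(buffer)      # copy the bytes out and reset the buffer
                   finally
                       take!(buffer)      # ensure the buffer is empty before reuse
                       put!(pool, buffer)
                   end
               end
           end
           fetch.(tasks)   # Vector{Vector{UInt8}}, one per URL
       end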