I’m implementing a simple API server that is supposed to handle over 1k concurrent requests at a time.
I have code like the following:
using Sockets # needed for the ip"" string macro, IPAddr, and Sockets.listen
using HTTP: HTTP, @register, Response, handle
const HEADERS = ["Content-Type" => "application/json"]
bench_start_handler(req) = nothing
hoge_handler(req, body) = ... # return JSON string
... # another handler body
bench_end_handler(req) = nothing
const router = HTTP.Router()
@register(router, "GET", "/bench_start", bench_start_handler)
@register(router, "POST", "/hoge", hoge_handler)
... # another handlers
@register(router, "GET", "/bench_end", bench_end_handler)
function handler(req)
body = isempty(req.body) ? handle(router, req) : handle(router, req, String(req.body))
return body === nothing ? Response(200) : Response(200, HEADERS; body = body)
end
# entry point
# -----------
function init_server(host::IPAddr = ip"0.0.0.0", port = 3000; async = true, verbose = true, kwargs...)
if async
server = Sockets.listen(host, port)
@async HTTP.serve(handler, host, port; server = server, verbose = verbose, kwargs...)
return server # supposed to be `close`d afterwards in an interactive session, etc
else
return HTTP.serve(handler, host, port; verbose = verbose, kwargs...)
end
end
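For reference, the async path is meant to be used like this in an interactive session (the server is shut down by closing the returned listener):

server = init_server()  # async = true: returns the Sockets.TCPServer immediately
# ... run the benchmark against the server on port 3000 ...
close(server)           # closing the listener should stop the @async serve loop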
The handlers only do quite simple tasks, so I’m sure they can’t be the source of the performance problem.
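For a sense of scale, each handler is essentially a one-liner that builds a small JSON string, e.g. (illustrative body only, not the real code):

# illustrative only: roughly the amount of work a real handler does
hoge_handler(req, body) = """{"ok": true, "received_bytes": $(length(body))}"""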
Our benchmark starts with the /bench_start request as a notification, then over 1k concurrent requests keep coming in and the server handles them with the various handlers (there are ~5 of them). The server ends up handling approximately 300,000 requests in total, and finally the benchmark ends with the /bench_end request.
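To make the traffic pattern concrete, it is roughly equivalent to the following sketch (run_fake_bench, the batch counts, and the request bodies are placeholders, not the actual client, which I can’t post):

using HTTP

function run_fake_bench(base = "http://127.0.0.1:3000"; nbatches = 300, batchsize = 1_000)
    HTTP.request("GET", "$base/bench_start")
    for _ in 1:nbatches
        @sync for _ in 1:batchsize               # ~1k requests in flight at a time
            @async HTTP.request("POST", "$base/hoge"; body = "{}")
        end
    end
    HTTP.request("GET", "$base/bench_end")       # nbatches × batchsize ≈ 300,000 requests total
end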
When I benchmarked this HTTP.jl server, the code itself worked, but it turned out that this implementation is much too slow in comparison to alternative implementations in other languages, namely bjoern and Falcon in Python.
I can’t provide the details of the benchmark since it’s not a public one, but HTTP.jl does seem to be slow at running the “handle request → send response” loop concurrently: the benchmark result was >100 times worse than the alternative Python implementation.
My questions are below:
- Is HTTP.jl supposed to be good at handling concurrent requests in comparison to those HTTP server implementations in other languages?
- Am I missing something? Does my code include any mistakes?
Any help or insight is very much appreciated!