Web server with latency-sensitive and compute-heavy task separate thread pools

simsurace · February 2, 2023, 7:36pm

I need to set up a web server that mainly serves requests which are computationally heavy and multithreaded, but should remain responsive to health checks (from a Kubernetes cluster) which are latency sensitive. I want to make sure that there is no competition for resources between the two tasks. What is the easiest way to accomplish this with Genie or Oxygen? Or should I go lower level with HTTP.jl?

What I've tried so far

Use TaskPoolEx(background=true) for the parallel loops of the computationally heavy tasks. However, this does not prevent those tasks from grabbing all CPU cores while some of them are in their single-threaded stages.
Set up Julia 1.9 with one interactive thread and put a `Threads.@spawn :interactive` middleware around the handler function for the latency-sensitive route. However, errors were thrown when starting up the Genie server in this thread configuration.

djholiver · February 2, 2023, 8:09pm

Hi,

This is what I modify and have running in k8s:
JuliaCon 2020 | Building Microservices and Applications in Julia - YouTube

Regards

quinnj · February 2, 2023, 11:06pm

In Julia 1.9 w/ HTTP.jl, we spawn the main server loop on the :interactive threadpool by default. If you then do a Threads.@spawn from a handler, then those threaded tasks should run elsewhere and not in the interactive threadpool, keeping it responsive.

Note, however, that just having a reserved :interactive thread doesn’t guarantee CPU cycles, so you still need to configure your running Julia session appropriately (i.e. ensuring that the number of non-interactive threads available to run tasks is one less than the number of cores, or something like that).
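As a rough illustration of the configuration check described above, the following sketch prints the size of each thread pool and warns when the default pool leaves no core free. It assumes Julia 1.9 or later (where `Threads.nthreads` accepts a pool name), and the "one core less" threshold is only the rule of thumb from the post, not a hard requirement.

```julia
# Run with e.g. `julia --threads 3,1` (3 default threads, 1 interactive thread).
n_default     = Threads.nthreads(:default)      # threads for compute-heavy tasks
n_interactive = Threads.nthreads(:interactive)  # threads reserved for latency-sensitive tasks
n_cpu         = Sys.CPU_THREADS                 # logical CPU cores seen by Julia

@info "Thread configuration" n_default n_interactive n_cpu

if n_interactive == 0
    @warn "No interactive threads configured; all tasks share the default pool."
elseif n_default >= n_cpu
    # Rule of thumb from the discussion: keep at least one core free
    # so the interactive thread can actually get CPU time.
    @warn "Default pool uses every core; the interactive thread may be starved."
end
```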

simsurace · February 3, 2023, 9:50pm

Thanks. This is reassuring.
Would this behavior be inherited by Genie and/or Oxygen, or do these still need some additional features or special config?

And is there a way of getting guaranteed similar behavior in Julia <1.9, short of using two separate Julia processes with Distributed?

simsurace · February 3, 2023, 10:00pm

Thanks, I haven't found the time yet to watch this but I will do so!

quinnj · February 3, 2023, 10:03pm

For this kind of behavior pre-1.9, I used to use WorkerUtilities.jl (GitHub - JuliaServices/WorkerUtilities.jl: Utilities for working with multithreaded "workers" in Julia services/applications), with WorkerUtilities.init and WorkerUtilities.@spawn, but that only really works if you control most of the task-spawning in your application. With the growing number of packages that use Threads.@spawn themselves, it can be difficult to reserve CPU use.

simsurace · February 3, 2023, 10:16pm

I see. I tried WorkerUtilities with Genie and my library and I think I sometimes got a nested task error, but I will double check.

simsurace · February 6, 2023, 8:21am

I watched the workshop video and it was excellent, I've learned a lot, thanks a lot! I think that in my case it makes sense to work with HTTP.jl directly to have this additional control over threading.

I like the Workers module there with the explicit loop of threads 2:nthreads() listening to the heavy workload. In my case I have to try and see how that composes with something like TaskPoolEx, such that the parallel portion of the task will just stay on those threads as well, keeping thread 1 for the main server loop. Maybe it makes more sense to have only thread 2 pick tasks off the queue then (the serial portion is negligible and I want to avoid task switching where it is not needed).

So ideally something like this could be set up even in Julia 1.8:

Thread 1 runs the main server loop and answers the cheap requests
Thread 2 waits for heavy tasks to be put on the queue and then uses threads 2:nthreads() for the parallel portion
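The queue part of this layout could be sketched roughly as below. This is only an illustration using `Base.Channel`: the names `Job`, `JOBS`, `worker_loop`, and `submit` are hypothetical, and a plain `Threads.@spawn` does not pin the worker to thread 2, so the thread placement described above would still need a separate mechanism.

```julia
# Hypothetical job type: the work to run plus a channel to hand back the result.
struct Job
    f::Function
    result::Channel{Any}
end

# Buffered queue of pending heavy jobs (capacity 32 is an arbitrary choice).
const JOBS = Channel{Job}(32)

# Single consumer: takes jobs off the queue one at a time, in order.
function worker_loop(jobs::Channel{Job})
    for job in jobs
        out = try
            job.f()
        catch err
            err  # hand the exception back instead of killing the worker
        end
        put!(job.result, out)
    end
end

# Called from a request handler: enqueue the work and wait for its result.
# Waiting blocks only the calling task, so other tasks keep running.
function submit(f)
    job = Job(f, Channel{Any}(1))
    put!(JOBS, job)
    return take!(job.result)
end

# Start the consumer. NOTE: this does not guarantee which thread it runs on.
worker = Threads.@spawn worker_loop(JOBS)
```

A handler for the heavy route would then call something like `submit(() -> sum(randn(10^6)))`, while cheap routes are answered directly.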

simsurace · February 7, 2023, 9:53am

Ok, I tried to get a basic MWE of a server which handles low-latency tasks on the interactive thread (where the main server loop is run) and spawns a default thread for the high-latency task.

Here is the server script:

using Dates, HTTP

const ROUTER = HTTP.Router()

function low_latency(req)
    Threads.@spawn :interactive begin # also tried without this
        @info "$(now())> $(Threads.threadpool()) tid: $(Threads.threadid())"
        return "Done with low latency task"
    end
end
HTTP.register!(ROUTER, "GET", "/low", low_latency)

function high_latency(req)
    Threads.@spawn begin
        s = sum(randn(10^8))
        @info "$(now())> $(Threads.threadpool()) tid: $(Threads.threadid()), sum = $s"
        return "Done with high latency task"
    end
end
HTTP.register!(ROUTER, "GET", "/high", high_latency)

function requestHandler(req)
    resp = ROUTER(req)
    return HTTP.Response(200, resp)
end

HTTP.serve(requestHandler, "0.0.0.0", 7777)

So I have a couple of questions:

When I run the server with `julia -t N,1` where N>1, I get an error when hitting it with a request: `Error: handle_connection handler error`, with `exception = AssertionError: 0 < tid <= v`.

Full stacktrace

┌ Error: handle_connection handler error
│   exception =
│    AssertionError: 0 < tid <= v
│    Stacktrace:
│      [1] _length_assert()
│        @ URIs ~/.julia/packages/URIs/6ecRe/src/URIs.jl:694
│      [2] access_threaded(f::typeof(URIs.uri_reference_regex_f), v::Vector{URIs.RegexAndMatchData})
│        @ URIs ~/.julia/packages/URIs/6ecRe/src/URIs.jl:685
│      [3] parse_uri_reference(str::String; strict::Bool)
│        @ URIs ~/.julia/packages/URIs/6ecRe/src/URIs.jl:125
│      [4] parse_uri_reference
│        @ ~/.julia/packages/URIs/6ecRe/src/URIs.jl:123 [inlined]
│      [5] URI
│        @ ~/.julia/packages/URIs/6ecRe/src/URIs.jl:146 [inlined]
│      [6] gethandler
│        @ ~/.julia/packages/HTTP/z8l0i/src/Handlers.jl:391 [inlined]
│      [7] (::HTTP.Handlers.Router{typeof(HTTP.Handlers.default404), typeof(HTTP.Handlers.default405), Nothing})(req::HTTP.Messages.Request)
│        @ HTTP.Handlers ~/.julia/packages/HTTP/z8l0i/src/Handlers.jl:427
│      [8] requestHandler(req::HTTP.Messages.Request)
│        @ Main ./REPL[7]:2
│      [9] (::HTTP.Handlers.var"#1#2"{typeof(requestHandler)})(stream::HTTP.Streams.Stream{HTTP.Messages.Request, HTTP.ConnectionPool.Connection{Sockets.TCPSocket}})
│        @ HTTP.Handlers ~/.julia/packages/HTTP/z8l0i/src/Handlers.jl:58
│     [10] #invokelatest#2
│        @ ./essentials.jl:816 [inlined]
│     [11] invokelatest
│        @ ./essentials.jl:813 [inlined]
│     [12] handle_connection(f::Function, c::HTTP.ConnectionPool.Connection{Sockets.TCPSocket}, listener::HTTP.Servers.Listener{Nothing, Sockets.TCPServer}, readtimeout::Int64, access_log::Nothing)
│        @ HTTP.Servers ~/.julia/packages/HTTP/z8l0i/src/Servers.jl:447
│     [13] macro expansion
│        @ ~/.julia/packages/HTTP/z8l0i/src/Servers.jl:385 [inlined]
│     [14] (::HTTP.Servers.var"#16#17"{HTTP.Handlers.var"#1#2"{typeof(requestHandler)}, HTTP.Servers.Listener{Nothing, Sockets.TCPServer}, Set{HTTP.ConnectionPool.Connection}, Int64, Nothing, Base.Semaphore, HTTP.ConnectionPool.Connection{Sockets.TCPSocket}})()
│        @ HTTP.Servers ./task.jl:514
└ @ HTTP.Servers ~/.julia/packages/HTTP/z8l0i/src/Servers.jl:461

When I run the server with `julia -t 1,1` I get no error but note that

julia> fetch(Threads.@spawn :interactive Threads.threadpool())
:default

so both tasks end up running on the same thread anyways, which affects latency.

So I’m obviously doing something wrong. Could someone help me get back on track?
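One detail of the script above that is independent of the two problems listed: `Threads.@spawn` returns a `Task`, so both handlers hand a `Task` rather than a string to `HTTP.Response`. A possible variant, sketched here under the assumption that a recent HTTP.jl 1.x is used, answers the cheap route directly and calls `fetch` on the spawned task for the heavy route. It does not address the `AssertionError` from URIs or the thread pool behavior with `-t 1,1`.

```julia
using Dates, HTTP

const ROUTER = HTTP.Router()

# Cheap route: answer directly on the task that handles the connection.
function low_latency(req)
    @info "$(now())> $(Threads.threadpool()) tid: $(Threads.threadid())"
    return HTTP.Response(200, "Done with low latency task")
end
HTTP.register!(ROUTER, "GET", "/low", low_latency)

# Heavy route: run the computation in a task on the default pool.
# `fetch` blocks only the current task while it waits for the result.
function high_latency(req)
    task = Threads.@spawn begin
        s = sum(randn(10^8))
        @info "$(now())> $(Threads.threadpool()) tid: $(Threads.threadid()), sum = $s"
        "Done with high latency task"
    end
    return HTTP.Response(200, fetch(task))
end
HTTP.register!(ROUTER, "GET", "/high", high_latency)

# Start the server only when this file is run as a script.
if abspath(PROGRAM_FILE) == @__FILE__
    HTTP.serve(ROUTER, "0.0.0.0", 7777)
end
```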

AMJ · February 7, 2023, 10:40am

Have you looked into ThreadPinning.jl ?

simsurace · February 7, 2023, 10:53am

Yes, thanks! But it does not change the fact that when I start Julia with -t 1,1 there is only one thread available:

$ julia +beta -t 1,1 -q
julia> using ThreadPinning; threadinfo()

| **0**,1,2,3,4,5,6,7 | 

# = Julia thread, | = Socket seperator

Julia threads: 1
├ Occupied CPU-threads: 1
└ Mapping (Thread => CPUID): 1 => 0,

EDIT: maybe not surprising, I think ThreadPinning.jl does not support interactive threads yet. It is only tested on Julia 1.6 and 1.7 in the CI pipeline.

simsurace · February 7, 2023, 4:58pm

Created an issue for the thread behavior: Interactive threads in Julia v1.9.0-beta do not give expected results · Issue #48580 · JuliaLang/julia · GitHub

And for the server failing

github.com/JuliaWeb/HTTP.jl

Using interactive thread pool for latency-sensitive route

opened 05:05PM - 07 Feb 23 UTC

closed 02:26PM - 02 May 23 UTC

simsurace

This is my (failed) attempt at separating light vs. heavy workloads in different… threads to retain responsiveness: (same server script as posted above)

- Running this in `julia +beta -t 1,1` does not give the intended behavior, maybe due to julialang/julia#48580.
- Running this with `julia +beta -t 2,1` will throw an error upon receiving a request: (same `AssertionError: 0 < tid <= v` stack trace as posted above)

simsurace · February 10, 2023, 9:32am

Update: I opted for a dedicated worker process with Distributed. That seems to be a robust solution until the interactive thread issues get ironed out.
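A minimal sketch of what such a setup might look like with the Distributed standard library is shown below. The function name `heavy_computation`, the constant `HEAVY_WORKER`, and the `--threads=2` flag are illustrative assumptions, not details taken from the actual solution.

```julia
using Distributed

# Launch one extra Julia process dedicated to heavy work.
# `--threads=2` is an arbitrary choice for the worker's own thread count.
const HEAVY_WORKER = only(addprocs(1; exeflags="--threads=2"))

# Hypothetical stand-in for the real computation, defined on every process.
@everywhere heavy_computation(n) = sum(randn(n))

# Runs the computation on the worker process and returns the result.
# Only the calling task waits; the server process itself stays free
# to answer health checks in the meantime.
run_heavy(n) = remotecall_fetch(heavy_computation, HEAVY_WORKER, n)

result = run_heavy(10^6)
```

Because the heavy work runs in a separate OS process, it cannot occupy the threads of the process that runs the server loop, although both still share the same physical cores.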
