I have a use case with many small IOs (get/put of S3 objects < 300 MB each), a heartbeat (to keep an SQS message from timing out), and a heavy compute task that consumes a constant stream of inputs. The machine I’m using has 2 cores.
Recognizing that the real answer is “profile it”, what is your intuition for the best parallelism strategy? There are so many knobs to turn, and so much randomness introduced by the network, that I don’t think approaching the optimization with pure experimentation is going to be fruitful.
I am trying to achieve:
- The compute task is never waiting for more inputs or for the results to be put back to S3
- The heartbeat isn’t blocked for so long that it fails to update the timeout.
One thing that is clear is that the main loop should kick off tasks for IO and the heartbeat so that they do not block execution of the heavy compute. However, there are several considerations here:
- How do I make sure the heartbeat isn’t blocked for a long time by a long list of IO tasks?
- I’ve seen (maybe outdated) posts that libuv poses some fundamental challenges to concurrent IO because of its global lock. AWS.jl suggests switching to the Downloads.jl backend (instead of HTTP.jl) if you need to do concurrent requests. Unfortunately, Curl.jl is segfaulting for me periodically, so I’m still using HTTP.jl until the patch mentioned in that issue makes its way to me. I’m not sure how this should affect my strategy (maybe `@async` is better than `@spawn` for this case? Would the advice change once I can move to Downloads.jl?)
- I know that, under most circumstances, it is not advisable to have more Julia threads than logical cores. However, in this case, where I have an ongoing heartbeat that is very lightweight but also needs to not be blocked, maybe I should run Julia with one `:interactive` and two `:default` threads and only schedule the heartbeat on the interactive thread (see the sketch after this list).
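
To make that last point concrete, here is a minimal sketch of what I mean, assuming Julia ≥ 1.9 started with `julia --threads 2,1` (two `:default` threads plus one `:interactive` thread). `renew_visibility` and the 30-second period are placeholders for my actual SQS call, not real APIs:

```julia
# Heartbeat pinned to the interactive threadpool so a busy default pool
# can't delay it. `renew_visibility` stands in for whatever call extends
# the SQS message visibility; the period is made up.
function start_heartbeat(renew_visibility; period = 30)
    return Threads.@spawn :interactive begin
        while true
            renew_visibility()   # cheap network call to push the timeout out
            sleep(period)        # sleep yields, so the interactive thread stays free
        end
    end
end

hb = start_heartbeat(() -> @info "extend SQS visibility")
```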
My current idea is to `@spawn` the heartbeats to an `:interactive` thread, spawn an IO manager task which `@async` schedules all IO operations, and leave the heavy compute on the main thread. But I’m open to the idea that I’m overthinking this and I should just `@spawn` everything naively.
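
In other words, something shaped like the sketch below. This is only a sketch: `fetch_input`, `process_item`, and `upload_result` stand in for my S3 GET / compute / S3 PUT code, and the channel size is arbitrary.

```julia
# Sketch of the proposed layout: an IO-manager task @async-schedules the S3
# requests, bounded Channels hand work between stages, and the heavy compute
# loop stays on the main thread. The heartbeat task from the earlier sketch
# would run alongside on the :interactive thread.
function run_pipeline(keys, fetch_input, process_item, upload_result; buffer = 16)
    inputs  = Channel{Any}(buffer)   # downloaded objects waiting for compute
    results = Channel{Any}(buffer)   # finished objects waiting for upload

    # IO manager: one spawned task that @async-schedules every download,
    # multiplexing them on whichever thread the task lands on.
    io_manager = Threads.@spawn begin
        @sync for key in keys
            @async put!(inputs, fetch_input(key))   # placeholder S3 GET
        end
    end
    bind(inputs, io_manager)   # closes `inputs` when downloads finish (or error out)

    # Uploader: drains results as they appear so compute never blocks on PUTs.
    uploader = Threads.@spawn for r in results
        upload_result(r)                            # placeholder S3 PUT
    end

    # Heavy compute on the main thread, pulling inputs as they arrive.
    for x in inputs
        put!(results, process_item(x))              # placeholder compute
    end
    close(results)
    wait(uploader)
end
```

Whether the inner loop should use `@async` or `Threads.@spawn` for each request is exactly the part I’m unsure about, given the HTTP.jl vs Downloads.jl situation above.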