I’m new to asynchronous programming and have a use case that I’m not sure how to express in terms of Julia concepts. I’d greatly appreciate any suggestions others may have to offer.
I have a task A that consists of concurrent subtasks A1, A2, …, An. Each subtask has a finite sequence of things to produce for a global consumer B, but at various points along its sequence it will need to wait for a response from B before continuing. However, B cannot interact with each subtask independently; due to interactions between the things produced by A1, …, An, B needs to wait until all the subtasks have reached a point where they are waiting for a response from B. When this happens, B will process everything they have produced in the meantime, then return responses to the respective subtasks so that they can continue. This back-and-forth continues until all the subtasks are finished, at which point A and B both conclude.
The way I am thinking of the core interaction is that A1, …, An collectively yield to B, and then B yields back collectively to A1, …, An, each one resuming where it left off. But I am not sure if that is a helpful way to think of it.
Any suggestions?
A pragmatic solution is to have two channels, one for work orders and one for work completions:
```
mutable struct CyclicBarrierChannel
    nWorkers::Int
    jobs::Channel{Nothing}        # boss -> workers: "continue"
    completions::Channel{Nothing} # workers -> boss: "done and waiting"
end

CyclicBarrierChannel(n) = CyclicBarrierChannel(n, Channel{Nothing}(n), Channel{Nothing}(n))

function workerDoneAndWait(barrier)
    put!(barrier.completions, nothing) # hand in the completion receipt
    take!(barrier.jobs)                # block until the boss hands out new work
end

function bossWaitForWorkers(barrier)
    for i = 1:barrier.nWorkers
        take!(barrier.completions)
    end
end

function bossReleaseWorkers(barrier)
    for i = 1:barrier.nWorkers
        put!(barrier.jobs, nothing)
    end
end
```
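To see the back-and-forth in action, here is a hypothetical end-to-end demo: each worker Ai produces one item per round and then parks at the barrier; the boss B drains everything once all workers have arrived, then releases them. The definitions are repeated so the sketch runs standalone; `demo`, `produced`, and `log` are made-up names for illustration.

```julia
mutable struct CyclicBarrierChannel
    nWorkers::Int
    jobs::Channel{Nothing}        # boss -> workers
    completions::Channel{Nothing} # workers -> boss
end
CyclicBarrierChannel(n) = CyclicBarrierChannel(n, Channel{Nothing}(n), Channel{Nothing}(n))

function workerDoneAndWait(barrier)
    put!(barrier.completions, nothing)
    take!(barrier.jobs)
end
bossWaitForWorkers(barrier) = foreach(_ -> take!(barrier.completions), 1:barrier.nWorkers)
bossReleaseWorkers(barrier) = foreach(_ -> put!(barrier.jobs, nothing), 1:barrier.nWorkers)

function demo(nworkers, nrounds)
    barrier = CyclicBarrierChannel(nworkers)
    produced = Channel{Int}(nworkers * nrounds)  # buffered, so puts never block
    workers = map(1:nworkers) do i
        Threads.@spawn for _ in 1:nrounds
            put!(produced, i)          # Ai produces something for B...
            workerDoneAndWait(barrier) # ...then parks until B responds
        end
    end
    log = Int[]
    for _ in 1:nrounds
        bossWaitForWorkers(barrier)    # all workers are parked now
        while isready(produced)        # B processes everything produced so far
            push!(log, take!(produced))
        end
        bossReleaseWorkers(barrier)    # hand out the next round of work orders
    end
    foreach(wait, workers)
    log
end

log = demo(3, 2)  # six items total: one per worker per round
```

Because every worker hands in exactly one item before parking, B sees exactly one item from each worker in every round, regardless of how the tasks get scheduled.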
Now, you can of course also brew your own thing in analogy to Java's `CyclicBarrier` (Java Platform SE 8).
That's appropriate if the code is mostly for yourself (yay, learning experience), if your code is too performance-critical for the redundant extra thread switch from iteratively calling `take!`, or if you're writing a big library for many users. Otherwise, the code review and maintenance burden for a latch is probably not worth it.
A non-pragmatic, by-hand solution (not tested) would be something like:
```
mutable struct CyclicBarrier
    const nWorkers::Int
    @atomic nMissing::Int   # workers that have not yet arrived this cycle
    @atomic generation::Int # bumped by the boss on every release
    workers::Base.GenericCondition{ReentrantLock}
    boss::Base.GenericCondition{ReentrantLock}
end

CyclicBarrier(nWorkers) = CyclicBarrier(nWorkers, nWorkers, 0, Threads.Condition(), Threads.Condition())

function workerDoneAndWait(barrier)
    @lock barrier.workers.lock begin
        gen = @atomic barrier.generation
        nMissing = @atomic barrier.nMissing -= 1
        # the last worker to arrive wakes the boss
        nMissing == 0 && @lock barrier.boss.lock notify(barrier.boss)
        # park until the boss has released this generation
        while (@atomic barrier.generation) == gen
            wait(barrier.workers)
        end
    end
end

function bossWaitForWorkers(barrier)
    @lock barrier.boss.lock while (@atomic barrier.nMissing) != 0
        wait(barrier.boss)
    end
end

function bossReleaseWorkers(barrier)
    @lock barrier.workers.lock begin
        @assert 0 == @atomicswap barrier.nMissing = barrier.nWorkers
        @atomic barrier.generation += 1
        @assert barrier.nWorkers == notify(barrier.workers)
    end
end
```
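As a sanity check, here is a self-contained demo of a barrier along these lines, built from atomics and condition variables, with a generation counter so that early-arriving workers stay parked until the boss releases the current cycle. Definitions are inline so it runs standalone; `demo` and the round counts are made-up names for this sketch.

```julia
mutable struct CyclicBarrier
    const nWorkers::Int
    @atomic nMissing::Int   # workers not yet arrived this cycle
    @atomic generation::Int # bumped on every release
    workers::Base.GenericCondition{ReentrantLock}
    boss::Base.GenericCondition{ReentrantLock}
end
CyclicBarrier(n) = CyclicBarrier(n, n, 0, Threads.Condition(), Threads.Condition())

function workerDoneAndWait(b)
    @lock b.workers.lock begin
        gen = @atomic b.generation
        ((@atomic b.nMissing -= 1) == 0) && @lock b.boss.lock notify(b.boss)
        while (@atomic b.generation) == gen # park until the boss releases this cycle
            wait(b.workers)
        end
    end
end

function bossWaitForWorkers(b)
    @lock b.boss.lock while (@atomic b.nMissing) != 0
        wait(b.boss)
    end
end

function bossReleaseWorkers(b)
    @lock b.workers.lock begin
        @assert 0 == @atomicswap b.nMissing = b.nWorkers
        @atomic b.generation += 1
        notify(b.workers)
    end
end

function demo(nworkers, nrounds)
    barrier = CyclicBarrier(nworkers)
    produced = Channel{Int}(nworkers * nrounds)
    tasks = map(1:nworkers) do i
        Threads.@spawn for _ in 1:nrounds
            put!(produced, i)          # produce for B
            workerDoneAndWait(barrier) # park until B releases this round
        end
    end
    log = Int[]
    for _ in 1:nrounds
        bossWaitForWorkers(barrier)
        while isready(produced)        # B processes everything from this round
            push!(log, take!(produced))
        end
        bossReleaseWorkers(barrier)
    end
    foreach(wait, tasks)
    log
end

log = demo(3, 2)
```

Note that the boss here is only notified once per cycle, by the last worker to arrive, which is exactly the scheduler-overhead saving over the channel version.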
Thanks, this is giving me some directions to try. For now this is for a personal project and my own education, but I’m still interested in having reasonably good performance. I’m curious why you say the solutions above are not performant or non-pragmatic.
The pragmatic channel solution wakes up the “boss” task (your task B) on every completion of a worker. This is of course silly: it only needs to be woken when the last worker completes. Hence, it has about 2x the scheduler overhead.
But it is obviously correct, it's obvious to write and to reason about, and it's easy to explain to people with an analogy (“oh, we hand out work orders and completion receipts between workers and boss thread, using a well-known channel abstraction”).
The extra 2x scheduler overhead should not matter very much for performance in most use cases: if scheduler overhead is a serious limitation, then your algorithm sucks and you should fix that.
On the other side, writing and reviewing the kind of locking/atomics code of the by-hand solution is not entirely trivial, and bugs happen. The empirical rate of race-condition bugs in “let's do concurrency by hand, can't be so hard” code is very high.
Yet, it's not aesthetically pleasing to burn all these perfectly fine CPU cycles for “humility” when one has an “obviously” correct solution. (But I must admit that I have produced plenty of data races in “obviously correct” code, due to faulty reasoning.)
So it comes down to your project goals. Building your own is more fun, more educational, and aesthetically pleasing; but it is certainly not pragmatic, and the stupid wasteful work-slip channel should be fast enough.
Not quite the same, but FWIW, 1.12 will have a `waitall` function (see julia/NEWS.md at v1.12.0-beta1 · JuliaLang/julia · GitHub).