Julia seems an order of magnitude slower than Python when printing to the terminal, because of issue with "sleep"

MilesCranmer · June 28, 2024, 7:37am

I just mean blocking the main thread. In this model you would want to allocate an extra thread when starting Julia so that one thread can just be used for sleeping. (I agree this is not a great solution, but regular sleep is so heavy that there’s no other option).

Really? Is this a recent change? (Does this mean I can dynamically create a thread for sleeping?)

Yeah it’s not a perfect solution. But on Linux, Libc.systemsleep is much quicker than Base.sleep (100x faster – lower end of 10 microseconds compared to 1 millisecond), which is enough of a win for me to want to use it.
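A rough way to check that gap on your own machine is sketched below. The helper name `time_call` and the 10 microsecond request are illustrative assumptions, and the numbers will vary with OS, hardware, and Julia version.

```julia
# Minimal timing sketch (hypothetical helper, not from the thread).
function time_call(f, n)
    f()                                  # warm up so compilation is not measured
    t0 = time_ns()
    for _ in 1:n
        f()
    end
    return (time_ns() - t0) / n / 1e9    # mean seconds per call
end

requested = 1e-5                         # ask for 10 microseconds
t_base = time_call(() -> sleep(requested), 100)
t_libc = time_call(() -> Libc.systemsleep(requested), 100)

println("Base.sleep:       ", round(t_base * 1e6; digits=1), " μs per call")
println("Libc.systemsleep: ", round(t_libc * 1e6; digits=1), " μs per call")
```

Note that `Libc.systemsleep` blocks the calling thread, while `Base.sleep` yields to other tasks.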

It would be nice if there were a waitany version of wait(::Channel) that reacts to any channel in a Vector{Channel} being written to. Right now I have to loop through them individually and then sleep, and that sleep is what’s bottlenecking my performance a bit. Although wait isn’t great either, because my main thread is also doing other stuff in between polls. So a fast sleep is really useful here.
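The polling pattern described here could look roughly like the sketch below. The function name `poll_any` and its `interval` and `timeout` keywords are assumptions made for illustration, not code from the original post.

```julia
# Hypothetical helper: poll a vector of channels until one has data.
# Returns (index, value) for the first ready channel, or nothing on timeout.
function poll_any(channels::Vector{<:Channel}; interval=1e-5, timeout=1.0)
    deadline = time() + timeout
    while time() < deadline
        for (i, ch) in enumerate(channels)
            if isready(ch)
                return (i, take!(ch))
            end
        end
        # Blocks this thread, so producers must run on other threads.
        Libc.systemsleep(interval)
    end
    return nothing
end

channels = [Channel{Int}(1) for _ in 1:3]
put!(channels[2], 42)        # buffered channel, so put! does not block
result = poll_any(channels)
println(result)
```

Because the sleep blocks the thread, this only makes sense when the tasks feeding the channels are scheduled on other threads.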

Sukera · June 28, 2024, 8:01am

That’s been a thing since 1.9 - see also PSA: Thread-local state is no longer recommended for some examples of how relying on nthreads being constant leads to buggy code.

There will be with 1.12:

Add waitany and waitall functions to wait multiple tasks at once (github.com/JuliaLang/julia, JuliaLang:master ← mrkn:wait_multiple_tasks, opened by mrkn at 08:20AM on 15 Feb 24 UTC, +270 -0)

I would like to propose adding two functions, `waitany` and `waitall`, discussed… in the issue #53226. These functions wait for multiple tasks at once. The `waitany` function blocks until one task finishes. The `waitall` function blocks until all tasks finish. There is an optional keyword argument, `failfast`, for the `waitall` function. The default of `failfast` is `false`. The `waitall` function will immediately stop if any task ends with an exception when the `failfast` is `true`. This is my own implementation, but I have regrets about the type of the first argument. I wanted to represent a container type from which `Task` objects can be taken out using the `iterate` function, but it seems impossible in current Julia, so I used a union type of `AbstractVector`, `Tuple`, and `Set`. I would like to know if there is a better way to write this part.
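A small usage sketch of `waitany`, assuming the form that shipped in Julia 1.12, where it takes a collection of tasks and returns the completed and remaining tasks. The `isdefined` guard and the task bodies are illustrative assumptions.

```julia
# Runs only where waitany exists (Julia 1.12 or newer is assumed here).
if isdefined(Base, :waitany)
    tasks = map(1:3) do i
        Threads.@spawn begin
            sleep(0.05 * i)   # each task finishes at a different time
            i
        end
    end
    # Blocks until at least one task has finished.
    done_tasks, remaining_tasks = waitany(tasks)
    println(length(done_tasks), " done, ", length(remaining_tasks), " still running")
    foreach(wait, remaining_tasks)   # let the rest finish before moving on
else
    println("waitany is not available in this Julia version")
end
```

These functions wait on tasks rather than channels, so reacting to several channels would still need one task per channel.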

MilesCranmer · June 28, 2024, 9:20am

Nice! Thanks!

So I guess Threads.@spawn Libc.systemsleep is still the best option for now. Maybe I could also refactor the code to use waitany such that the main thread’s only job is waiting and dispatching.
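A minimal sketch of that `Threads.@spawn Libc.systemsleep` approach follows. The wrapper name `spawned_sleep` is hypothetical, and the benefit assumes Julia was started with more than one thread.

```julia
# Hypothetical wrapper around the spawn-and-wait pattern.
function spawned_sleep(seconds::Real)
    # The blocking system sleep runs in its own task, ideally on another thread.
    t = Threads.@spawn Libc.systemsleep(seconds)
    wait(t)   # the calling task yields here instead of blocking its thread
    return nothing
end

t0 = time()
spawned_sleep(0.02)
println("elapsed: ", round((time() - t0) * 1e3; digits=2), " ms")
```

With a single thread the spawned task lands on the same thread as the caller, so the sleep still blocks everything else.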

MilesCranmer · June 28, 2024, 12:02pm

I had another read into this and the reasons for blocking. I think the best approach to this problem is @threadcall (see Multi-Threading · The Julia Language), which is designed for this exact problem:

External libraries, such as those called via ccall, pose a problem for Julia’s task-based I/O mechanism. If a C library performs a blocking operation, that prevents the Julia scheduler from executing any other tasks until the call returns. (Exceptions are calls into custom C code that call back into Julia, which may then yield, or C code that calls jl_yield(), the C equivalent of yield.)

The @threadcall macro provides a way to avoid stalling execution in such a scenario. It schedules a C function for execution in a separate thread. A threadpool with a default size of 4 is used for this. The size of the threadpool is controlled via environment variable UV_THREADPOOL_SIZE. While waiting for a free thread, and during function execution once a thread is available, the requesting task (on the main Julia event loop) yields to other tasks. Note that @threadcall does not return until the execution is complete. From a user point of view, it is therefore a blocking call like other Julia APIs.

So perhaps the very best way to solve this problem is as follows:

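function systemsleep(seconds::Number)
    # POSIX usleep takes a useconds_t (unsigned 32-bit) and returns a C int,
    # so use Cuint/Cint rather than Int (same types as Libc.systemsleep).
    microseconds = round(Cuint, 1e6 * seconds)
    # Runs usleep on a libuv threadpool thread; this task yields meanwhile.
    @threadcall(:usleep, Cint, (Cuint,), microseconds)
    return nothing
end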

which is lighter than Base.sleep:

julia> using BenchmarkTools

julia> @btime systemsleep(1e-6)
  19.208 μs (11 allocations: 496 bytes)

julia> @btime sleep(1e-6)
  1.152 ms (4 allocations: 112 bytes)

without blocking the thread, and also doesn’t require the hand-rolled @spawn trick.
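To illustrate the "without blocking" part, here is a small sketch in which another task makes progress while the `@threadcall` sleep is in flight. It assumes a POSIX system where `usleep` is available; the names `threadcall_sleep`, `counter`, and `ticker` are made up for the example.

```julia
# Same idea as the function above, under a different (hypothetical) name.
function threadcall_sleep(seconds::Real)
    @threadcall(:usleep, Cint, (Cuint,), round(Cuint, 1e6 * seconds))
    return nothing
end

counter = Ref(0)
# A task that increments the counter and yields between steps.
ticker = @async for _ in 1:5
    counter[] += 1
    yield()
end

threadcall_sleep(0.01)   # the calling task yields while usleep runs elsewhere
println("ticks observed right after the sleep: ", counter[])
wait(ticker)
```

Replacing `threadcall_sleep` with `Libc.systemsleep` in this sketch would keep the ticker from running until the sleep returns, at least on a single thread.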

(This seems like a nice addition to Base maybe?)

The one downside of this is:

@threadcall may be removed/changed in future versions of Julia.

So just be wary.

MilesCranmer · June 28, 2024, 12:34pm

Formal feature request: Feature request: Finer-grained sleep function · Issue #54971 · JuliaLang/julia · GitHub

ufechner7 · June 28, 2024, 12:42pm

I made a similar request years ago, which is still open: Accuracy and resolution of sleep() on Linux should be improved · Issue #12770 · JuliaLang/julia · GitHub

MilesCranmer · June 28, 2024, 1:41pm

Thanks!

(slightly ironic that an issue about sleep being slow has been sitting idle for 9 years)

ChrisRackauckas · June 28, 2024, 1:55pm

We’re really sleeping on fixing that issue, but no one knows for how long.
