Julia seems an order of magnitude slower than Python when printing to the terminal, because of an issue with "sleep"

I just mean blocking the main thread. In this model you would want to allocate an extra thread when starting Julia so that one thread can just be used for sleeping. (I agree this is not a great solution, but regular sleep is so heavy that there’s no other option).

Really? Is this a recent change? (Does this mean I can dynamically create a thread for sleeping?)

Yeah, it’s not a perfect solution. But on Linux, Libc.systemsleep is much quicker than Base.sleep: roughly 100x faster, with a shortest practical sleep of about 10 microseconds compared to about 1 millisecond. That’s enough of a win for me to want to use it.
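A quick way to check that gap on your own machine is a rough timing loop like the one below. This is only a sketch: `time_sleep` is a hypothetical helper name, and the numbers depend heavily on OS, timer resolution, and system load.

```julia
# Rough mean wall-clock cost of one short sleep with the given function.
function time_sleep(f, seconds; reps = 100)
    f(seconds)                              # warm up so compilation is not timed
    t0 = time_ns()
    for _ in 1:reps
        f(seconds)
    end
    return (time_ns() - t0) / reps / 1e9    # mean seconds per call
end

base_mean = time_sleep(sleep, 1e-5)
libc_mean = time_sleep(Libc.systemsleep, 1e-5)
println("Base.sleep:       ", base_mean, " s per call")
println("Libc.systemsleep: ", libc_mean, " s per call")
```

Keep in mind that Libc.systemsleep blocks the OS thread it runs on, so other tasks scheduled on that thread cannot run during the call.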

It would be nice if there were a waitany version of wait(::Channel) that reacts to any channel in a Vector{Channel} being written to. Right now I have to loop through them individually and then sleep, and that sleep is what’s bottlenecking my performance a bit. Plain wait isn’t great either, because my main thread is also doing other stuff in between polls. So a fast sleep is really useful here.
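For readers following along, the polling pattern described here might look roughly like this. It is a minimal sketch, not the poster's actual code: `poll_any` and the `pause` keyword are hypothetical names, and it assumes the producers run on other threads, since Libc.systemsleep never yields to tasks on the polling thread.

```julia
# Poll several channels and return the index of the first channel that has
# data, together with the item taken from it.
function poll_any(channels::Vector{<:Channel}; pause = 1e-5)
    while true
        for (i, ch) in enumerate(channels)
            if isready(ch)
                return i, take!(ch)
            end
        end
        # Blocks this OS thread for roughly `pause` seconds without yielding
        # to the Julia scheduler.
        Libc.systemsleep(pause)
    end
end

chans = [Channel{Int}(1) for _ in 1:3]
put!(chans[2], 42)              # buffered put, so no consumer is needed yet
idx, val = poll_any(chans)
println("channel ", idx, " delivered ", val)
```

The shorter the pause, the lower the latency between a write and the poll noticing it, which is why the cost of the sleep call itself matters so much in this loop.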

That’s been a thing since 1.9 - see also PSA: Thread-local state is no longer recommended for some examples of how relying on nthreads being constant leads to buggy code.

There will be with 1.12:


Nice! Thanks!

So I guess Threads.@spawn Libc.systemsleep is still the best option for now. Maybe I could also refactor the code to use waitany such that the main thread’s only job is waiting and dispatching.
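As a rough illustration of that @spawn approach (the helper name `spawned_sleep` is made up here), the blocking libc sleep runs in a spawned task and the caller waits on that task. This assumes Julia was started with more than one thread; with a single thread the spawned task would still block the only thread available.

```julia
# Sleep for `seconds` by running the blocking libc sleep in a spawned task.
# The calling task waits on it and yields to the scheduler in the meantime.
function spawned_sleep(seconds::Real)
    t = Threads.@spawn Libc.systemsleep(seconds)
    wait(t)
    return nothing
end

spawned_sleep(1e-4)
println("threads available: ", Threads.nthreads())
```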

I had another read into this and the reasons for blocking. I think the best approach to this problem is @threadcall (see Multi-Threading · The Julia Language), which is designed for this exact problem:

External libraries, such as those called via ccall, pose a problem for Julia’s task-based I/O mechanism. If a C library performs a blocking operation, that prevents the Julia scheduler from executing any other tasks until the call returns. (Exceptions are calls into custom C code that call back into Julia, which may then yield, or C code that calls jl_yield(), the C equivalent of yield.)

The @threadcall macro provides a way to avoid stalling execution in such a scenario. It schedules a C function for execution in a separate thread. A threadpool with a default size of 4 is used for this. The size of the threadpool is controlled via environment variable UV_THREADPOOL_SIZE. While waiting for a free thread, and during function execution once a thread is available, the requesting task (on the main Julia event loop) yields to other tasks. Note that @threadcall does not return until the execution is complete. From a user point of view, it is therefore a blocking call like other Julia APIs.

So perhaps the very best way to solve this problem is as follows:

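function systemsleep(seconds::Number)
    microseconds = round(Int, 1e6 * seconds)
    # usleep is declared in C as `int usleep(useconds_t usec)`,
    # so pass the matching C types rather than Int.
    @threadcall(:usleep, Cint, (Cuint,), microseconds)
    return nothing
end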

which is lighter than Base.sleep:

julia> using BenchmarkTools

julia> @btime systemsleep(1e-6)
  19.208 μs (11 allocations: 496 bytes)

julia> @btime sleep(1e-6)
  1.152 ms (4 allocations: 112 bytes)

without blocking the thread, and also doesn’t require the hand-rolled @spawn trick.
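To see the "without blocking" part in action, here is a small sketch, assuming a POSIX system where usleep is available. The names `threadcall_sleep`, `ticks`, and `running` are hypothetical; the background task only counts how often it gets scheduled while the main task sleeps.

```julia
# Sleep via @threadcall, using usleep's C signature `int usleep(useconds_t)`.
function threadcall_sleep(seconds::Real)
    microseconds = round(Cuint, 1e6 * seconds)
    @threadcall(:usleep, Cint, (Cuint,), microseconds)
    return nothing
end

ticks = Ref(0)
running = Ref(true)

# Background task on the same thread; it can only make progress if the
# sleeping task gives up control to the scheduler.
background = @async while running[]
    ticks[] += 1
    yield()
end

threadcall_sleep(0.05)   # usleep runs in the libuv threadpool meanwhile
running[] = false
wait(background)
println("background task ran ", ticks[], " times during the sleep")
```

If the same loop were run with Libc.systemsleep(0.05) on a single-threaded session, the background task would not get scheduled until the sleep returned.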

(This seems like a nice addition to Base maybe?)

The one downside of this is:

@threadcall may be removed/changed in future versions of Julia.

So just be wary.

Formal feature request: Feature request: Finer-grained sleep function · Issue #54971 · JuliaLang/julia · GitHub

I made a similar request years ago, which is still open: Accuracy and resolution of sleep() on Linux should be improved · Issue #12770 · JuliaLang/julia · GitHub

Thanks!

(slightly ironic that an issue about sleep being slow has been sitting idle for 9 years :smile:)

We’re really sleeping on fixing that issue, but no one knows for how long.
