Bug in sleep() function - main thread work affecting sleep duration on running tasks

ofc

I got 10s again; I can only increase it if I run mainspin 2 for 11s or more to stall the sleeper 2 task. What’s your code for the unique 10s sleeper? Here’s mine:

using Base.Threads: threadid

function sleeper(s, id; libc=false)
    @info "sleeper $id executing on thread $(threadid())"
    counter = 0
    @time while counter < 1                    ## changed: a single timed iteration
        libc ? Libc.systemsleep(s) : sleep(s)  ## changed: optionally use the blocking libc sleep
        counter += 1
    end
    @info "sleeper $id finished"
end
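
(For context, my guess at the invocation behind the numbers being compared - mainspin as defined in the next post, parameters per the exchange above, assuming using Base.Threads:)

@spawn :interactive sleeper(10, 2)   # one 10s sleep; @time reports ~10s…
mainspin(11, 2)                      # …unless main spins for 11s or more, which stalls it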

It has nothing to do with a 10-second timespan: any configuration that showcases the issue will do.

We can change the experiment from sleep altogether if you want:

  1. Start a non-blocking HTTP.jl server (which is scheduled on :interactive).
  2. Run bursts of work on the main thread (putting mainspin(5, 1, steps=20) in a loop might do the job - see the updated mainspin below).
  3. Try to get a response from that server - the responsiveness will depend heavily on when you hit the server (e.g., for mainspin(5, 1, steps=20) you’ll get delays depending on what the main thread is doing). Not something you would expect from a task running on the :interactive threadpool.

using Base.Threads, Dates
using HTTP

function mainspin(s, id; steps=1)
    @info "mainspin $id executing on thread $(threadid())"
    ms = Millisecond(s * 1000)

    r = 0.0
    for _ in 1:steps
        t = now()
        @time while (now() - t) < ms   # busy-spin on the current thread for s seconds
            r += rand()
        end
        sleep(1)                       # short pause between bursts
    end
    @info "mainspin $id finished"
end

HTTP.serve!(Returns(HTTP.Response("ok")))  # non-blocking; listens on 127.0.0.1:8081 by default

# `steps` is not relevant in itself - it only buys you time to
# play around with the server. Not counting the sleep in `mainspin`,
# we will have about 100 seconds of main-thread work here
# (and 200 seconds if we do mainspin(10, 1, steps=20)).
mainspin(5, 1, steps=20)


# now go to the terminal and play around:
# curl 127.0.0.1:8081
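
You can also probe the server from a second Julia session instead of curl (8081 being HTTP.serve!’s default port; this snippet is just an illustration, not part of the original MWE):

using HTTP
@time HTTP.get("http://127.0.0.1:8081")   # response time varies with the main-thread bursts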

Aah, gotcha!

I think that is to be expected though :thinking: There’s certainly an argument to be made for separating the scheduler from the :interactive thread(s), but if you saturate all interactive threads, I would expect tasks also running on the interactive threads to stall. That’s the downside of cooperative multitasking in Julia, in contrast to preemptive multitasking (which, e.g., the Linux kernel does).

There have been wishes for preemption in the past though, so it might become a thing some day. I don’t know of any concrete efforts towards that yet. Some arguments against that are that some programs may rely on cooperative multitasking for correctness, or that it can introduce overhead in programs that expect to have exclusive access to all threads.

But this is not the case in the MWE - because even if I start with 12 interactive threads and schedule a single task on :interactive (which mostly sleeps), the work on the main thread will still impact the working task (which can migrate to any of the :interactive threads). And, for additional clarity: in the MWE, no other work is done besides the main-thread-related work and the actual realworker - which behaves OK. realworker is only added as a control: realworker behaves correctly, while sleeper depends on the main thread.

That’s definitely a bug then; please do open an issue at Issues · JuliaLang/julia · GitHub

I think this explains what is going on:

I’d had a Genie-related issue for weeks now - and I wasn’t able to track it down until yesterday, when I tried to help solve this issue. I assumed that the complexity of my project and the difficulty of writing proper tests to catch the responsiveness issue was on me - and I was in the process of refactoring stuff.

~10 hours later - trying to invalidate my conclusions (I initially assumed it was just about communication between tasks, so I did invalidate some) - I arrived at an even more upsetting conclusion: you cannot do proper (or predictable/interactive) multithreading in Julia if your project involves medium-heavy work on main and your tasks somehow use libuv. Just run my HTTP.jl MWE from the post above.

So - it is hard to say how many Julia packages are poisoned in this way:

  1. It is not straightforward to determine what tests should be run to detect the issue.
  2. Digging into the source code of everything you use is pretty time-consuming (language users should not need to do such things).
  3. Many toy examples from documentation will not reveal the issue.

For me, it is just incredible that this is happening. Imagine not being able to properly run a server on a different thread because its responsiveness is conditioned on not doing anything significant on the main thread.

2 Likes

I wholeheartedly agree with you - these kinds of things need to be fixed if people want Julia to expand beyond HPC & scientific computing (though I imagine this can also happen in those domains).

1 Like

Agreed this should be fixed. But for a server, does the problem go away if you just spawn a new thread for any nontrivial work? Like the main task could just literally be

function main()
    spawn_stuff()    # launch all nontrivial work on other tasks/threads
    wait_til_done()  # the main task itself only blocks here
end

Or is that not sufficient?

1 Like

That would probably work around it, but you have to be very careful not to run any non-trivial computation on the main thread, which can be very annoying.
It’s similar to the advice to not block the thread running the event-loop in GUI programming. Usually, the event loop should run in its own thread though, i.e., not on main.

2 Likes

In Julia, as I understand it, there’s sometimes confusing terminology. People talk about spawning threads with @spawn, for example. Unless I’m mistaken:

  1. @spawn creates tasks, which can run on any thread
  2. Thread migration was introduced a while back (Julia 1.7, I believe)
  3. If you create a task that can migrate, you don’t really have control over which thread that task runs on (see the snippet below)
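
For concreteness, an illustrative way to check where a task actually lands (@spawn :interactive and Threads.threadpool are available since Julia 1.9):

using Base.Threads
t = @spawn :interactive begin
    @info "task running" threadid() Threads.threadpool()
end
wait(t)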

Given all of that, is the issue that if you have a high-compute task and it gets scheduled onto thread number 1, everything borks? Or is it that if you do computation in the task running the REPL, it borks?

2 Likes

Good point. As I understand the discussion, everything borks if the task happens to run on the thread that’s listening to events – which happens to be the main one, i.e., the one running all top-level and single-threaded code?

2 Likes

The main thread is part of the interactive threadpool, so the best advice is perhaps to schedule computationally intensive tasks on the default threadpool instead (they can migrate within that threadpool to improve load balance, but won’t block the interactive one).
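
In code, that advice is simply the following (compute_heavy and handle_request are placeholder names, not from the thread):

using Base.Threads
compute_heavy()  = sum(rand() for _ in 1:10^8)   # placeholder: CPU-bound work
handle_request() = sleep(0.01)                   # placeholder: latency-sensitive work

@spawn compute_heavy()                # :default pool (the @spawn default); can migrate within it
@spawn :interactive handle_request()  # :interactive pool, which the main thread belongs to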

2 Likes

You are right - however, in the current scenario (MWE running with nthreads > 1), the actual behavior looks as if a task spawned on a free interactive thread migrates to the main thread, where intensive work is being done (I tested it even with 12 threads - same result). Obviously, I am not saying that is literally what happens.

I am pretty confident that in the current MWE scenarios (both the one in the OP and the server-related one) it is not just a misuse/misinterpretation of the @spawn concept.

I think we agree that not having control over where a task will run doesn’t imply that the task will migrate to the busiest thread when it could migrate to any of the other 10 free threads. In the MWEs’ context, although migration is not actually what happens, the responsiveness of tasks running on a thread other than the main one does depend on the main thread being free (but keep the MWE context in mind - I am not saying that every task behaves the same).

1 Like

I agree with this - however, if I have 12 interactive threads and my task can migrate to any of them, I wouldn’t expect the responsiveness of that task to depend on the main thread.

I agree that we can do some common-sense reconfiguration of the MWE and keep everything responsive. But again, the issue is that certain types of tasks depend on the main thread (regardless of whether they run on the :interactive or :default threadpool). We can work around the issue, but it still brings a pretty extreme set of restrictions on how we should design programs in Julia.

1 Like

I think, at this point, it is not so relevant if we can or cannot find a workaround for the specific issue illustrated by the MWEs (we can).

The general advice of keeping the main thread free for anything but trivial computations might be best practice, but we can still get into serious trouble.

Consider the following MWE:

using Base.Threads

function sleeper(s, c)
    counter = 0
    @time while counter < c
        sleep(s)
        counter += 1
    end
end

# 1000 tasks, each doing 1000 sleeps of 0.01s (~10s of sleep per task)
for _ in 1:1000
    @spawn :interactive sleeper(0.01, 1000)
end

readline()

You’ll notice that, with the particular parameters I used above, the average time for running each while loop is about 13.2 seconds (expected: ~10 seconds, i.e., 1000 iterations × 0.01s of sleep). And the GC time cannot explain the additional 30+%.

That additional time comes from the `sleep` battle on the main thread (notice that I do not run any computation on the main thread).

Spawn 10,000 sleeping tasks, and the expected 10 seconds skyrockets to over 20 seconds (and 5% GC cannot explain a 2x penalty).
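
(Presumably the same loop with a larger count, i.e.:)

for _ in 1:10_000
    @spawn :interactive sleeper(0.01, 1000)
end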

Now imagine that HTTP.jl spawns a new task for each connection: those tasks are all doing this kind of main-thread-dependent stuff, constantly.

So no - keeping the main thread free of serious computation will help, but the results still look very sad.

You can have a big CPU with lots of threads, and Julia will get worse as the number of threads increases.

On my machine, running the example above with 10,000 tasks and 4 threads results in around 20 seconds per task. However, if I run the exact same code with 12 interactive threads, the total time per task jumps to over 30 seconds.

A kind of “give it more resources, get worse results” situation.

I underline the fact that the main thread is not running any computation other than initially spawning the tasks.

And here is an even crazier thing: you actually need to downgrade your server resources as the number of connections/users increases - because you will get a less responsive server as you scale up the number of threads.

Have you opened an issue on the repo with this as the minimal reproducible example yet? I think it’s quite important that this is getting tracked there.

5 Likes

Trying to catch up: if I’m understanding this correctly, Julia’s scheduler only runs on the main task, so the issue is that while the main task is too busy to schedule, it stalls tasks in the queue that are waiting for a thread? I never knew much about the implementation details, and the docs don’t really lay this stuff out.

Again, I don’t quite know the implementation details, but IIRC from the 1.9 highlights, this might be a problem? Before 1.9, every thread was “default”, and an available thread took whatever task was at the front of the queue. Now there are 2 pools of threads, default vs. interactive, and you schedule a task to either pool. AFAIK there’s nothing different about the 2 pools; fewer tasks in a pool is just more responsive, so you can separate out the few you need.

Spawning 1000 tasks into the interactive pool seems like it’d defeat the purpose. I think that’s why (and I can replicate it) the @time reports increase with the number of tasks. I don’t think it’s unexpected for 1000 0.01s sleeps to end up taking >10s; sleep is documented to block the task for a specified time, but that does not imply the task starts up after that time, only that it’s ready to take the next opportunity. The more often tasks yield back to the queue (good for interactivity) and the more tasks in the queue (bad for interactivity), the more work the scheduler does to find the next ready task, and that adds up. Replacing sleeper(0.01, 1000) with sleeper(10.0, 1) brings the @time from 11.852s to 10.014s because it does less of that work.
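
Using the sleeper from the MWE above, that comparison boils down to (timings as reported above; both do 10s of total sleep per task):

@spawn :interactive sleeper(0.01, 1000)  # 1000 scheduler round-trips: @time ≈ 11.852s
@spawn :interactive sleeper(10.0, 1)     # 1 scheduler round-trip:     @time ≈ 10.014s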

You should also say how many default and interactive threads your Julia process was set to, nthreads(:default) and nthreads(:interactive).
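
For example (the M,N form of --threads and the pool-aware nthreads exist since Julia 1.9):

# start Julia with 4 default + 2 interactive threads:
#   julia --threads 4,2
using Base.Threads
nthreads(:default), nthreads(:interactive)   # -> (4, 2)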

Not quite - imagine that everything is scheduled and the only “work” done by the tasks is to sleep repeatedly for short periods of time. The problem is that the sleep function (actually its low-level implementation) needs to access the main thread - and now you have a big number of tasks trying to acquire the same lock at the same time (I think it is a spin lock). This is also why increasing the number of threads generates even worse results.
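
The distinction being drawn here can be seen with the two kinds of sleep already used in this thread (the first post’s sleeper switches between them via its libc flag; comments reflect the explanation above):

sleep(0.01)             # parks the task via a libuv Timer - goes through the event loop and its lock
Libc.systemsleep(0.01)  # blocks the OS thread directly - no libuv, no scheduler involvement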

The MWE can also be changed by adding a Ref{Bool} and only switching it to true after we know that all the tasks have been scheduled (and by doing this, we get even worse results, because before the change the initial tasks had less competition - since there were still some tasks waiting to be scheduled). The mixed MWE below does exactly this.

I think this also answers this: “that does not imply the task starts up after that time, only that it’s ready to take the next opportunity.”

Spawning on :default actually produces 3-5x worse results than on :interactive.

Actually, you can decrease the number of tasks and shorten only the sleep interval and still get the same penalty.
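
Something like the following, presumably (hypothetical parameters, not the poster’s exact ones):

for _ in 1:100
    @spawn :interactive sleeper(0.001, 1000)   # fewer tasks, 10x shorter sleeps
end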

I tried various combinations of :default/:interactive (with at most 12 threads) - the best results were achieved with a high number of interactive threads.

I agree that this is the case in general - but I don’t observe it here. Especially given that if I keep the tasks + yielding constant and only increase the available threads, things get worse. Also, keeping the tasks + threads constant but increasing the yields does not improve interactivity - quite the contrary.

Maybe I’ll find the time to actually make some graphs of all this (also comparing regular yield vs. sleep).
Also, please be aware that doing actual work for the desired interval instead of sleeping produces predictable results - and the scheduler has the same to-do list (e.g., finding the next task).

1 Like

Another interesting experiment is to mix “real” workers with sleepers.

And the result is revealing: the same number of yields - but the real workers are doing well. The sleep users… not so well:

If the real problem were related to the scheduler’s work, you would expect the other yielders to be impacted as well. But we cannot see that.

using Base.Threads, Dates

@info "Started with $(nthreads()) threads"

const ison = Ref{Bool}(false)   # gate: tasks only start their timed loops once this flips to true

function sleeper(s, c)
    while !ison[]
        sleep(1)
    end
    counter = 0
    @time while counter < c
        sleep(s)
        counter += 1
    end
end

function worker(s, c)
    while !ison[]
        sleep(1)
    end
    counter = 0
    x = 0.0
    @time while counter < c
        t = now()
        while (now() - t) < Millisecond(s * 1000)   # busy-work for s seconds
            x += rand()
        end
        yield()   # one yield per iteration, matching a sleeper iteration
        counter += 1
    end
    @info "worker finished."
end

# every 100th task is a real worker; the rest are sleepers
for i in 1:1000
    if i % 100 == 0
        @spawn :interactive worker(0.01, 1000)
    else
        @spawn :interactive sleeper(0.01, 1000)
    end
end

sleep(10)       # give all tasks time to get scheduled

ison[] = true   # release everything at once

readline()

1 Like

Was this the benchmark with @spawn (which defaults to :default) or @spawn :interactive? This needs clarification, given the various benchmarks showing up.

Not sure why this is - maybe with more threads yielding, the scheduler has to do more work? I’d expect the @time to be worse as tasks wait more, but the overall time (maybe add a big @time around everything) to be less as more tasks run at the same time.

Makes sense - more yields is just one condition for more interactivity, and it’s not an independent factor. When many tasks have to compete for a few threads, more yields just means more waiting.

I think it’s because workers are treated differently from sleepers. When a thread finds a worker in the queue after its 1s big sleep, it can just take it. When a thread finds a sleeper after its 1s big sleep, it has to check whether its little iterative sleeps are over, and if not, it pushes it to the back of the queue and checks the next one.