Is there a possibility to prevent task switches in a block of code?

jling · November 27, 2023, 2:52pm

Please link to the documentation that says task switching only happens in a limited list of functions. It doesn’t make any such guarantee, because that is hard to do, and that’s precisely the point here.

schlichtanders · November 27, 2023, 2:56pm

Your argument seems to go like this:
“you cannot disprove me, hence I am right”

just because task-switching is difficult (I don’t know whether it is difficult actually), it still does not mean that Julia works like you are saying

Benny · November 27, 2023, 2:59pm

I thought thread spawning was based on tasks so task-related issues should apply to both. The blogpost seems to affirm that yielding does happen at deterministic points, it’s just hard to keep track of so many independent factors, let alone make any guarantee.

Don’t try to reason about yielding

Many existing uses of thread local state happen to be relatively robust and give correct answers only because the functions they are calling during execution do not yield. One may then think “well, I can just avoid this problem by making sure my code doesn’t yield”, but we think this is a bad and unsustainable idea, because whether or not a function call will yield is not stable, obvious, or easily inspectable.

For instance, if a function f is updated to include a background @debug statement or other forms of non-user-visible IO, it may change from being non-yielding to yielding. If during a call to f, the compiler encounters a dynamic dispatch where new code must be JIT compiled, a yield-point may be encountered, and any number of other internal changes could happen to code which can cause it to yield.

Could that go deep enough to remove all task switches in general in a block? Or could that break printing and such?
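
To make the quoted passage concrete, here is a minimal sketch (not taken from the blog post) of an unsynchronized read-modify-write that loses updates once a yield point appears in the middle. The explicit `yield()` is an assumption standing in for any hidden yield such as IO or a `@debug` statement, and `racy_increment!` is a hypothetical name.

```julia
# Hypothetical helper: n tasks each try to add 1 to `counter`, without any lock.
function racy_increment!(counter::Ref{Int}, n::Int)
    @sync for _ in 1:n
        Threads.@spawn begin
            old = counter[]        # read
            yield()                # stand-in for a hidden yield point (IO, logging, ...)
            counter[] = old + 1    # write based on a possibly stale value
        end
    end
    return counter[]
end

result = racy_increment!(Ref(0), 100)
println("wanted 100, got ", result)   # often fewer than 100, even with --threads=1
```

Removing the `yield()` may make the result look correct on a single thread, but only by accident, which is exactly the kind of reasoning the quote warns against.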

schlichtanders · November 27, 2023, 3:03pm

Thank you so much for quoting these important passages! I indeed hadn’t read them.

My usecase is to interact with git - i.e. some external system.

So I guess the julian approach to make sure that several git commands are actually executed one after the other is to have a global git lock which you acquire and release.
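
A rough sketch of what such a global lock could look like, assuming `git` is on the PATH. `GIT_LOCK`, `with_git_lock` and `status_and_head` are hypothetical names chosen for illustration; `lock(f, l)`, `read` and `readchomp` are regular Base functions.

```julia
# Hypothetical global lock guarding every interaction with the git repository.
const GIT_LOCK = ReentrantLock()

# Run `f` while holding the lock, so that a whole sequence of git commands
# cannot interleave with another task's sequence.
with_git_lock(f) = lock(f, GIT_LOCK)

# Example sequence (assumes `repo` is the path of an existing git repository).
function status_and_head(repo::AbstractString)
    with_git_lock() do
        status = read(`git -C $repo status --porcelain`, String)
        head = readchomp(`git -C $repo rev-parse HEAD`)
        (status, head)
    end
end
```

The tasks still yield while they wait for git to finish. The lock only ensures that no other task holding the same lock runs its own git sequence in between, so it protects nothing if some code path calls git without going through `with_git_lock`.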

jling · November 27, 2023, 3:04pm

Sorry if that sounds like my argument, and sorry for not being expert enough in the Julia source code.

There is no documentation, only discussions I recall and the Julia source code if you want “proof”. I will defer this question to other experts you may find more trustworthy than me.

jling · November 27, 2023, 3:05pm

julia> f(i) = (read(`git --version`, String); return i)

julia> read(`git --version`, String)
"git version 2.43.0\n"

julia> let state = zeros(Int, Threads.nthreads()), N=100
           @sync for i ∈ 1:N
               @async state[Threads.threadid()] += f(i)
           end
           sum(state), sum(1:N)
       end
(100, 5050)

Sukera · November 27, 2023, 3:08pm

This isn’t really a Julia specific issue, this is a regular concurrency problem. You have a shared resource (the external git repo) that must not be modified concurrently. I don’t know whether git itself protects that resource or not (likely not, since it’s fair to assume that only one git invocation runs at a given time), so to be safe, you’d need a lock or other exclusive-ownership mechanism in any language. Not to mention that tasks waiting on the result of an external command by default wait and thus yield to other tasks, thereby allowing other tasks to spawn their own git command that they then wait on.
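
Besides a lock, one possible exclusive-ownership mechanism is a single "owner" task that is the only one allowed to touch the resource, with all other tasks handing it work through a `Channel`. The sketch below is an assumption about how that could be arranged, not something proposed in this thread: `start_owner` and `submit` are hypothetical names, `sleep` stands in for waiting on an external command, and error handling is left out.

```julia
# Hypothetical "owner" task: the only task that ever touches the shared resource.
# Other tasks send it closures through a Channel and wait for the reply.
function start_owner()
    jobs = Channel{Tuple{Function,Channel{Any}}}(32)
    worker = Threads.@spawn for (job, reply) in jobs
        put!(reply, job())      # jobs run strictly one after another
    end
    return jobs, worker
end

# Submit a job and wait (yielding to other tasks) until the owner has run it.
# Note: if `job` throws, the owner task dies; a real version must handle that.
function submit(jobs, job::Function)
    reply = Channel{Any}(1)
    put!(jobs, (job, reply))
    return take!(reply)
end

jobs, worker = start_owner()
tasks = map(1:5) do i
    Threads.@spawn submit(jobs, () -> (sleep(0.01); i^2))
end
results = fetch.(tasks)
close(jobs)      # lets the owner's loop finish once the queue is drained
wait(worker)
println(results == [1, 4, 9, 16, 25])   # true
```

The submitting tasks are free to yield as often as they like; only the order in which jobs reach the owner decides the order in which the resource is used.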

There are a number of such concurrency problems that are NOT caused by multithreading, but rather only exposed more easily by it. The underlying issue is protection of that shared resource when two or more tasks access the resource concurrently, which is not necessarily the same as simultaneously.

schlichtanders · November 27, 2023, 3:12pm

Let me summarize what I learned:

- it might be possible to write something that prevents task switches, since not every Julia function yields, though quite a lot of them do (maybe there is some underlying task-switch method being called)
- the general Julia recommendation is not to rely on such prevent_taskswitch logic (which does not exist yet), but rather to write code that is agnostic to task switches

For the application of interacting with external systems it still might be helpful to have a prevent_taskswitch wrapper.

As long as this does not exist the only alternative I can think of is locking.

jling · November 27, 2023, 3:14pm

Pretty much yea. Although you might also be interested in task_local_storage depending on what your application is.
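
For reference, a small sketch of `task_local_storage` used as a per-task accumulator. The key `:acc` and the function `sum_in_task` are made up for illustration; the explicit `yield()` only shows that task switches do not disturb state that belongs to a single task.

```julia
# Each task keeps its own accumulator under the (hypothetical) key :acc.
function sum_in_task(xs)
    task_local_storage(:acc, 0)      # initialize this task's private slot
    for x in xs
        yield()                      # a task switch here is harmless
        task_local_storage(:acc, task_local_storage(:acc) + x)
    end
    return task_local_storage(:acc)
end

tasks = map([1:50, 51:100]) do r
    Threads.@spawn sum_in_task(r)
end
total = sum(fetch, tasks)
println(total)   # 5050
```

This helps when the state is owned by one task. It does not protect something that several tasks share, such as an external git repository.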

Mason · November 27, 2023, 3:14pm

It’s a bit more subtle here actually, because @async does something nasty to avoid task migration. See in particular:

help?> @async
  @async


  Wrap an expression in a Task and add it to the local machine's scheduler queue.

  Values can be interpolated into @async via $, which copies the value directly into the
  constructed underlying closure. This allows you to insert the value of a variable, isolating
  the asynchronous code from changes to the variable's value in the current task.

  │ Warning
  │
  │  It is strongly encouraged to favor Threads.@spawn over @async always even when no
  │  parallelism is required especially in publicly distributed libraries. This is
  │  because a use of @async disables the migration of the parent task across worker
  │  threads in the current implementation of Julia. Thus, seemingly innocent use of
  │  @async in a library function can have a large impact on the performance of very
  │  different parts of user applications.

So @async actually does protect from task migration in a way that @spawn doesn’t, but it’s also not as good at this as the docstring warning might have one believe (as @jling showed via sleep), and such guard rails should not be trusted anyway.

It’s like running into a window on a high building and saying “what’s the problem? The window will catch me.”

Mason · November 27, 2023, 3:15pm

I beg you to just read the blogpost. We cover this exactly

schlichtanders · November 27, 2023, 3:17pm

I went over the blogpost and it recommends task local state to solve the problem

that is just not a possibility when interacting with an external system like git

Benny · November 27, 2023, 3:17pm

Wait, I thought task migration was about a task being scheduled on a possibly different thread when restarted. Even if it doesn’t migrate, pausing and restarting are still task switches, right? Or am I misunderstanding things horribly?

Oh nvm, it just clicked, the topic there is thread-local state, not task switches.

Mason · November 27, 2023, 3:21pm

This is why in the original version of the blogpost, I wanted to demonstrate the bug with @async, not @spawn, but we went with @spawn because we explicitly document that we don’t want users to use @async ~~@spawn~~. In the current iteration of the blogpost, you’ll notice however that we demonstrate the bug using julia --threads=1.

That’s because multithreading is actually irrelevant to this problem. It’s a concurrency problem, not a problem directly to do with multithreading. Multithreading just makes it manifest more because people assume that the number of concurrent tasks will equal the number of threads, and that tasks won’t migrate to new ~~task~~ thread IDs.
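
One common way to sidestep the `state[Threads.threadid()]` pattern is to give each task its own chunk of work and its own local accumulator, then combine the results with `fetch`. The following is a sketch under assumptions: `chunked_sum` is a hypothetical name, and `f` uses `sleep` as a stand-in for the git call so that it runs without git while still yielding.

```julia
# Stand-in for the earlier `f`: `sleep` replaces the git call but, like waiting
# on a process, it yields to other tasks.
f(i) = (sleep(0.001); i)

function chunked_sum(N; ntasks = Threads.nthreads())
    chunks = Iterators.partition(1:N, cld(N, ntasks))
    tasks = map(chunks) do chunk
        Threads.@spawn begin
            acc = 0              # local to this task, never shared
            for i in chunk
                acc += f(i)      # yielding inside f is harmless now
            end
            acc
        end
    end
    return sum(fetch, tasks)
end

println(chunked_sum(100) == sum(1:100))   # true
```

Nothing here depends on which thread a task runs on or on how many tasks are in flight, so it behaves the same with `julia --threads=1` as with many threads.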

Sukera · November 27, 2023, 3:22pm

I think you mean thread IDs there

Benny · November 27, 2023, 3:23pm

Wait, do you mean “don’t want users to use @async” there? That’s what the blogpost reads to me

Mason · November 27, 2023, 3:23pm

oops, yeah

Mason · November 27, 2023, 3:24pm

The Julia docs have explicitly said not to use @async anywhere for a long time now. It only exists for backwards compatibility.

Sukera · November 27, 2023, 3:24pm

You can pause & restart your task as much as you want, as long as you protect your shared resource from other tasks that would also like to modify that resource for as long as you need it to be protected for. If that means hiding the resource behind a lock until the original task exits, then that’s what you need to do.

Mason · November 27, 2023, 3:24pm

That is simply untrue. If you post a dummy example we can help show you how to use concurrency-safe patterns here.
