I somewhat sympathize with @MilesCranmer here about deprecating the threadid() construct.
Actually, my understanding is that using threadid() is safe provided that one uses the static scheduler (please correct me if I am wrong, because this is the strategy I used to mitigate the problem in two of my packages):
Threads.@threads :static for a in 1:N
    counter[Threads.threadid()] = ...
end
A possibility would be to raise a warning/deprecation only if one is NOT using the :static scheduler (again, provided that :static makes the use of threadid() safe). I do not know how feasible it would be to raise a conditional warning depending on the choice of scheduler, however.
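For concreteness, here is a minimal runnable sketch of the :static pattern described above. The function name `count_static` is hypothetical, and sizing the buffer with `Threads.maxthreadid()` assumes Julia 1.9 or later (it is used here because thread ids can exceed `Threads.nthreads()` when interactive threads are present). It relies on the documented behaviour that `threadid()` stays constant within one iteration under `:static`; it is an illustration, not a recommendation.

```julia
# Hypothetical helper illustrating the :static + threadid() pattern.
function count_static(N)
    # One slot per possible thread id (assumes Julia >= 1.9 for maxthreadid()).
    counter = zeros(Int, Threads.maxthreadid())
    Threads.@threads :static for a in 1:N
        # Under :static, each task is pinned to one thread, so this slot
        # is only ever touched by a single task during the loop.
        counter[Threads.threadid()] += 1
    end
    return sum(counter)
end

count_static(1000)
```

Note that the same body with the default scheduler would not carry this guarantee, since tasks may then migrate between threads.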
but it was never the API. I understand there were official sources recommending this pattern, but nonetheless this is not a breaking change in API we're talking about. it's simply buggy code.
I don't mean for this to become heated. I don't think you're crazy for wanting to deprecate threadid, I just don't happen to agree
you think the average user is supposed to infer that the creator of Julia was incorrect on an official blogpost?
absolutely not, I never said that.
let me list the things I agree with you on
The docstring of threadid (old and new alike) is rather curt & vague. especially before the note was added it would have been pretty hard for anybody without a good understanding of Julia's threading model to know what to assume is meant exactly
multithreaded code, in general, is really complicated and easy to get wrong especially in the absence of tools that can hold the users' hands
once code patterns are out there, they are very sticky and it can be very hard to effect community-wide changes
recommended usage patterns of threadid, from official sources, actively deceived users into writing buggy code
the previous fact is really REALLY unfortunate
All these facts are pretty indisputable. AND.
all that being said, I do not share your conclusion that deprecating threadid is a good course of action. again that's just my opinion and it's possible I'm in the minority here, but I'd rather not remove a (now) correctly-documented and correctly-implemented tool on the basis that its usage was previously taught incorrectly.
I don't think anyone here is disagreeing about the general state of affairs. It is indeed very unfortunate and truly is well described as a trolley problem. And just like any good trolley problem, everyone's perspective on how to best throw any switch will vary based upon what you're able to see, which side of the tracks you personally find yourself tied to, your perspective on the other downstream effects, and where you want to find the trolley at the end of the day. It's a tradeoff and your weighings of the outcomes may vary. It's valuable to also understand that others' weighings of the same situation may be different than yours, even if everyone is standing at the exact same spot with the exact same visibility.
Go's choice here to forbid even looking at thread ids is quite fitting with its own worldview.
Unlike trolley problems, we can also build new tracks; we're not limited to a binary choice here. Even better: some options aren't even mutually exclusive.
Thanks for the great summary. And glad to hear you appreciated the metaphor
Do you have any ideas for the "new tracks" here? In my view the best overall option (though somewhat controversial) is the logic here. A warning in the linter could also be a decent start since it is more aggressive and "in your face" than a blogpost or docstring update; it would indeed start pushing this into common Julia knowledge. It still wouldn't warn on legacy code, but I do think it might save a fair chunk of all future casualties.
Just want to emphasize the nuance here that @spawn is a perfectly legitimate tool and not a code smell on its own, but it's low-level, and if you're using it to multithread a loop you should probably combine it with manual chunking/work scheduling (e.g., using ChunkSplitters.jl). Though as @ericphanson pointed out, this is less important if each iteration takes a long time anyway (say, idk, several tens of microseconds or more).
Or maybe we can infer another heuristic from this thread: if you're annoyed that thread_local_storage() can't persist/transfer between tasks, perhaps you should reduce the number of tasks and give each of them more work.
For commonly used patterns, OhMyThreads.jl is the go-to library for pre-packaged variants so you don't have to write the boilerplate yourself.
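As a rough illustration of the manual chunking mentioned above, here is a Base-only sketch. The name `chunked_sum` and the `ntasks` keyword are hypothetical and are not the API of ChunkSplitters.jl or OhMyThreads.jl; the sketch also assumes `xs` is a non-empty indexable collection.

```julia
# Hypothetical helper: reduce over `xs` with one task per chunk instead of
# one task per element. Assumes `xs` is non-empty.
function chunked_sum(f, xs; ntasks = Threads.nthreads())
    # Split the indices into at most `ntasks` contiguous chunks.
    chunk_len = max(1, cld(length(xs), ntasks))
    chunks = Iterators.partition(eachindex(xs), chunk_len)
    # Each task reduces its own chunk into a task-local result,
    # so no threadid()-indexed buffer is needed.
    tasks = map(chunks) do idxs
        Threads.@spawn sum(f, view(xs, idxs))
    end
    # Combine the per-chunk results on the calling task.
    return sum(fetch, tasks)
end

chunked_sum(abs2, collect(1:100))
```

The point of the design is that state lives in the task rather than in a per-thread slot, so task migration cannot corrupt it; fewer, larger tasks also keep the per-task overhead small.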
Yes, StaticLint.jl was my main thought and I'm sure it would be quite noncontroversial. Your suggestion might work, but that's why I posted my archaeology note above; straight deprecations have already been tried. The core question is if this would reduce false-positives enough to make the evaluation any different than the last times this has been tried. I don't know the answer, don't have a strong opinion, and don't hold power here.
I understand this sentiment to the extent that you feel like @threads implicitly encourages incorrect threadid()-based caching patterns that people need to be steered away from. However, I haven't found a better tool for parallelizing simple loops that don't need thread-safe cache. It doesn't add a dependency, and it has lower overhead than anything I've tried except Polyester.jl (which has its own issues with composability). I don't think a blanket warning against @threads is warranted, especially if it drives people towards naive use of @spawn with loops instead.
I think the appropriate admonitions are approximately:
Never use threadid()
If @threads works for you, great, if not, use OhMyThreads.jl
If you're tempted to use @spawn with loops, you should probably try OhMyThreads.jl first
If you still want to use @spawn with loops, read the blog post first, and probably also the ChunkSplitters.jl docs
Outside the context of parallelizing loops, use @spawn to your heart's content the way you would use @async
For what it's worth, my point isn't that people shouldn't ever use @threads. My point is just that it's a bad API. If it does what you need to do, then great, but it's generally limiting, and encourages bad practices which is frustrating.
I think this discussion boils down to whom the language should accommodate. Personally, I'm perfectly fine with a language which allows me to write buggy code. I've been writing parallel code since VMS 5 (ca. 1988), in C, Pascal, BLISS-32, you name it, and believe that anyone writing parallel code anyway must learn some basic synchronization techniques. In Julia's case, also how tasks and threads interact, or may interact in the future. Automatic parallelization and/or synchronization has been a pipe dream for 40 years (since the Connection Machine, or before). Babysitting the developer by blocking potentially dangerous practices won't help anyone.
This is kind of a dismissive characterization of the progress various languages have made in the past decades towards making it harder to accidentally write code that doesn't work. Even though Julia very much has an approach of letting people do anything they need to, we try very hard to avoid making it easy to do the wrong thing unintentionally.
This is a kind of unfortunate API in that it's easy to misuse, but there's no simple change that makes it that much harder to misuse: asking for the ID of a thread or a task just isn't a good fit for the way threading works in Julia. We let you ask for the thread ID because it is a thing you could potentially want to know, but since tasks can migrate threads, it doesn't work the way many people expect it to.
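To make the migration point concrete, here is a small sketch that records threadid() before and after a yield point inside spawned tasks. The function name `count_migrations` is hypothetical, and how many tasks actually move depends on the Julia version, the number of threads, and scheduling, so no particular result should be expected.

```julia
# Hypothetical helper: count how many tasks observed a different thread id
# after a yield point than before it.
function count_migrations(ntasks)
    tasks = map(1:ntasks) do _
        Threads.@spawn begin
            before = Threads.threadid()
            sleep(0.001)  # yield point; the scheduler may resume the task on another thread
            after = Threads.threadid()
            before != after
        end
    end
    # Each task returns a Bool; count the ones that changed threads.
    return count(fetch, tasks)
end

count_migrations(100)
```

Any nonzero result means a value read from threadid() early in a task was no longer the id of the thread running that task later on, which is the situation that breaks threadid()-indexed buffers.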
Sorry to jump into the middle of this, but is Threads.@threads really inefficient? I've been using it a bit recently but didn't realise it was that bad. Is OhMyThreads.jl a direct replacement and more efficient?
No, Threads.@threads is not inefficient, actually quite the opposite. It may be flawed design-wise, but performance-wise it has less overhead than most alternatives.
The pattern that @Mason is calling "very very very inefficient" is the following:
In code:
@sync for i in 1:huge_number
    Threads.@spawn begin
        small_fast_loop_body(i)
    end
end
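For contrast, a minimal sketch of the same kind of loop written with Threads.@threads. The two-argument `small_fast_loop_body` below is a hypothetical stand-in for the cheap per-iteration work, and `run_with_threads` is likewise an illustrative name. As far as I understand the current implementation, @threads splits the range into a few contiguous chunks, so the task-creation cost is paid per chunk rather than per iteration.

```julia
# Hypothetical stand-in for cheap per-iteration work: writes one output slot.
small_fast_loop_body(out, i) = (out[i] = 2i)

function run_with_threads(n)
    out = Vector{Int}(undef, n)
    # A handful of tasks share the n iterations between them,
    # instead of one spawned task per iteration.
    Threads.@threads for i in 1:n
        small_fast_loop_body(out, i)  # each iteration touches only its own slot
    end
    return out
end

run_with_threads(10)
```

Because every iteration writes only to `out[i]`, this loop needs no threadid()-indexed state at all.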