This issue was already discussed in the thread Overhead of Threads.@threads you linked above. I started that thread and ended up implementing my own macro, @threaded, which checks whether only a single thread is used. If so, it just executes the code without multithreading. Otherwise, it uses Threads.@threads; see Pull Request #426 in trixi-framework/Trixi.jl on GitHub ("introduce at-threaded macro wrapping Base at-threads" by ranocha). It’s not perfect, but it removed some overhead for us. In particular, it removed all allocations in our use cases when using only a single thread, which is really helpful for debugging performance problems. We should probably switch to something like ThreadingUtilities.jl or CheapThreads.jl, but these packages do not have a public API yet (according to the dev docs of both).
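To make the idea concrete, here is a minimal sketch of such a macro. It follows the description above (run the loop serially with one thread, otherwise hand it to Threads.@threads), but it is not necessarily identical to the code in the pull request. The helper fill_squares! is a hypothetical example added only to show usage.

```julia
# Sketch of a macro that avoids the overhead of Threads.@threads
# when Julia was started with a single thread.
macro threaded(expr)
    # esc(...) so that the loop body refers to variables in the caller's scope
    return esc(quote
        if Threads.nthreads() == 1
            # Single thread: run the plain loop, no task spawning
            $(expr)
        else
            # Multiple threads: fall back to the standard macro
            Threads.@threads $(expr)
        end
    end)
end

# Hypothetical usage example: each iteration writes to its own element,
# so there is no data race.
function fill_squares!(out)
    @threaded for i in eachindex(out)
        out[i] = i^2
    end
    return out
end
```

The check is done at runtime, so the same compiled code works regardless of how many threads Julia was started with.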
Sidenote: Your example code does not work correctly with multiple threads, since all threads try to update the same variable s at the same time (a data race).
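One way to avoid this kind of race, assuming the example accumulates a sum into a shared variable s inside a threaded loop, is to let each chunk of the loop write its own partial result and combine the results afterwards. The function name threaded_sum and the chunking scheme below are illustrative choices, not taken from your code.

```julia
# Sketch: race-free threaded sum using one partial result per chunk.
function threaded_sum(x::Vector{Float64})
    nchunks = Threads.nthreads()
    partial = zeros(Float64, nchunks)
    Threads.@threads for c in 1:nchunks
        # Contiguous index range handled by chunk c (may be empty)
        lo = div((c - 1) * length(x), nchunks) + 1
        hi = div(c * length(x), nchunks)
        acc = 0.0  # local accumulator, not shared between iterations
        for i in lo:hi
            acc += x[i]
        end
        partial[c] = acc  # each chunk writes to its own slot
    end
    return sum(partial)
end
```

Indexing the partial results by chunk number rather than by thread ID keeps the code correct even if tasks are moved between threads. An atomic accumulator (Threads.Atomic) would also work but usually costs more per update.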