Inconsistent CPU utilisation in @threads loops

Why do you make ms a TaskLocalValue and then perform a (strange) copy in the “parallel loop”? In your OP, M was just a single input matrix?

(I think you might be confusing input data with temporary task-local buffers.)

Not really.

If possible, think in terms of tasks rather than threads, since Julia implements task-based multithreading. I recommend you take a look at JuliaUCL24/notebooks/Day3/2_multithreading.ipynb at main · carstenbauer/JuliaUCL24 · GitHub. Instead of accessing a pre-allocated element per thread, you should likely be accessing a pre-allocated element per task, and that’s what TaskLocalValue gives you.
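To make the "one buffer per task" idea concrete, here is a minimal sketch that uses only Base (no packages). The function name column_sumsq, the ntasks keyword, and the workload itself are made up for illustration and are not from your code. Each spawned task allocates its own scratch buffer, so nothing is ever indexed by threadid(). As far as I understand it, TaskLocalValue packages this same pattern for you so that you don't have to chunk by hand.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical workload: sum of squares of every column of M,
# computed through a scratch buffer that belongs to exactly one task.
function column_sumsq(M::Matrix{Float64}; ntasks::Int = nthreads())
    ncols = size(M, 2)
    out = zeros(ncols)
    # Split the column indices into at most `ntasks` contiguous chunks.
    chunksize = max(1, cld(ncols, ntasks))
    chunks = Iterators.partition(1:ncols, chunksize)
    tasks = map(chunks) do cols
        @spawn begin
            buf = similar(M, size(M, 1))  # task-local buffer, allocated once per task
            for j in cols
                buf .= @view M[:, j]      # the input M is only read, never copied whole
                buf .^= 2
                out[j] = sum(buf)         # each task writes to its own columns only
            end
        end
    end
    foreach(wait, tasks)
    return out
end
```

Note that the input matrix is shared read-only between all tasks; only the temporary buffer is per task. Calling column_sumsq(rand(1000, 64)) spawns at most nthreads() tasks regardless of which threads they end up running on.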

But to answer your question: the following creates nthreads() tasks, each running on a different thread and accessing an element of a pre-allocated buffer, without using threadid().

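using Base.Threads: nthreads
using OhMyThreads: tforeach, StaticScheduler

data = rand(nthreads())  # one pre-allocated element per task
tforeach(1:nthreads(); scheduler=StaticScheduler()) do tid
    println(data[tid])   # tid is the loop index, not threadid()
end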

I don’t think that’s what you want to do, though.
