Why do you make ms a TaskLocalValue and then perform a (strange) copy in the “parallel loop”? In your OP, M was just a single input matrix?
(I think you might be confusing input data with temporary task-local buffers.)
Not really.
If possible, you shouldn’t think about threads but about tasks, since Julia implements task-based multithreading. I recommend you take a look at notebooks/Day3/2_multithreading.ipynb in the carstenbauer/JuliaUCL24 repository on GitHub. Instead of accessing a pre-allocated element per thread, you should likely be accessing a pre-allocated element per task, and that’s what TaskLocalValue gives you.
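To make the distinction concrete, here is a minimal sketch of that pattern. It assumes OhMyThreads is installed; the 4×4 size, the names M, buf and results, and the toy computation are all made up for illustration and are not taken from your code. The input matrix is shared and only read, while the scratch buffer is task-local.

```julia
using OhMyThreads: TaskLocalValue, tforeach

M = rand(4, 4)        # shared input: only read inside the loop, so no copy is needed
results = zeros(8)    # one output slot per iteration (illustrative)

# Task-local scratch buffer: the closure runs once per task, on first access.
buf = TaskLocalValue{Matrix{Float64}}(() -> Matrix{Float64}(undef, 4, 4))

tforeach(1:8) do i
    C = buf[]             # this task's private buffer, reused across its iterations
    C .= M .* i           # in-place work on the buffer (toy computation)
    results[i] = sum(C)   # each iteration writes to its own index, so no data race
end
```

Note that M itself is never wrapped in a TaskLocalValue; only the temporary that gets overwritten in each iteration is.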
But to answer your question, the following creates nthreads() tasks, each running on a different thread and accessing an element of a pre-allocated buffer without using threadid().
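using Base.Threads: nthreads
using OhMyThreads: tforeach, StaticScheduler

data = rand(nthreads())  # pre-allocated buffer with one element per task
# With StaticScheduler, the nthreads() iterations are split statically across the threads
tforeach(1:nthreads(); scheduler=StaticScheduler()) do tid
    println(data[tid])
end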
I don’t think that’s what you want to do, though.