I have a loop that does some linear algebra. Each iteration is independent of each other. I need some helper arrays in each iteration. How can I make each thread allocate its helper arrays once before the loop?
In the example below, I would like to make each thread allocated a single time its own copy of a and b. How can that be done?
function foo(n)
out = Array{Float64}(undef, n)
Threads.@threads for i = 1:n
a = Array{Float64}(undef, n, n)
b = Array{Float64}(undef, n)
for j = 1:n
for k = 1:n
a[k,j] = j^k
end
b[j] = i * j
end
out[i] = (a\b)[1]
end
return out
end
For the moment you may be able to do something like
function foo(n)
out = Array{Float64}(undef, n)
abufs = [Array{Float64}(undef, n, n) for _ in 1:Threads.nthreads()]
bbufs = [Array{Float64}(undef, n) for _ in 1:Threads.nthreads()]
Threads.@threads for i = 1:n
a = abufs[Threads.threadid()]
b = bbufs[Threads.threadid()]
although this depends on the internal detail on how tasks are scheduled (so it may not be safe in the future and there are usecases it’s already unsafe). See also: Task affinity to threads
FYI, note also that BLAS and Julia threading do not play very well ATM: