I need to compute many convolutions of arrays of the same length, and I want to use multiple threads for that. So I want each thread to compute a convolution using FFTs.
As I understand it right now, creating an FFT plan is not thread-safe, but executing it is.
(https://github.com/JuliaLang/julia/issues/17972)
Therefore it should work if a single FFT plan is created at the beginning and shared by all threads.
When I do it this way, my kernel dies. Using a separate plan for each thread does not work either.
I would be glad for some advice.
This is an example that reproduces the issue:
using FFTW
FFTW.set_num_threads(1)
plan=Dict()
n=Threads.nthreads()
N=10
L=2^12
u=rand(L,N)
v=rand(L,N)
m=zeros(L,N)
# FFT plan for each thread
for i=1:n
    plan[i]=plan_rfft([u[:,1]; zero(u[:,1])])
end
function convolution( u::Array{Float64}, v::Array{Float64},p::FFTW.rFFTWPlan{Float64,-1,false,1})
    upad = [u; zeros(L)]
    vpad = [v; zeros(L)]
    return irfft((p * upad).*(p * vpad), Int(2*L))[Int(L/2):Int((3/2)*L-1)]
end
Threads.@threads for i=1:N
    t=Threads.threadid()
    m[:,i]=convolution(u[:,i],v[:,i],plan[t])
end
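One thing I notice in the example: irfft builds its own plan on every call, so plan creation still happens inside the threaded loop even though the forward plan is precomputed. Below is a minimal sketch of a variant in which both the forward and the inverse plan are created before the loop, so the threads only execute plans. The names pf and pb and the untyped function signature are my own choices for illustration, and I am assuming that sharing one plan between threads is safe as long as it is only executed. I cannot say for certain that this is what kills the kernel.

```julia
using FFTW

FFTW.set_num_threads(1)

N = 10
L = 2^12
u = rand(L, N)
v = rand(L, N)
m = zeros(L, N)

# Create both plans once, before any threaded code runs.
# pf: forward real FFT for a zero-padded input of length 2L.
# pb: inverse real FFT; an rfft of length 2L yields L + 1 complex coefficients.
pf = plan_rfft(zeros(2L))
pb = plan_irfft(zeros(ComplexF64, L + 1), 2L)

# Convolve u and v via FFT and return the central L samples,
# using the same slice as the original example.
function convolution(u, v, pf, pb)
    L = length(u)
    upad = [u; zeros(L)]
    vpad = [v; zeros(L)]
    full = pb * ((pf * upad) .* (pf * vpad))  # only plan execution, no planning
    return full[L÷2:(3L)÷2-1]
end

Threads.@threads for i in 1:N
    m[:, i] = convolution(u[:, i], v[:, i], pf, pb)
end
```

Each iteration allocates its own padded inputs and writes to its own column of m, so the threads share nothing except the two plans.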