I am trying to write code with a for loop in which an array is broken into chunks, the chunks are sent to multiple processes, and an FFT is performed on each chunk. Without a `GC.gc()` call at the end of each iteration, memory usage keeps increasing and eventually crashes the computer. With garbage collection, on the other hand, the code slows down by a factor of more than 10, which is prohibitively expensive for the project I am working on. To reproduce the issue, I wrote the following minimal example:
```julia
using Distributed
addprocs(2)

x_in = rand(100, 100)

@everywhere workers() begin
    using FFTW
    const fft_plan = plan_fft!(ComplexF64.($x_in))
    function update_y(y_in)
        for i in 1:100
            y_in .= fft_plan * y_in
        end
        return y_in
    end
end

function test(x)
    for t in 1:1000
        @show t
        @everywhere workers() begin
            y = ComplexF64.(copy($x))
            y .= update_y(y)
            # GC.gc()
        end
    end
    return nothing
end

@time test(x_in)
```
On my laptop (Julia 1.10.4, Ubuntu 22, i5 processor), this code takes about 8 seconds to run. If I uncomment the `GC.gc()` line, it takes about 132 seconds.
I have been trying to find a way to allocate the memory for the FFT operation once and reuse it, but I do not understand how to do that. Even with the in-place FFT plan, memory usage keeps blowing up.
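For reference, this is roughly the pattern I have in mind (a single-process sketch; the names `y_buf` and `run!` are just illustrative): allocate the complex buffer once, build the in-place plan against it, and then apply the plan inside the loop without creating any new arrays.

```julia
using FFTW

# Allocate the working buffer once; reuse it for every iteration.
const y_buf = Matrix{ComplexF64}(undef, 100, 100)
# In-place plan built for buffers of the same size and element type as y_buf.
const fft_plan = plan_fft!(similar(y_buf))

function run!(buf, niter)
    for _ in 1:niter
        fft_plan * buf   # applying an in-place plan mutates buf directly
    end
    return buf
end

y_buf .= rand(ComplexF64, 100, 100)
run!(y_buf, 100)
```

I assume the distributed version would do the same thing per worker (a `const` buffer defined inside `@everywhere workers() begin ... end`), so each iteration of the outer loop reuses the worker-local buffer instead of constructing a fresh `ComplexF64.(copy($x))`.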
I would really appreciate any help. Thank you so much.