Comparison of different fft-implementations

ellocco · August 19, 2022, 4:29pm

Hi,
I am struggling with fft-analysis of real-world data.
Main objective are correct results.
Currently I use two different fft-implementations:

FFTW.fft()
SciPy.fft.fft()

The reason for this is that FFTW.fft() seems to be buggy.
One example is a data vector with exact 47499754 elements,
on my machine FFTW.fft() can not prosses this amount of data
points correctly, the amplitude is zero, while SciPy.fft.fft() does not fail
in this particular case.
Now I am investigating SciPy.fft.fft() and at the moment I have doubts,
if this implementation works correct.
Therefore, I’m interested to know, if there is a more accurate alternative to these two?

stevengj · August 19, 2022, 5:12pm

Works for me:

julia> x = rand(ComplexF64, 47499754);

julia> fft(x)
47499754-element Vector{ComplexF64}:
 2.3752510813742287e7 + 2.3752082562739518e7im
   -1857.327613402386 + 2656.595755623108im
   1.2335114129446083 - 1764.624248895477im
  -1396.5209606987896 + 1594.6473092040806im
   -994.9367333881451 - 693.3802670885018im
  -2928.6763801313805 - 1549.0585642899605im
    4545.894246165899 - 1879.5461946803975im
   1263.5570298623447 + 82.52881908200902im
  -1794.6733211109129 + 3696.6978993992598im
   -2795.508384720929 + 2427.6278018923563im
                      ⋮

If you use plan_fft with FFTW.MEASURE or FFTW.PATIENT, realize that it overwrites the data with zeros — this is a FAQ:

You should initialize your input array after creating the plan, unless you use FFTW_ESTIMATE : planning with FFTW_MEASURE or FFTW_PATIENT overwrites the input/output arrays, as described in the manual.

(Realize also that 47499754 is 2 \times 23749877 where 23749877 is prime. FFTW and recent versions of SciPy support large prime factors with O(n \log n) algorithms, but sizes with small prime factors will usually be at least 10\times more efficient.)

ellocco · August 19, 2022, 5:44pm

Thanks Steven,
why do you test with a ComplexF64 vector format?
Does it make sense to transfere Float64 number format to ComplexF64 before I call fft().
Here is the complete code that crashes on my machine in both Julia versions LTS and v1.8.0.

using PlotlyJS, Printf
using SciPy, FFTW

sampling_rate = 250000
window_length = 47499754
frequ1        = 0.1
ampl1         = 2.1
phase1        = 0.0

# --- time and data vector:
vec_t    = collect(range(0, step = 1.0 / sampling_rate, length = window_length)) 
td_data_ = ampl1 .* sin.( (2 * pi * frequ1 .* vec_t) .+ deg2rad(phase1))

# --- relevant range in fft-result (meaningful bins):
bins_range = range(1, step = 1, stop = ceil(Int, window_length / 2))

# --- Frequncy Vector --------------------------------------------------------------------------------------
t_win_duration = window_length / sampling_rate
frequ_vec = 1 / t_win_duration .* (0 : (ceil(Int, window_length / 2) - 1))
frequ_vec = frequ_vec[2:end]

# --- FFTW -------------------------------------------------------------------------------------------------
amplitudes_FFTW = abs.(FFTW.fft(td_data_))[bins_range]
amplitudes_FFTW = amplitudes_FFTW .* (2/window_length)
amplitudes_FFTW = amplitudes_FFTW[2:end]

# --- SciPy -------------------------------------------------------------------------------------------------
amplitudes_SciPy = abs.(SciPy.fft.fft(td_data_))[bins_range]
amplitudes_SciPy = amplitudes_SciPy .* (2/window_length)
amplitudes_SciPy = amplitudes_SciPy[2:end]

# --- Plot --------------------------------------------------------------------------------------------------
plt_range = range(1, length = 40)
hdl1 = PlotlyJS.scatter(; x = frequ_vec[plt_range], y = amplitudes_FFTW[plt_range],  name = "FFTW")
hdl2 = PlotlyJS.scatter(; x = frequ_vec[plt_range], y = amplitudes_SciPy[plt_range], name = "SciPy")
hdl_plt = PlotlyJS.Plot([hdl1, hdl2])

Jake · August 19, 2022, 6:10pm

I tried your code for FFTW and all worked fine. I have a 16 Gigabyte Win 11 machine. FFTW is probably one of the most robust FFT algorithms going.

I did not try the SciPy version and used Plots to plot.

I expect the FFTW website may answer your questions as potentially a search on discourse.

ellocco · August 19, 2022, 6:32pm

That is really strange, if I remove or add one element, the result is fine.
My OS is MS WIN 10, at home I will try under Linux.

stevengj · August 19, 2022, 7:25pm

Because fft natively is an operation on complex vectors. If you pass it a real array it will first make a complex copy of the input anyway.

If your input is purely real, you can save a factor of 2 by using rfft instead, which takes a real vector as input and only computes ≈half of the DFT outputs. (The other half of the output is just the complex conjugate for real inputs.)

Jake · August 19, 2022, 8:16pm

Results from my try>
fftw

ellocco · August 19, 2022, 9:28pm

In the end I was able to track down the issue. If I use the Intel’s Math Kernel Library (MKL) by giving the command: FFTW.set_provider!("mkl"), I observe the issue that appears for 47499754 points.
If I switch back by means of the command: FFTW.set_provider!("fftw")
the missing amplitude signal appears. Issue is solved
Another observation is:
If I use rfft() instead of fft() the fft-algorithm can handle 47499754 points, even if I turn on the MKL

The code snippet for rfft() is:

# --- RFFTW -------------------------------------------------------------------------------------------------
amplitudes_rfft = abs.(FFTW.rfft(td_data_))
amplitudes_rfft = amplitudes_rfft[1:end-1] .* (2/window_length)
amplitudes_rfft = amplitudes_rfft[2:end]

ellocco · August 20, 2022, 9:43am

I have investigated the topic a little bit closer, here is my sample code to understand better the situation:

using PlotlyJS, Printf
using FFTW
# remark:  you can switch between libraries by means of the command: FFTW.set_provider!("XYZ")
# details: https://github.com/JuliaMath/FFTW.jl 

num_periods   = 1  # a) if zero take special case of "47499754" points
                   # b) if -1 reduce number of points per one
                   # c) all other (fft-winow length holds num_periods)
b_plot_signal = false
sampling_rate = 100000
frequ1        = 0.1
ampl1         = 2.1
phase1        = 0.0 # in degree
ppp_frequ1    = trunc(Int, sampling_rate / frequ1)

if num_periods == 0
    window_length = 47499754 # special case where fft() might fail
    b_plot_signal = false
    if cmp(FFTW.get_provider(), "mkl") == 0
        @info("Provider \"mlk\" is enabled, spectrum might be disturbed!")
    else
        @info("Provider \"fftw\" is enabled, special case will not be seen!")
    end
elseif num_periods == -1
    window_length = 47499754 - 1 # one element less of special case where fft() might fail
    b_plot_signal = false
else
    window_length = trunc(Int, num_periods * ppp_frequ1)
end

# --- time and data vector:
vec_t    = collect(range(0, step = 1.0 / sampling_rate, length = window_length)) 
td_data_ = ampl1 .* cos.( (2 * pi * frequ1 .* vec_t) .+ deg2rad(phase1))

# --- relevant range in fft-result (meaningful bins):
bins_range = range(1, step = 1, stop = ceil(Int, window_length / 2))

# --- Frequncy Vector --------------------------------------------------------------------------------------
t_win_duration = window_length / sampling_rate
frequ_vec = 1 / t_win_duration .* (0 : (ceil(Int, window_length / 2) - 1))
frequ_vec = frequ_vec[2:end]

# --- FFTW -------------------------------------------------------------------------------------------------
cfft_complex    = FFTW.fft(td_data_)
cfft_complex    = cfft_complex[bins_range]
cfft_complex    = cfft_complex .* (2/window_length)
cfft_complex    = cfft_complex[2:end]
amplitudes_cfft = abs.(cfft_complex)
angles_cfft     = angle.(cfft_complex)

# --- RFFTW -------------------------------------------------------------------------------------------------
rfft_complex    = FFTW.rfft(td_data_)
rfft_complex    = rfft_complex .* (2/window_length)
rfft_complex    = rfft_complex[2:end]
amplitudes_rfft = abs.(rfft_complex)
angles_rfft     = angle.(rfft_complex)
offset_rfft     = abs(rfft_complex[1]) / 2
max_ampl, indx_max_ampl   = findmax(amplitudes_rfft)

# --- check if we have a full number of periods in the b_plot_signal
if window_length / ppp_frequ1 != trunc(window_length / ppp_frequ1)
    @info(@sprintf("Signal does not consist of full number of periods, increased risk of spectral leakage!"))
end

# --- some information:
println(@sprintf("FFTW Version:    %s, \t library provider: %s", FFTW.version, FFTW.get_provider()))
println(@sprintf("Frequency range: [%.2fHz .. %.2fHz]", frequ_vec[1], frequ_vec[end]))
println(@sprintf("Angle @%.2fHz:   %6.2fdeg, \t Amplitude: %.3f", frequ_vec[indx_max_ampl], rad2deg(angles_rfft[indx_max_ampl]), max_ampl))

# --- Plot --------------------------------------------------------------------------------------------------
if b_plot_signal
    plt_range = range(1, length = ppp_frequ1)
    hdl1 = PlotlyJS.scatter(; x = vec_t[plt_range], y = td_data_[plt_range],  name = "raw")
    hdl_plt = PlotlyJS.Plot([hdl1])
else
    plt_range = range(ceil(Int, 0.5 * indx_max_ampl), length = max(5, trunc(Int, 1.5 * indx_max_ampl)))
    hdl1 = PlotlyJS.stem(; x = frequ_vec[plt_range],    y = amplitudes_cfft[plt_range],  name = "fft")
    hdl2 = PlotlyJS.scatter(; x = frequ_vec[plt_range], y = amplitudes_rfft[plt_range],  name = "rfft")
    hdl_plt = PlotlyJS.Plot([hdl1, hdl2])
end

ellocco · August 21, 2022, 10:39am

Bonus:
Code snippet to give advice, whether it might be suitable to install the Intel Math Kernel Library (MKL):

using FFTW          # provides command: FFTW.get_provider()
using CpuId         # provides command: CpuId.cpuvendor()
if cmp(String(CpuId.cpuvendor()), "Intel") == 0 && cmp(FFTW.get_provider(), "mkl") != 0
    @info("Your CPU Vendor is Intel, you should install Intel’s Math Kernel Library (MKL)") 
    @info("You might issue the command: FFTW.set_provider!(\"mkl\")")
end

ValZapod · October 7, 2022, 7:05pm

Does this still happen with oneMKL 2022.2 from oneAPI sdk 2022.3? Horrible, BTW.

ellocco · October 7, 2022, 10:00pm

I do not know. I reported the issue in an related Intel help forum:
“Intel-oneAPI-Math-Kernel-Library”

Topic		Replies	Views
FFT in Julia > 0.6.3 General Usage	2	2661	January 15, 2019
Getting different output for fft function in julia and matlab Signal and Image Processing fftw , matlab , matrices	4	2861	June 22, 2021
CUDA fft against CPU fft GPU	4	2649	March 19, 2019
Problems using FFTW- SOLVED General Usage fftw	10	2287	January 29, 2020
Scaled FFT implementation seems to only use one thread despite setting no. of threads to 8 New to Julia fftw , multithreading	2	471	October 27, 2022

Comparison of different fft-implementations

Related topics