Fitting Sinusoidal Signal with LsqFit

Hi everyone, I have problems with fitting a sinusoidal signal with 4 parameters. Everything works just fine, if the Signal Frequency is under 800Hz. Above that Frequency the fitting is very bad (except for 802Hz and 804Hz), but I need to fit Data with a Frequency from 1Hz up to 50kHz (Sampling rate 250kHz for all measurements). The only way I get correct Results is to make the Difference between the Function Parameters (param) and the initial Values (p0) very small.

I am pretty new to Julia and I am wondering if there is a better way to fit sinusoidal Signals, so that the initial Values don’t have to be spot on? Or am I using the Function wrong?

I would be very grateful for any tips, Pkg suggestions or best practice examples. Thanks in advance!

using PlotlyJS
using LsqFit

sampling_rate = 250000
amplitude = 3
frequency = 800
angle = pi/3
offset = 2
# define Time Vector
vec_t = collect(range(start = 0, step = 1 / sampling_rate , stop = 1))
#_amplitude * cos(2 * pi * _frequency * _vec_t + _angle) +_offset;
@. model(t, p) = p[1] * cos((2 * pi * p[2]) * t + p[3]) + p[4]
# Create signal
param = [amplitude, frequency, angle, offset]
signal = model(vec_t, param) .+ 0.03 * randn(length(vec_t))
# Initial values for fit
p0 = [0.999 * amplitude, 1.001 * frequency, 0.999 * angle, 1.001 * offset]
# curve fit
fit = curve_fit(model, vec_t, signal, p0)
fitted_data = model(vec_t, fit.param)
n_points = trunc(Int, sampling_rate / frequency * 3)
Plot(hcat(signal[1 : n_points], fitted_data[1 : n_points]))

You could use the FFT to obtain a good initial guess of the sinusoidal frequency. See one such LsqFit example here.

You could also try making use of the ESPRIT method, see the docstring here for some examples

One thing is a good initial guess, if this is done via FFT, it is useful to investigate window functions, currently I do investigate the Kaiser window. On my data it is useful to play around with the parameter of the function besseli() (package SpecialFunctions.jl). In my case a sharp shape of the bessel function improves the result. Another thing is the window length, it makes sense to crop the window in a way that it contains a full number of periods.
Is there a function available that can do this job, e.g. by searching for the begin and end of a sinusoidal period in the time domain signal (in case we have a monochrome sinusoidal signal)?
A good initial guess if nice, but it would also be nice to have a optimization algorithm that is best suited to solve the sinusoidal-parameter fit.
I agree, probably the best approach is:
Fix the frequency, and based on the frequency, calculate the number of points that represent a fix number of periods, e.g. 10 periods.
On this window that contains a full number of periods perform the 3-param optimization.
What I have observed is:
The optimizer (solver) that is embedded in the LsqFit-package, does not seem to be best suited for this optimization task.
Does someone have a proposal which optimizer might be better suited?

Thanks for your implementation of the ESPRIT-algorithm!!!

I wrote my own example to compare the results of the two implementations of the ESPRIT-algorithm that I found in the Julia world, I hope it helps others to getting started:

using Printf, RobustModels # RobustModels provides "mean()"
using DSP, SingularSpectrumAnalysis # both provide implementations of esprit()

## --- compose signals: clean and disturbed --------------------------------------------------------------------------------
sampling_rate   = 10000                    # Sampling frequency
n_signal        = 3000                     # Length of signal
off_set_sign    = 1.0                      # vertical offset of signal
frequ_A         = 50.0                     # unit = Hz
frequ_B         = 120.0                    # unit = Hz
ampl_A          = 0.7                      # Amplitude @f1
ampl_B          = 1.0                      # Amplitude @f2
noise_ampl      = 0.3                      # max noise amplitude
b_symmetry_on   = true                     # substract arithmetic mean before decomposition

# --- time vector: vec_t
delta_t         = 1.0 / sampling_rate     # Time step in sec
vec_t           = collect(range(0, step = delta_t, length = n_signal))  # Time vector

# create signal with sinus at frequ_A and frequ_B:
signal_clean     = off_set_sign .+ ampl_A .* sin.(2 * pi * frequ_A .* vec_t)  +  ampl_B .* sin.(2 * pi * frequ_B .* vec_t)
signal_disturbed = signal_clean + noise_ampl * randn(size(vec_t))
if b_symmetry_on
    signal_disturbed    = signal_disturbed .- mean(signal_disturbed)
    signal_clean        = signal_clean     .- mean(signal_clean)
end

## --- main ----------------------------------------------------------------------------------------------------------------
# --- check if values for amplitude higth are meaningful and figure out, if one or two amplitudes are larger then 0
if ampl_A > 0       &&  ampl_B == 0
    num_of_frequ = 1
elseif ampl_A == 0  &&  ampl_B > 0
    num_of_frequ = 1
elseif ampl_A > 0   &&  ampl_B > 0 
    num_of_frequ = 2
else
    error("Either amplitude A or B needs to be positive!")
end
# --- calculate frequency via function "esprit()"
# source esprit function A: https://docs.juliadsp.org/stable/estimation/
# source esprit function B: https://github.com/baggepinnen/SingularSpectrumAnalysis.jl/blob/master/src/SingularSpectrumAnalysis.jl#L158
# --- size of correlation matrix: equation is according to van der Veen and Leus: (n_signal + 1)/3.
lag_embedding_dimension = round(Int, (n_signal + 1)/3) # Size of lag embedding and the covariance matrix used (size of correlation matrix).
# remark: the algorithm "ESPRIT()" expect, that the signal is symmetric around zero (mean(signal) = 0)
# source: https://ieeexplore.ieee.org/document/32276
esprit_frequ_methode_A = DSP.Estimation.esprit(signal_clean, lag_embedding_dimension, 2 * num_of_frequ, sampling_rate)
esprit_result = SingularSpectrumAnalysis.esprit(signal_clean, lag_embedding_dimension, num_of_frequ, fs = sampling_rate, robust = false)
esprit_frequ_methode_B = esprit_result.f
# ---
if size(esprit_frequ_methode_B)[1] > num_of_frequ
    error("Signal does not seem to be symetric around zero!")
end
# ---
if num_of_frequ == 2
    println(@sprintf("Frequency A: %.1f Hz, Frequ B: %.1f Hz, esprit result[1]: %.1f Hz, esprit_result[2]: %.1f Hz", 
    frequ_A, frequ_B, esprit_result.f[1], esprit_result.f[2] ))
end

From my example script I made a next step and tried to calculate an initial guess for the frequencies inside my real-world data, unfortunately, it turned out that both implementations of the esprit() algorithm fail on big real-world data with the error message OutOfMemoryError(). You can reproduce this error by increasing the signal-length in my script to e.g. 19999901.

2022-Sep-12:
Please have a look at:
ERROR: OutOfMemoryError() #22