I have a system that I am investigating, and I am trying to determine for which parameter values it exhibits which behaviours. There are several different ones, but a typical one is that it oscillates. Sometimes it instead displays some kind of “stochastic pulsing” behaviour, where random pulses are triggered (for certain parameters it seems to be some kind of excitable system).
Now to my problem: I have an automated algorithm which classifies the behaviour of the system for a given parameter set, but I am having trouble robustly determining whether there’s an oscillation going on (or stochastic pulsing, or maybe just lots of white noise). I figured people here might have some input. How does one determine whether there’s an oscillation in a signal?
Have you had a look at the data in the frequency domain? The data you describe as oscillatory might stand out there. It might also be predictable with a low-order autoregressive model.
Stochastic pulses are perhaps a bit harder to predict, and might have a flatter spectrum similar to that of white noise. Maybe an amplitude histogram could tell them apart: white noise has a symmetric amplitude distribution, whereas the stochastic pulses might have a skewed one.
Initially I had some notion of trying to move into the frequency domain, but things got messy and I was unsure whether I was on the wrong track or simply misused some function.
I would use DSP.periodogram or welch_pgram rather than fft directly; they use fft internally. And you typically want to look at the magnitude of the fft rather than the real part, which is exactly what the periodogram does. Another alternative is to look at the cross-correlation, e.g. StatsBase.crosscor or ControlSystemIdentification.crosscorplot.
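For example, a minimal sketch (the signal here is a made-up noisy sinusoid standing in for your data):

using DSP
y = sin.(2π .* 0.015 .* (1:2000)) .+ 0.3 .* randn(2000) # toy oscillatory signal
pg = periodogram(y)     # magnitude-squared spectrum, not the real part of the fft
f, P = freq(pg), power(pg)
f[argmax(P[2:end]) + 1] # dominant frequency, skipping the DC bin at f = 0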
An amplitude histogram is simply a histogram of the values; I think StatsBase fits a histogram for you.
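A minimal sketch of that idea (the bin count is arbitrary):

using StatsBase
h = fit(Histogram, y, nbins=50) # amplitude histogram of the signal values
skewness(y)                     # ≈ 0 for white noise, clearly nonzero for one-sided pulses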
AR models can be fit with many different packages; I use my package ControlSystemIdentification.jl, but there are many others. I would look at the fraction of variance explained by an AR model. But the periodogram might be easier for you to reason about if you’re new to this.
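For reference, here is a plain least-squares sketch of the fraction-of-variance idea (this is not the ControlSystemIdentification.jl API; the order p is an arbitrary choice):

using Statistics, LinearAlgebra
p = 4                                    # AR order (arbitrary; tune to the data)
N = length(y)
Φ = hcat((y[p+1-k:N-k] for k in 1:p)...) # lagged regressors
a = Φ \ y[p+1:N]                         # AR coefficients: y[t] ≈ Σₖ a[k]*y[t-k]
r = y[p+1:N] .- Φ*a                      # one-step prediction residuals
fve = 1 - var(r)/var(y[p+1:N])           # fraction of variance explained; near 1 ⇒ predictable signal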
Still a bit uncertain about the interpretation. The big spike at 0 I just ignore, right? Then there’s a small spike at around 0.015. Does this correspond to about 0.015 oscillations per time unit? (Multiplying up gives about 30 oscillations over 2000 time units, which is what I have.)
I understand the amplitude histogram, thanks, will look at it and see what information I can extract. I might also give your package a glance for AR models (but secretly wish I won’t have to…).
The periodogram tells you the frequency content of your signal. The peak at 0 corresponds to the mean value and can probably be ignored. If you plot the log magnitude it usually gets easier to interpret. Otherwise your interpretation is correct. You can provide the sampling frequency to the periodogram function to have the frequency axis scaled appropriately.
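E.g. (assuming one sample per time unit, so fs = 1.0):

using DSP, Plots
pg = periodogram(y; fs=1.0) # fs scales the frequency axis
plot(freq(pg)[2:end], 10 .* log10.(power(pg)[2:end])) # log magnitude in dB, DC bin dropped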
Have you solved your problem? I have a signal very similar to yours for which I would like to automatically determine whether it is oscillatory or not. I did run DSP.periodogram and obtained sharp peaks at several frequencies. But how can code recognize such characteristics, i.e. tell sharp peaks apart from the smooth, noisy, or broad peaks of non-periodic (non-oscillatory) signals?
I did some testing with the periodogram. Basically, I found its maximum value, and after some experimentation it turned out that a simple threshold on that maximum was good enough for my applications.
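In code, roughly (the threshold value is arbitrary and was found by experimentation):

using DSP, Statistics
pg = periodogram(y .- mean(y))             # remove the mean so the DC bin doesn't dominate
is_oscillatory = maximum(power(pg)) > 1e-2 # arbitrary threshold; tune by testing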
Usually we know something about the signals we study: what is the generating source bandwidth (in Hz, for example), what are the receiver/transducer characteristics, what was the background noise spectrum recorded when the signal was not present, is the noise also correlated with the source signal? What is the physics? Etc.
Thanks! My problem, however, is a bit different. I have a set of non-linear equations which is solved by iterative methods. It is useful to detect oscillation of the residual error (while it is still larger than the tolerance) so that I can stop the iterative process early. However, the residual error does not oscillate every time. Sometimes the iterative solver converges to the desired tolerance without any oscillation. Another case is that the convergence is so slow that it should trigger stopping the iteration even though the desired tolerance has not been reached. I want to develop an automatic algorithm to detect these cases.
Below is a sample residual error which exhibits oscillation. The desired tolerance is 1e-6. A transition zone is around n=1000. After n=2000, the residual error oscillates regularly every 5 or 6 iterations, as shown in my previous post. The ideal stopping iteration number would be between 2000 and 2500, or even better around 1000.
without knowing anything about your problem domain or solution method …
you may be working too hard. Rather than running until you find oscillations, consider increasing precision (or however you refine the solution method) when the improvement over n iterations (solution-method dependent; perhaps not more than 2 * the number of equations/constraints) is less than e.g. sqrt(eps(1.0)), give or take (again solution-method dependent)
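A minimal sketch of that suggestion (the window n and the tolerance are placeholders to tune):

# stop when the best residual hasn't improved by at least tol over the last n iterations
function stalled(err::AbstractVector, n::Integer; tol=sqrt(eps(1.0)))
    length(err) ≤ n && return false
    return minimum(err[1:end-n]) - minimum(err[end-n+1:end]) < tol
end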
I know oscillation detection is quite problem-dependent. But I do feel that in my case it is well defined. The oscillation, if it happens, will have a mean value around 1e-3. My desired error is typically 1e-5. There are four cases: (1) converges rapidly to below 1e-5; (2) converges rapidly to some error higher than 1e-5, then much more slowly (but without any oscillation); (3) converges rapidly to some error higher than 1e-5, then oscillates; (4) diverges (residual error > 1 and increasing).
Why not change parameters to enhance convergence? Because I have to run a bunch of these solution processes consecutively, and it is not clear how to change the parameters during the consecutive process (online).
It is quite a long story to explain all the stuff here, but I will give a brief summary of what I am doing heuristically. I have density distributions to be solved from a set of non-linear equations
where i \neq j for i, j = 1, 2, \dots, n. Each \phi_j can be computed from w by solving a Fokker-Planck equation. The cell (\mathbf{r}) is fixed for each solution. But I have to use an optimization method (currently Optim.Brent for 1D problems and Optim.NelderMead for 2D and 3D problems) to optimize the cell size (to release the internal tension of the density distribution). Therefore, I have to find a very robust algorithm for one solution of my nonlinear problem. This way, I do not have to intervene in the optimization process.
The difficulty here is that when the residual error oscillates, it is very hard to set a simple criterion to determine whether the error has improved over ~200 consecutive iterations. For example, suppose the upper bound of an oscillation is 0.008 and the lower bound is 0.002, and the two comparison points land on different phases of the oscillation: at iteration 1 the error is 0.008, and 200 iterations later it is 0.002. This looks like genuine improvement and therefore does not trigger stopping, even though the error is merely oscillating!
Therefore, it is critical to find a way to automatically detect the oscillation.
@liuyxpp, would something simple like the following work for your problem?
NB: your question would fit best in the Numerics domain
using Random, Distributions, Statistics
# input data (proxy to residual error above):
n = 5000
x = 1:n
e = rand(Normal(5e-3,2e-3),n)
e[e.<1e-3] .= 1e-3
y = exp.(-x*5e-3) .+ e
# compute relative variability in the residual error:
nw = 17 # adjust window half-length to problem
xx = nw+1:n-nw-1
dyy = zeros(size(xx))
for i in xx
    dyy[i-nw] = std(y[i-nw:i+nw]) / mean(y[i-nw:i+nw])
end
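The vector dyy then holds the local coefficient of variation; a sustained plateau of high values suggests oscillation. One could threshold it, e.g. (the value 0.2 is a placeholder):

osc = dyy .> 0.2 # placeholder threshold; tune to the problem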
Thanks! It looks like a good idea. I have plotted my variation, shown below. It seems that for a good oscillation signal the variation reaches a plateau. However, setting a threshold does not seem to be a valid approach for the residual. My raw residual error data is here: jl_k2y4bd.txt - Google Drive
@liuyxpp, better results could have been obtained by using a smaller nw.
For whatever it is worth, one can further improve this by locally detrending the residual error with low-order polynomials before computing the local standard deviation relative to the local mean (called here “relative variability”).
There should be a more efficient way of doing this using splines. But before investing more time, it would be best to know how numerical analysts define “oscillation” and what criteria they use to stop.
# input data from @liuyxpp
using DelimitedFiles
y = readdlm("residual_error_non-linear_solver_oscillations.txt", Float64)[:,2]
n = 3000
y = y[1:n]
x = 1:n
using Statistics, Polynomials
# compute relative variability in the residual error:
nw = 10 # adjust window half-length to problem
xx = nw+1:n-nw-1
dyy = zeros(size(xx))
for i in xx
    xi = i-nw:i+nw
    yi = y[xi]
    mi = mean(yi)
    p = Polynomials.fit(xi, yi, 3)
    yi = yi .- p.(xi) .+ mi # detrend the data (cubic), keep local mean
    dyy[i-nw] = std(yi) / mi
end
@liuyxpp, another idea, even simpler, is to count the number of local oscillations around the local mean:
using DelimitedFiles
# input data from @liuyxpp google drive above
y = readdlm("residual_error_non-linear_solver_oscillations.txt", Float64)[:,2]
n = 3000
y = y[1:n]
x = 1:n
# compute number of different times that data is above or below local mean
nw = 100 # adjust window half-length to problem
xx = nw+1:n-nw-1
dyy = zeros(size(xx))
for i in xx
    xi = i-nw:i+nw
    yi = y[xi]
    mi = mean(yi)
    yi = sign.(yi .- mi) # +1/-1 pattern of the data around the local mean
    yi = [yi[1]; yi[2:end][yi[2:end] .!= yi[1:end-1]]] # remove consecutive repetitions, by @niczky12
    dyy[i-nw] = length(yi)
end
Thank you very much! These two days I have been reading references on automatic oscillation detection algorithms, mainly in the field of control loops. I have tried several approaches, but it seems they all have some shortcomings. For example, this paper is particularly interesting: Automatic oscillations detection and quantification in process control loops using linear predictive coding - ScienceDirect. It uses linear predictive coding to identify the oscillation. When any root of the prediction polynomial A(z) is close to the unit circle within a certain threshold, the signal is declared oscillatory. However, this approach falsely identifies any smoothly decaying signal (such as x^{-1/2}) as oscillatory.
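One possible guard against that false positive, sketched with a plain least-squares AR fit (the order and thresholds are arbitrary choices of mine): require the near-unit-circle root to be genuinely complex, since a smooth decay such as x^{-1/2} produces a real root near 1.

using Polynomials, LinearAlgebra
p = 6                                         # AR/LPC order (arbitrary)
N = length(y)
Φ = hcat((y[p+1-k:N-k] for k in 1:p)...)      # lagged regressors
a = Φ \ y[p+1:N]                              # AR coefficients: y[t] ≈ Σₖ a[k]*y[t-k]
poles = roots(Polynomial([-reverse(a); 1.0])) # roots of z^p - a_1 z^(p-1) - ... - a_p
oscillatory = any((abs.(poles) .> 0.95) .& (abs.(imag.(poles)) .> 0.05)) # near unit circle AND complex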
I think your second approach, counting the number of local crossings of the local mean, is closely related to the ACF approach. This count reflects the period of the oscillation, which also shows up as the spacing between consecutive zero crossings of the ACF (half a period apart).
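Under that interpretation, the crossing count can also be turned into a rough local period estimate (my reading: each full period contributes two mean crossings):

L = 2nw + 1            # window length used above
period_est = 2L ./ dyy # crossings per window ≈ 2 × periods per window, so T ≈ 2L / count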