Best way to take derivatives of unevenly spaced data (with interpolations? discrete derivatives?)

@dlfivefifty, you recommend computing the norm (or root mean square errors) for f, df/dt and d2f/dt2? It might well happen that some methods work best for some type of functions and not for others. How do mathematicians bring order to such a wide range of possibilities?

Yea. This would be a Sobolev norm.

Generally speaking if you know the regularity (ie differentiability) of the function there’s precise results on the convergence rates in Sobolev norms of various methods. Apart from perhaps AAA. Though uneven grids add an interesting wrinkle.

The mathematical areas studying this is “Approximation Theory” and there’s J. Approximation Theory and Constructive Approximation which publish papers on the topic. Mike Powell also has a nice though old book.

Edit: and of course Trefethen’s Approximation Theory and Practice for Chebyshev based methods

7 Likes

Just to add here that there is a mistake in my example above in that I used a function defined on [0,1] to fit to a ApproxFun Fun defined on [-1, 1] (oops). I fixed that mistake and also added a comparison the the “exact” integral (starting with a Fun that had adaptively decided how many points to sample). That is now in a IJulia notebook in the ApproxFunExamples repo. You can best see it using the nbviewer here Jupyter Notebook Viewer

3 Likes

When fitting curves piecewise with splines of degree d, expect the fit at the initial d xs and the final d xs to be unsatisfactory. The closer to the first x or the final x, the less likely it is that the y will be fit robustly. Additionally, larger d asks to use higher order polynomials, which may not be what you want. Usually, d is one of (2,3,4,5).

If there is a reasonable way to extrapolate a few values before the first of and after the last of the available data, adjoining them often improves the behavior of the fit for the first and final data points. Ideally, you would prepend d+1 and postpend d+1 extrapolands, fit the piecewise spline, and avoid using all [most] of the artificial information.

To extend the data by the least difficult approach, copy the first and the last points. It is better to use a three-point or four-point extrapolation, extending the x proportionately to its nearest relative distances.

3 Likes

So, I realized that I was anchoring the initial and final limits of my integration (of my key function RCurv) to the randomly selected time points, which is why each time I ran the program the range of numerical integration was slightly different, so the result was slightly different.

If I keep the limits of integration fixed to be the same each time, then I always get a similar final number, and it is very close to the right theoretical number (for the example function I’m using here). The calculation (adapted from jipolanco’s example) works well as long as the Spline Order is at least 4.

As suggested by JeffreySarnoff, I suppose it’s safe to go up to d=5 to be sure, and maybe cut off the regions of the interpolation containing my first and last 6 data points, as being “unreliable” parts of the spline fits.

I think I have a working solution here guys, thanks for your suggestions everyone!

2 Likes

The function RCurv(t) has a division by 0 around t0 = 1.20608 and it complicates matters to integrate it starting at t = 1.
In the revised code above, the analyses started at t = 1.21. Used splines of degree 4 (thanks @jipolanco for the correction) for both BSplineKit and Dierckx, and ApproxFun.jl for the integration of RCurv(t) as it seems to produce best results near t0.

1 Like

That looks great!

Just a note for fairer comparison between spline packages. There are two different conventions for the definition of the spline order. The one used in BSplineKit is the same as in classical textbooks (e.g. de Boor) and is also the one used in Wikipedia. Namely, splines of order k have a polynomial degree k - 1. In particular, cubic splines have order 4.

Dierckx seems to use the other (also reasonable) convention, in which the spline order is the same as the polynomial degree.

In short, order n in Dierckx corresponds to order n + 1 in BSplineKit.

1 Like

While I have you guys here, once I have a spline representation of a fit to a data set, what’s the easiest way to find the locations & values of any maxima & minima of the function?

Example partial code:

Sfn = spline(BSplineKit.interpolate(timeDatArr, FunctionDatArr, BSplineOrder(5)))
Sd1 = diff(Sfn, Derivative(1)
Sd2 = diff(Sd1, Derivative(1))

How to find locations of zeros of Sd1, and the values of Sfn there?

Thanks!

You could use Roots.jl to find zeros of the first derivative of the spline.

Here is a quick example using Newton’s method:

using BSplineKit
using Roots

f(x) = cos(4π * x) * exp(-4x)

xs = 0:0.05:1
S = spline(interpolate(xs, f.(xs), BSplineOrder(6)))

S′ = diff(S, Derivative(1))
S″ = diff(S, Derivative(2))

x₀ = 0.5  # initial guess for Newton method
x = find_zero((S′, S″), x₀, Roots.Newton())  # 0.47547599925831957
S(x)   # 0.1422509042378483
S′(x)  # -1.5265566588595902e-16
2 Likes

IntervalRootFinding.jl would be ideal to output all the extrema:

using BSplineKit, IntervalArithmetic, IntervalRootFinding, Plots

f(x) = cos(4*pi*x) * exp(-x)
f′(x) = -exp(-x) * (4*pi*sin(4*pi*x) + cos(4*pi*x))

a, b = -1, 2.5
x = a : 0.02 : b
S = spline(interpolate(x, f.(x), BSplineOrder(6)))
S′ = diff(S, BSplineKit.Derivative(1))

plot(x, f.(x), label="f(x)", lw=2)
plot!(x, S.(x), label="S(x)", ls=:dash)
rint = roots(f′, a..b)      # intervals with roots solutions
rts = mid.(interval.(rint)) # roots taken at each interval midpoint
scatter!(rts, f.(rts), label = "IntervalRootFinding for f′(x)", mc=:red,ms=5)
scatter!(rts, S.(rts), label="BSplineKit S(x) where f′(x)=0", mc=:cyan, ms=3)

IntervalRootFinding

However, it does not seem straightforward to have it consuming the BSplineKit spline functions:

S1d(x) = S′(x)
rint = roots(S1d, a..b)   # intervals with roots
ERROR: TypeError: non-boolean (Interval{Float64}) used in boolean context
2 Likes

Yes, unfortunately interval methods cannot handle functions containing ifs.
You would have to extract and pass each separate piece of the spline to it.

If you use cubic splines then the extrema are given by quadratic equations, so it should be easy (?) to solve (again, on each segment separately).

ApproxFun can also easily do these extrema calculations.

3 Likes

The problem of finding all extrema in the analyses interval can be solved with Dierckx.jl for cubic splines.

using Dierckx, Plots

f(x) = cos(4*pi*x) * exp(-x)
f′(x) = -exp(-x) * (4*pi*sin(4*pi*x) + cos(4*pi*x))

a, b = -1, 2.5
x = a : 0.02 : b
spl = Spline1D(x, f.(x), k=4)  # spline of degree 4
df1 = derivative(spl, x)  # 1st derivative
spl1 = Spline1D(x, df1, k=3)   # derivative of deg=3 of spline of deg=4
rts = roots(spl1, maxn=20)

plot(x, f.(x), label="f(x)", lw=3, lc=:cyan)
plot!(x, spl(x), label="Dierckx S(x) of 4th degree", ls=:dash,lc=:blue)
scatter!(rts, f.(rts), label = "f(x) for x in Dierckx S′(x)=0", mc=:red,ms=5)
scatter!(rts, spl(rts), label="Dierckx S(x) where S′(x)=0", mc=:yellow, ms=3)

Dierck_extrema

3 Likes

Hi guys, taking a combination of suggestions from above, I was able to successfully find the extrema (minima/maxima) of my data – which is noisier & rougher than any smooth function – through a repeated combination:

Smoothing the data by taking a rolling average with rollmean (using RollingFunctions); Splining the smoothed data with spline(BSplineKit.interpolate and differentiating the Spline with diff (using BSplineKit); taking another rolling average to smooth the derivative; Splining the smoothed derivative, and finding its zeros (locating the extrema) with find_zeros (using Roots).

A complicated process, but it seems to work; thanks for your suggestions everybody!

According to the Numerical Recipes, when we are fitting data to a parametric model, it is almost always better to use raw data than to use data that have been pre-processed by a smoothing procedure. It would be best to fit a physics derived model to the experimental data, if applicable.

Hi Raphael, I would like to know more about that proscription, why & when it’s applicable.

But the original “data” I have is already very pre-processed and choppy (taken from an irregularly-sampled, step-wise discrete process), so doing anything to it without smoothing first seems like a lost cause.

Thanks for any further info.

You can do a fit without smoothing first — the fit itself is a smoothing procedure.

4 Likes

This post is excellent for ideas to try a number of smoothing spline approaches.

1 Like