Given some spline (e.g. Dierckx.jl, Interpolations.jl), is there a good way to get its arc length?
Right now I’m using this:
using LinearAlgebra, Dierckx
t = range(0, stop = π, length = 100)
x = cos.(t)
y = sin.(t)
spl = ParametricSpline(t, hcat(x, y)')
tl = range(0, stop = π, length = 10000)
sum(norm(derivative(spl, i))*step(tl) for i in tl) # should be π
Arc length of a cubic is an elliptic integral, so there is no nice closed-form expression for it. The best way is to use numerical integration. The function is only smooth in between the knots, so you will want to use numerical integration divided into intervals at each knot.
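For example, a minimal sketch with QuadGK.jl (assuming the ParametricSpline spl from the question), passing the knots as break-points:

using LinearAlgebra, Dierckx, QuadGK
knots = Dierckx.get_knots(spl)  # the spline is smooth between successive knots
len, err = quadgk(t -> norm(derivative(spl, t)), knots...)  # ≈ π for the example above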
For reference, directly summing the distances between successive interpolated points gives a much more accurate estimate of the total arc length here than summing the norms of the interpolated derivatives:
using LinearAlgebra, Dierckx
t = LinRange(0, π, 100)
x, y = cos.(t), sin.(t)
spl = ParametricSpline(t, hcat(x, y)')
# Integral of norm of derivatives:
tl = LinRange(0, π, 10_000)
π1 = sum(norm.(eachcol(derivative(spl, tl))) * step(tl))
# Sum of the distances:
xy = spl(tl)
Δxy = xy[:,2:end] - xy[:,1:end-1]
π2 = sum(norm.(eachcol(Δxy)))
# π1 = 3.141906841...
# π2 = 3.141592637...
# π = 3.1415926536...
However, in terms of execution time, it may not be that much better?
The simple method above seems to get the arc length π2 with ~1e-7 accuracy, much faster (x200?) than using QuadGK.
Could you confirm?
Thanks for the review.
On my PC, method2 is even 60 times faster.
However, if all we need is a relative accuracy of (res - π)/π < 0.0002, then we can run method1 with only 50 points, tl = LinRange(0, π, 50), and be almost twice as fast.
You should really break up the integral at the knots of the spline, otherwise quadgk could waste a lot of time trying to work out the discontinuities (the spline is continuous, but some higher derivative is not).
Even faster would be to just use a fixed-order Gaussian quadrature rule on each sub-interval, since the spline is piecewise polynomial.
using QuadGK

function method2a()
    knots = Dierckx.get_knots(spl)  # the spline is only piecewise-smooth, so split here
    s = 0.0
    for (k1, k2) in zip(knots[1:end-1], knots[2:end])
        # adaptive quadrature of the speed on each smooth sub-interval
        res, _ = quadgk(t -> norm(derivative(spl, t)), k1, k2)
        s += res
    end
    return s - π  # return the error relative to the exact length π
end
Seems worse:
julia> @btime method2a()
2.838 ms (44232 allocations: 2.43 MiB)
-3.4626905787149553e-9
Even faster would be to just use a fixed-order Gaussian quadrature rule on each sub-interval
ooo, I don’t know how to do that yet, but sounds good…
Note that you can pass rtol = 0.0002 to quadgk for this as well.
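For example (same integrand as in the question):

using QuadGK
len, err = quadgk(t -> norm(derivative(spl, t)), 0, π; rtol = 0.0002)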
It is possible to pass all the knots to QuadGK to use in a single call with a pre-allocated buffer, though it’s still probably better to just use a fixed-order rule, which can be evaluated allocation-free.
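A sketch of that single-call variant (QuadGK.alloc_segbuf pre-allocates the buffer for the adaptive segments, which can then be reused across calls):

using LinearAlgebra, Dierckx, QuadGK
knots = Dierckx.get_knots(spl)
segbuf = QuadGK.alloc_segbuf(size = length(knots) - 1)  # reusable segment buffer
len, err = quadgk(t -> norm(derivative(spl, t)), knots...; segbuf = segbuf)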
However, I’m concerned that your test data, formed by just uniformly sampling a semicircle, may be deceiving you, because it is too smooth. If you call quadgk_count to count the integrand evaluations, you’ll see that it is doing the whole integral with just a single 15-point Gaussian quadrature rule.
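For example (quadgk_count returns the integral, the error estimate, and the number of integrand evaluations):

using QuadGK
I, E, neval = quadgk_count(t -> norm(derivative(spl, t)), 0, π)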
If you really have data that smooth, you probably shouldn’t be using splines at all — you should use something like Chebyshev polynomial interpolation — I’m assuming that such smooth data must be synthetically generated (i.e. be the output of a computer program) so that you can choose your x points.
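A hypothetical sketch with FastChebInterp.jl (one such package; it assumes you are free to evaluate the curve at Chebyshev points):

using FastChebInterp, QuadGK
t = chebpoints(50, 0, π)        # Chebyshev points on [0, π]
cx = chebinterp(cos.(t), 0, π)  # Chebyshev interpolant of x(t)
cy = chebinterp(sin.(t), 0, π)  # Chebyshev interpolant of y(t)
# chebgradient returns (value, derivative); `only` unwraps a scalar or a 1-vector
speed(s) = hypot(only(chebgradient(cx, s)[2]), only(chebgradient(cy, s)[2]))
len, err = quadgk(speed, 0, π)  # ≈ π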
Whereas if you are using a spline for something like experimental data with noise, the kinks at the knots have a much more noticeable effect. For example, if you generate your data with x, y = cos.(t), sin.(t) .+ randn.() .* 0.01, then quadgk requires 13245 evaluations (for the default 1e-8 tolerance).
Indeed, the semicircle was just for testing. My real use case is smoothing a track made of noisy coordinates. That said, the resulting smoothed track is a rather smooth spline (but not a semicircle).
using QuadGK

# Fixed-order Gauss–Legendre quadrature of the speed on each inter-knot interval.
function method3(spl, x_gauss, w_gauss)
    knots = Dierckx.get_knots(spl)
    integral = sum(eachindex(knots)[2:end]) do i
        t1, t2 = knots[i-1], knots[i]
        scale = (t2 - t1) / 2
        sum(zip(x_gauss, w_gauss)) do ((xg, wg))
            t = (xg + 1) * scale + t1  # map from (-1, 1) to (t1, t2)
            norm(derivative(spl, t)) * wg
        end * scale
    end
end

method3(spl, gauss(15)...)  # gauss(15): 15-point Gauss–Legendre nodes & weights on (-1, 1)
Since the norm of the derivative is not a polynomial (even though the spline itself is), the fixed-order rule is not exact, so it’s not quite as good as I initially hoped.
I would benchmark on your actual data, since the performance of adaptive quadrature will be sensitive to this noise. (You can measure accuracy compared to a brute-force quadgk call with a low tolerance.)
All splines, by definition, “look” smooth because they are constructed to have some number of continuous derivatives, but high-order quadrature methods implicitly assume even more derivatives. (quadgk defaults to order=7, but you can change this.)
When benchmarking with @btime, I would interpolate $spl (to avoid dynamic dispatch) and $(gauss(15)) (to avoid timing the construction of the Gauss rule). (Ideally method3 should be allocation-free, at least if you pass in the knots?)
Also, you need to control your comparison so that you are computing the integral to the same accuracy in all cases. (e.g. play with the order of the gauss rule, the rtol parameter for quadgk, etcetera). Otherwise you are comparing apples and oranges.
(You should be able to use method4 with e.g. order=7, rtol=1e-13 as your “exact” result to check the accuracy of the other methods.)
It looks like all of these methods are being hurt on performance by the fact that Dierckx.derivative(spl, t) is (i) allocating (it returns a Vector{Float64}, whereas it really should use SVector{2,Float64}) and (ii) is type-unstable (its return type is inferred as Any).
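A partial workaround (a sketch: a small function barrier with a type assertion fixes the inference, though the internal allocation remains):

using LinearAlgebra, Dierckx
# assert the concrete return type so callers (quadgk, sum, ...) can infer Float64
speed(spl, t) = norm(derivative(spl, t))::Float64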