Arc length for a spline

Given some spline (e.g. Dierckx.jl, Interpolations.jl), is there a good way to get its arc length?
Right now I’m using this:

using LinearAlgebra, Dierckx
t = range(0, stop = π, length = 100)
x = cos.(t)
y = sin.(t)
spl = ParametricSpline(t, hcat(x, y)')
tl = range(0, stop = π, length = 10000)
sum(norm(derivative(spl, i))*step(tl) for i in tl) # should be π

Just pinging this because I have a feeling that posting things late CET gets swallowed by newer, more timely posts…

Arc length of a cubic is an elliptic integral, so there is no nice closed-form expression for it. The best way is to use numerical integration. The function is only smooth in between the knots, so you will want to use numerical integration divided into intervals at each knot.

3 Likes

Wow, ok. I naively thought this was “super easy, barely an inconvenience™”. Ok… Thanks!!!

For reference, directly summing the distances between successive interpolated points gives a much more accurate estimate of the total arc length here than summing the norms of the interpolated derivatives.

using LinearAlgebra, Dierckx
t = LinRange(0, π, 100)
x, y = cos.(t), sin.(t)
spl = ParametricSpline(t, hcat(x, y)')

# Integral of norm of derivatives:
tl = LinRange(0, π, 10_000)
π1 = sum(norm.(eachcol(derivative(spl,tl)))*step(tl))
# Sum of the distances:
xy = spl(tl)
Δxy = xy[:,2:end] - xy[:,1:end-1]
π2 = sum(norm.(eachcol(Δxy)))

# π1 = 3.141906841...
# π2 = 3.141592637...
# π  = 3.1415926536...

I’ve since resorted to using numerical integration:

using QuadGK
π3, _ = quadgk(t -> norm(derivative(spl, t)), 0, π)
(3.1415926535893455, 1.318944953254686e-13)

fast and most accurate :slight_smile:

2 Likes

@yakir12, yes that is much more accurate! Thanks :slight_smile:

However, in terms of execution time, it may not be that much better?
The simple method above seems to get the arc length π2 with ~1e-7 accuracy, much faster (x200?) than using QuadGK.
Could you confirm?

I came back to this and found out I never got back to you!
Here’s what I find now:

t = range(0, π, 100)
xy = reverse.(sincos.(t)) 
spl = ParametricSpline(t, stack(xy))

function method1()
    tl = LinRange(0, π, 10_000)
    xy = spl(tl)
    Δxy = xy[:,2:end] - xy[:,1:end-1]
    abs(sum(norm.(eachcol(Δxy))) - π)
end

function method2()
    res, _ = quadgk(t -> norm(derivative(spl, t)), 0, π)
    abs(res - π)
end

results in

julia> @btime method1()
  785.706 μs (59 allocations: 939.88 KiB)
1.6384650081135987e-8

julia> @btime method2()
  27.873 μs (450 allocations: 25.39 KiB)
1.6921193335406315e-9

So method2 (i.e. using QuadGK) is ~30 times faster and ~10 times more accurate.

4 Likes

Thanks for the review.
On my PC method2 is even 60 times faster.

However, if all we need is a relative accuracy (res - pi)/pi < 0.0002, then we can run method1 with 50 points tl = LinRange(0, π, 50) and be almost twice as fast.
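A sketch of that coarser variant (method1_coarse is just an illustrative name, reusing spl from above):

function method1_coarse(spl)
    tl = LinRange(0, π, 50)            # coarser sampling than the 10_000 points above
    xy = spl(tl)
    Δxy = xy[:, 2:end] - xy[:, 1:end-1]
    sum(norm.(eachcol(Δxy)))           # arc length ≈ π to within ~2e-4 relative error
end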

3 Likes

You should really break up the integral at the knots of the spline, otherwise quadgk could waste a lot of time trying to work out the discontinuities (the spline is continuous, but some higher derivative is not).

Even faster would be to just use a fixed-order Gaussian quadrature rule on each sub-interval, since the spline is piecewise polynomial.

Like this?

function method2a()
    knots = Dierckx.get_knots(spl)
    s = 0.0
    for (k1, k2) in zip(knots[1:end-1], knots[2:end])
        res, _ = quadgk(t -> norm(derivative(spl, t)), k1, k2)
        s += res
    end
    return s - π
end

Seems worse:

julia> @btime method2a()
  2.838 ms (44232 allocations: 2.43 MiB)
-3.4626905787149553e-9

Even faster would be to just use a fixed-order Gaussian quadrature rule on each sub-interval

ooo, I don’t know how to do that yet, but sounds good…

Note that you can pass rtol = 0.0002 to quadgk for this as well.
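i.e. something like this (a sketch, reusing the spl from above):

res, err = quadgk(t -> norm(derivative(spl, t)), 0, π; rtol = 0.0002)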

It is possible to pass all the knots to QuadGK to use in a single call with a pre-allocated buffer, though it’s still probably better to just use a fixed-order rule, which can be evaluated allocation-free.

However, I’m concerned that your test data, formed by just uniformly sampling a semicircle, may be deceiving you, because it is too smooth. If you call quadgk_count to see the evaluation points, you’ll see that it is doing the whole integral with just a single 15-point Gaussian quadrature rule.
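A sketch of that check (quadgk_count returns the integral, the error estimate, and the number of integrand evaluations):

using QuadGK
I, E, count = quadgk_count(t -> norm(derivative(spl, t)), 0, π)
# for the smooth semicircle data, count is just 15: one 15-point rule suffices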

If you really have data that smooth, you probably shouldn’t be using splines at all — you should use something like Chebyshev polynomial interpolation — I’m assuming that such smooth data must be synthetically generated (i.e. be the output of a computer program) so that you can choose your x points.

Whereas if you are using a spline for something like experimental data with noise, the kinks at the knots have a much more noticeable effect. For example, if you generate your data with x, y = cos.(t), sin.(t) .+ randn.() .* 0.01, then quadgk requires 13245 evaluations (for the default 1e-8 tolerance).
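A sketch of that noisy test case (spl_noisy is just an illustrative name; same t grid as above):

t = range(0, stop = π, length = 100)
x, y = cos.(t), sin.(t) .+ randn.() .* 0.01    # noisy y coordinates
spl_noisy = ParametricSpline(t, hcat(x, y)')
I, E, count = quadgk_count(s -> norm(derivative(spl_noisy, s)), 0, π)
# count jumps into the thousands (13245 in the example above), vs. 15 for the smooth data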

1 Like

Indeed, the semicircle was just for testing. My real use case is smoothing a track made of noisy coordinates. That said, the resulting smoothed track is a rather smooth spline (but not a semicircle).

function method3(spl, x_gauss, w_gauss)
    knots = Dierckx.get_knots(spl)
    integral = sum(eachindex(knots)[2:end]) do i
        t1, t2 = knots[i-1], knots[i]
        scale = (t2 - t1) / 2
        sum(zip(x_gauss, w_gauss)) do ((xg, wg))
            t = (xg + 1) * scale + t1 # map from (-1,1) to (t1,t2)
            norm(derivative(spl, t)) * wg
        end * scale
    end
end

using QuadGK
method3(spl, gauss(15)...)

Since the norm of the derivative is not a polynomial, even though the spline itself is, it’s not quite as good as I initially hoped.

1 Like

I would benchmark on your actual data, since the performance of adaptive quadrature will be sensitive to this noise. (You can measure accuracy compared to a brute-force quadgk call with a low tolerance.)

All splines, by definition, “look” smooth because they are constructed to have some number of continuous derivatives, but high-order quadrature methods implicitly assume even more derivatives. (quadgk defaults to order=7, but you can change this.)

1 Like

For example:

function method4(spl, segbuf; kws...)
    knots = Dierckx.get_knots(spl)
    return quadgk_count(t -> norm(derivative(spl, t)), 0, pi; eval_segbuf=segbuf, kws...)
end

knots = Dierckx.get_knots(spl)
segbuf = [QuadGK.Segment(knots[i-1], knots[i], 0.0, 0.0) for i in eachindex(knots)[2:end]]
method4(spl, segbuf, order=3, rtol=1e-3)

The spl here is from real data (the methods above were redefined to take the spline and the parameter endpoints as arguments, and s is the reference value for the arc length):

julia> @btime method1(spl, extrema(t)...) - s
  597.095 μs (61 allocations: 939.11 KiB)
-1.1757726667838142e-6

julia> @btime method2(spl, extrema(t)...) - s
  1.392 ms (50143 allocations: 1.49 MiB)
1.784385403880151e-7

julia> @btime method2a(spl) - s
  133.369 μs (4984 allocations: 151.70 KiB)
2.9988996175234206e-8

julia> @btime method3(spl, gauss(15)...)- s
  131.866 μs (3389 allocations: 106.88 KiB)
2.9981862326167175e-8

julia> @btime first(method4(spl, segbuf, extrema(t)..., order=3, rtol=1e-3)) - s
  43.651 μs (1503 allocations: 45.89 KiB)
-2.414835506669988e-6

Wow! Very cool!!!

I would interpolate $spl (to avoid dynamic dispatch) and $(gauss(15)) (to avoid timing the construction of the Gauss rule). (Ideally method3 should be allocation-free, at least if you pass in the knots?)
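For example (a sketch, assuming BenchmarkTools is loaded):

x_gauss, w_gauss = gauss(15)
@btime method3($spl, $x_gauss, $w_gauss)   # $ interpolation keeps global-variable overhead out of the timing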

Also, you need to control your comparison so that you are computing the integral to the same accuracy in all cases. (e.g. play with the order of the gauss rule, the rtol parameter for quadgk, etcetera). Otherwise you are comparing apples and oranges.

(You should be able use method4 with e.g. order=7, rtol=1e-13 as your “exact” result to check the accuracy of the other methods.)

It looks like all of these methods are being hurt on performance by the fact that Dierckx.derivative(spl, t) is (i) allocating (it returns a Vector{Float64}, whereas it really should use SVector{2,Float64}) and (ii) is type-unstable (its return type is inferred as Any :cry:).

Seems like these need to be fixed in Dierckx.jl (derivative of ParametricSpline is type-unstable and allocating · Issue #103 · JuliaMath/Dierckx.jl · GitHub). But for the time being you might want to put a ::Vector{Float64} type-assert on the return value of derivative.
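A sketch of such a workaround (deriv is just an illustrative wrapper name):

# the type-assert lets the compiler infer the otherwise-uninferred return type
deriv(spl, t) = Dierckx.derivative(spl, t)::Vector{Float64}

arclength, _ = quadgk(t -> norm(deriv(spl, t)), 0, π)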

5 Likes

You don’t actually need to deal with Segment and segbufs manually here, do you? Can’t you just call quadgk(f, knots...; kws...)?
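i.e. something like this (a sketch):

knots = Dierckx.get_knots(spl)
arclength, err = quadgk(t -> norm(derivative(spl, t)), knots...; order = 3, rtol = 1e-3)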