I’m having a hard time understanding why, in the following code, a closure is slower than an equivalent hand-made callable struct:
using BenchmarkTools
using Interpolations
using LinearAlgebra: norm
using StaticArrays
using QuadGK
speed(spline, t) = norm(Interpolations.gradient1(spline, t))
# closure version
length_closure(spline) = quadgk(t -> speed(spline, t), 0, length(spline))
# hand-made struct version
struct LenIntegrand{S}
    spline::S
end
(li::LenIntegrand)(t) = speed(li.spline, t)
length_struct(spline) = quadgk(LenIntegrand(spline), 0, length(spline))
# benchmarking code
θs = range(0, 2π, length=25)[1:end-1]
xs, ys = 2cos.(θs), 0.5sin.(θs)
points = [SA[x, y] for (x, y) in zip(xs, ys)]  # named `points` to avoid shadowing Base.vec
spl = extrapolate(interpolate(points, BSpline(Cubic(Periodic(OnCell())))), Periodic())
@benchmark length_closure($spl) # ~ 65 μs
@benchmark length_struct($spl) # ~ 45 μs
I observe the same behavior in other similar examples: a hand-made struct consistently beats a closure when the goal is to fix some arguments, whereas I expected the two implementations to be essentially equivalent. @code_warntype does not help me understand the underlying issue here.
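To make the pattern easier to poke at, here is a dependency-free sketch of the two ways of fixing an argument. The names scale, make_closure and Scaled are made up for illustration and stand in for speed, the anonymous function and LenIntegrand above. It only shows one structural difference between the two objects (the closure's type is a subtype of Function, the struct's type is not); whether that difference has anything to do with the timing gap is an assumption I have not verified.

```julia
# Hypothetical stand-in for `speed(spline, t)`; no packages needed.
scale(a, t) = a * t

# Closure version: fixes `a` by capturing it.
make_closure(a) = t -> scale(a, t)

# Hand-made callable struct version (hypothetical name).
struct Scaled{A}
    a::A
end
(s::Scaled)(t) = scale(s.a, t)

f = make_closure(2.0)
g = Scaled(2.0)

# Both callables compute the same value.
@assert f(3.0) == g(3.0) == 6.0

# The closure's type is a subtype of Function; the struct's type is not.
println(typeof(f) <: Function)  # true
println(typeof(g) <: Function)  # false

# In both cases the fixed argument is stored in a concretely typed field.
println(fieldtypes(typeof(f)))  # (Float64,)
println(fieldtypes(typeof(g)))  # (Float64,)
```

Since the captured value is concretely typed in both cases, this would at least be consistent with @code_warntype showing nothing suspicious on the integrand itself.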