You have basically two nested loops: one iterating through breakpoints
and one iterating through the coefficients of each polynomial. You can improve performance if you unroll these loops in your code. This can be done with the @generated
macro.
using BenchmarkTools
struct Polynomial_Tuple{N, T <: Real}
p::NTuple{N, T}
end
(poly::Polynomial_Tuple)(x) = evalpoly(x, poly.p)
struct PiecewisePolynomial{N, P<:Polynomial_Tuple, T<:Real} <: Function
polynomials::Vector{P}
breakpoints::Vector{T}
function PiecewisePolynomial{N}(polynomials::Vector{P}, breakpoints::Vector{T}) where {N, P <: Polynomial_Tuple, T <: Real}
@assert length(polynomials) == N + 1
@assert length(breakpoints) == N
@assert issorted(breakpoints)
new{N, P, T}(polynomials, breakpoints)
end
end
(p::PiecewisePolynomial{N})(x) where N = evalpiecewise(Val(N), p.polynomials, p.breakpoints, x)
function evalpiecewise(::Val{N}, functions, breakpoints, x) where N
if @generated
generator = (:(if x < breakpoints[$k]
return functions[$k](x)
end) for k = 1 : N)
quote
@inbounds begin
$(generator...)
return functions[$(N + 1)](x)
end
end
else
ind = searchsortedfirst(breakpoints, x)
return functions[ind](x)
end
end
polynomials = [
Polynomial_Tuple((0., 0., 0.)), Polynomial_Tuple((1.0, 0.0, -1.0)), Polynomial_Tuple((-1.0, 0.0, 1.0)), Polynomial_Tuple((0., 0., 0.))
]
breakpoints = [-1.0, 0.0, 1.0]
mypiecewise = PiecewisePolynomial{3}(polynomials, breakpoints)
@btime $mypiecewise($(4rand() - 2))
I get 1.499 ns both with the code above and with your original piecewise_hardcoded
function.
Note that:
- You probably meant
x < b
in your originalevalpiecewise
function. - The
if @generated
line allows the compiler to choose whether to use the generated function or not. - The
searchsortedfirst
function is optimized to do the lookup in yourevalpiecewise
function, especially whenN
is large. - The
Val
type is used to pass the compile-time constantN
to the functionevalpiecewise
. - The
Base
functionevalpoly
implicitly unrolls the polynomial evaluation for you, since the polynomial is aTuple
. - The
ImmutablePolynomial
type of thePolynomials
package is similar to thePolynomial_Tuple
type defined here. I have defined a new type only because I wantpolynomials
to be a concrete container (in this case of quadratic polynomials), while the leading coefficient ofImmutablePolynomial
s must be nonzero. - Even though
PiecewisePolynomial
has a type parameterN
to define the number of breakpoints, I still define its fields to beVector
s. You could change that toTuple
s if you want, but this could result in large compile times ifN
is large.
This last point makes me wonder if there is a way to condition the use of a @generated
function on the value of N
(using something like if @generated() && N < 10
)…