How to fit a polynomial with a pre-determined degree to a set of points?

Hi, all

I want to approximate a function to a vector v of 99 values so I can evaluate that function over other values. This can be a simple linear interpolation of the 99 values, but I’d fit a polynomial in case other values don’t fit within the range of the 99 values.

Initially, I thought of using Polynomials.jl to apply the function fromroots as shown in the Quick Start examples of the package’s page.

using Polynomials
poly = fromroots(v)
poly(1250) # This could be any other number from 0 to 100,000

But the resulting polynomial is rife with monomials of powers that go as high as 99. Many of these monomials have Inf32 for coefficients. The evaluation step didn’t work. I did not find a way to limit the maximum degree to the polynomial. A degree of 10, for instance, would be just fine.

Then, thanks an example from the Julia Discord, I found a different way I attempt to implement

using Polynomials
fit(x,v, 10)

Here, x has the same length as v. However, to my surprise, I got the following error message:

ERROR: UndefVarError: fit not defined

I have the Distributions.jl also in use and I wonder if this is somehow the reason why fit won’t work (Distributions.jl also has a function called fit). I don’t know how to specify the package for a function. In R, I would’ve done Polynomials::fit, but in Julia that’s no kosher. I searched online for a way to specify the package, but given the results I got, I’m sure I am not framing my search correctly.

Lastly, I learned about ApproxFun.jl. The package has a brief section on “Using ApproxFun for “manual” interpolation”, which seemed just what I have been looking for. However, to achieve such manual interpolation, I must pass a function f and a domain to the function Fun. The problem is that I’m looking for a function f so I can’t provide it.

Does anyone know how I find a polynomial of a degree D to a set of N numbers so I can use the polynomial to be evaluate at other input values?

# This simple example will reproduce the error I'm having using the function fit.
using Polynomials
x = 0.01:0.01:0.99
y = (x.^2).*5 - x.*2 .+ 6
fit(x, y, 10)

Well… Just after hours and hours of trying to find a solution, I just learned that using was pretty much all I needed.

If you know, however, how to approximate a function by interpolating over a set of values using the package ApproxFun, please let me know.

The thing is that ApproxFun needs to know the values of f at specific values of x. For a Chebyshev approximation, if you happen to know the function values at the Chebyshev nodes there’s no problem:

using CairoMakie
using ApproxFun

f(x) = sin(2*pi*x)

# Get 30 Chebyshev nodes in the interval [0, 5]
# (these nodes are not evenly spaced)
S = Chebyshev(0..5)
x = points(S, 30)

# Suppose we know the value of f at these points
y = f.(x)

# We can calculate the corresponding Chebyshev coefficients
c = transform(S, y)
# And build the approximation of f
fa = Fun(S, c)

fig = plot(x, y, figure=(; resolution=(500, 200)))
plot!(domain(S), fa)


More probably, you have data points on an even grid or a random grid. You can still find the coefficients of the Chebyshev polynomials that go through these points using linear regression as shown in the FAQ:

# Points on a regular grid
x = range(0, 5, length=30)
y = f.(x)

# Function space for approximations on the given interval
S = Chebyshev(x[begin]..x[end])

# Build design matrix for linear regression by evaluating each
# polynomial at the given x values
X = hcat((Fun(S, [zeros(k-1); 1]).(x) for k in 1:length(x))...)

# Use linear regression to find the Chebyshev coefficients
c = X\y
fa = Fun(S, c)

fig = plot(x, y, figure=(; resolution=(500, 200)))
plot!(domain(S), fa)


Though according to the FAQ for numberical stability it’s better to use an overdetermined system (i.e. replace length(x) by a smaller number in the code above).

Another option proposed here is to make a first approximation with the AAA algorithm and use the result (a rational function) to calculate values at the Chebyshev nodes:

using BaryRational

# Using >30 points to avoid Froissart doublets (poles near zeros)
x = range(0, 5, length=50)
y = f.(x)

# Rational approximation
f_aaa = aaa(x, y)

# Approximations of f at the Chebyshev nodes (we choose n=100)
x_cheb = points(S, 100)
y_cheb = f_aaa.(x_cheb)

# Chebyshev approximation of f
c = transform(S, y_cheb)
fa = Fun(S, c)

fig = plot(x, y, figure=(; resolution=(500, 200)))
plot!(0..5, fa)



FastChebInterp.jl can do Chebyshev interpolation (like ApproxFun, but in arbitrary dimensions) from Chebyshev points, or Chebyshev regression (least-square fitting to Chebyshev polynomials) from an arbitrary set of points.

You can also do such regression from arbitrary points with ApproxFun — there is an explanation in the ApproxFun FAQ. Basically you just set up the Chebyshev–Vandermonde matrix and do a least-square fit.

However, if you are doing high-degree polynomial fitting from arbitrary points, you definitely want to choose a degree much smaller than your number of points, as otherwise you can run into a Runge phenomenon where the polynomial interpolation diverges exponentially between the points.

(In contrast, interpolating from Chebyshev points is well behaved up to arbitrarily high degree, if you implement the interpolation properly. If you have the freedom to choose your evaluation points, this option is highly recommended!)

(Working with polynomials a sum of monomials is very badly behaved at high degrees; Chebyshev polynomials are nicer because they stay between -1 and +1. Monomials should be typically fine up to degree 10, though.)

1 Like

I had a similar problem thirty years ago. I wanted to smooth lines in GIS in order to present a map at a higher scale (say from 1:25 000 to 1:100 000). Since the lines around the polygons had too many values I could not really use original coordinates.

I used a Lagrange interpolation with N points first and looked for a Lagrange approximation of lesser degree with K points (K < N) that would minimize the sum of square differences between the two (least-square).

Letting N increase to infinity, while keeping the ratio K/N at a fixed value (like 10%), I found that a low-pass filter was the resulting formula.

I would have to redo the math. It involde approximating sin(x)/x by an infinite product. The Lagrange approximation coefficients are products to cancel out at each point.

Is that what you’re looking for?

Thanks for your the well detailed answer, @stevengj. You’ve provided me with some great food for thought.

@dgagnon, I didn’t quite follow your answer and that’s on me — I’m not that well versed in stable methods for high-degree polynomial interpolation. I will take a closer look at your suggestion. Thank you.

Thank you, @sijo, so much for taking the time to craft such a detailed response.

1 Like