Multivariate polynomial regression of discrete data in L-infinity norm

Hi everyone,

Does anyone know of a Julia package that implements polynomial approximation (preferably with Chebyshev polynomials) of data on a multivariate function in the L-infinity norm?

Specifically, I have x, f(x) pairs and I aim to fit a polynomial that minimizes the maximum error at the data points. It needs to be a polynomial because I need its partial derivatives, but it can be either an interpolation or a regression approach (the approximation does not need to pass through all data points).

FastChebInterp.jl does almost what I want, but its interpolation command (chebinterp()) needs Chebyshev points (which I do not have) and its regression command chebregression() uses an L-2 norm.

Any ideas are welcome!

1 Like

Remez.jl does what you’re looking for (although it returns BigFloats in the monomial basis)

1 Like

Thanks! This is a great start but it seems to only accept one-dimensional functions, right?

Just wanted to point out that "Chebyshev" can mean two different things in this context, which might lead people to misunderstand your question: (1) the Chebyshev polynomials cos(k*acos(x)), and (2) Chebyshev approximation, i.e. the best L^∞ fit, whose error equi-oscillates.

3 Likes

What dimensionality do you have in mind? This is a hard problem I think. Can you point to a package for some other language that does what you want?

2 Likes

If I understand correctly what you are doing, what you want is a lot easier than the question the other posters seem to think they are answering.

Packages like Remez.jl are solving the continuous minimax problem in which one wants to minimize \Vert p(x) - f(x) \Vert_\infty over a continuous interval x \in [a,b], whereas it sounds like you only want to minimize the L_\infty error over a discrete set of data points.

In particular, suppose you have a set of m data points \{ x_k, y_k \}_{k = 1,\ldots, m} and you want to find a degree-n polynomial p(x) that minimizes the L_\infty error at those points:

\min_p \left[ \max_k |p(x_k) - y_k| \right]

This problem is straightforward to solve because it is equivalent to linear programming (LP). In particular, the “epigraph” reformulation into an LP works by adding an additional real parameter t \in \mathbb{R}

\min_{p \in \text{polynomials}, t\in \mathbb{R}} t \\ \text{s.t. }\qquad t \ge p(x_k) - y_k, \; t \ge y_k - p(x_k) \qquad \text{for } k = 1,\ldots,m

(This is an LP because p(x_k) is a linear function of the polynomial coefficients in whatever basis. That’s true regardless of the dimensionality of x.)

So, you can use any LP solver in one of Julia’s many convex-optimization packages to solve this.
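For concreteness, a minimal (untested) sketch of that epigraph LP in JuMP might look like the following, where the design matrix A is assumed to hold the basis polynomials evaluated at the points x_k, and HiGHS is just one arbitrary choice of LP solver:

using JuMP
import HiGHS   # any LP solver works here; HiGHS is just one choice

# Minimize max_k |(A*c)[k] - y[k]| over the coefficients c, where
# A[k, j] = j-th basis polynomial evaluated at the point x_k.
function minimax_fit(A, y)
    model = Model(HiGHS.Optimizer)
    n = size(A, 2)
    @variable(model, c[1:n])   # polynomial coefficients
    @variable(model, t)        # epigraph variable: bounds the maximum error
    @constraint(model, A * c .- y .<= t)   # t ≥ p(x_k) - y_k
    @constraint(model, y .- A * c .<= t)   # t ≥ y_k - p(x_k)
    @objective(model, Min, t)
    optimize!(model)
    return value.(c), value(t)
end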

PS. People may have been confused by the original title of your post, “Function Approximation in L-infinity Norm”, whereas you are really approximating data (i.e. doing regression, i.e. discrete domain), not functions on a continuous domain. I’ve edited the thread title accordingly.

6 Likes

More concisely, if the m \times (n+1)^d matrix V is the Vandermonde matrix of your polynomial basis (of degree n along d dimensions) for the points x_k, then what you want to do is to solve for the coefficients c using:

\min_{c \in \mathbb{R}^{(n+1)^d}} \Vert Vc - y \Vert_\infty

which is a convex optimization problem that you should be able to give almost directly to JuMP.jl or similar, something like:

using JuMP, MathOptInterface
function solve_for_coefs(V, y, optimizer)
    model = Model(optimizer)   # pass any LP-capable solver, e.g. HiGHS.Optimizer
    N = size(V, 2)
    @variable(model, coefs[1:N])   # polynomial coefficients
    @variable(model, t)            # epigraph variable bounding the max error
    @constraint(model, [t; V*coefs - y] in MathOptInterface.NormInfinityCone(1 + length(y)))
    @objective(model, Min, t)
    optimize!(model)
    return value.(coefs)
end

In the Chebyshev-polynomial basis, V can be computed by the function FastChebInterp.chebvandermonde(x, lb, ub, order) where x is a length-m array of d-dimensional SVectors x[k], lb and ub are SVectors of the lower and upper bounds you want to use for your Chebyshev polynomials (a box containing the x[k]), and order = (n, n, …) is a tuple of the polynomial degrees along each dimension. From this, you can construct a ChebPoly object p from the coefficients, so that you can then evaluate p(x) (and its derivatives) at arbitrary points x. Something like:

V = FastChebInterp.chebvandermonde(x, lb, ub, order)
coefs = solve_for_coefs(V, y, optimizer)   # optimizer: any LP solver, e.g. HiGHS.Optimizer
p = FastChebInterp.ChebPoly(reshape(coefs, order .+ 1), lb, ub)

(I haven’t tested this code, but it should give the general idea.)
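Putting the pieces together, here is a rough end-to-end sketch (also untested; the 2d data, bounding box, and degrees below are made up purely to show the calling sequence, and HiGHS is just one possible LP solver). FastChebInterp’s chebgradient then gives you the fitted value together with its gradient:

using StaticArrays, FastChebInterp, JuMP, MathOptInterface
import HiGHS   # example LP solver; any LP-capable optimizer should work

# made-up 2d scattered data, just to illustrate the calling sequence
x = [SVector(rand(), rand()) for _ = 1:200]
y = [sin(3 * xi[1]) * cos(2 * xi[2]) for xi in x]

lb, ub = SVector(0.0, 0.0), SVector(1.0, 1.0)   # box containing the points x[k]
order = (5, 5)                                  # polynomial degree along each dimension

V = FastChebInterp.chebvandermonde(x, lb, ub, order)
coefs = solve_for_coefs(V, y, HiGHS.Optimizer)
p = FastChebInterp.ChebPoly(reshape(coefs, order .+ 1), lb, ub)

val, grad = chebgradient(p, SVector(0.5, 0.5))  # fitted value and its gradient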

5 Likes

There are some alternative, possibly more efficient, LP formulations in my package FindMinimaxPolynomial.jl. The package itself currently only supports univariate approximations, but the LP formulations could be useful as an inspiration.

2 Likes

It looks like that package is also for the continuous minimax problem, not the discrete one? I guess somewhere inside the code it is solving a sequence of discrete minimax problems at samples?

2 Likes

Yes. The PolynomialPassingThroughIntervals module should have an implementation for the discrete case, which is then used to attack the continuous case. This is an iterative, Remez-inspired process, calling a MathOptInterface.jl LP solver multiple times.

Regarding the linear programming formulations: I took a quick look at the code now, and the only smart LP formulation is what I termed ReplacedVariablesOption{:determinants} in the code, mostly implemented in the OverdeterminedSystemConsistencyConstraints module. I’ve forgotten the details (although I have something written down on paper somewhere), but the idea seems to be to take the Vandermonde matrix, which might be very tall and thin, and transform it into an equivalent system of constraints with far fewer rows.

BTW this was/is just a toy hobby project of mine, in case it’s not clear.

3 Likes

Thanks. I meant an approximation by a sum of (Chebyshev) polynomials, using the L-infinity norm as the goodness-of-fit measure.

It should not get to more than 100 dimensions. I did not find a package in another language, but I also did not search around much outside of Julia.

1 Like

Thanks for the answer and for adjusting the title. This is in large part what I was looking for.

Just to clarify: I aim to approximate a function on a continuous domain [a,b]. In other words, I have a set of points x and the function values f(x) only at those points, and I cannot get evaluations at other points. However, I do care about the quality of the approximation on the whole interval (e.g. to do simulations on other data). I understand that there is no way to test the goodness of fit at points other than those I have data for, but that is where I believe approximation theory helps me.

Now, what I really care about is the partial derivatives of the function. Do you have an idea of how I could get an approximation of f(x) that also best approximates \frac{\partial f(x)}{\partial x}, given only data on x and f(x)? And if so, how does an approximation error in f(x) affect the error in the approximation of \frac{\partial f(x)}{\partial x}?

Thanks! This already helps a lot!

1 Like

Awesome, thanks! I will check your package out. Maybe I can use parts of it!

1 Like

Where are the points coming from? Experiments? Do they contain measurement noise?

1 Like

I am developing a method that should be applicable in different scenarios, so the data could come from different sources (reports, interviews, etc.). It is likely that both x and f(x) contain measurement noise. But if this complicates the approximation substantially, assuming no measurement noise would be okay to start with.

My understanding is that L_\infty regression is only better than L_2 for certain types of noise, e.g. uniform noise with bounded support (see e.g. Yi and Neykov, 2021), as opposed to the Gaussian noise that might be more common in experimental data. What led you to L_\infty in your case?

Anyway, I’m not aware of any approximation methods that specifically target the derivative (as opposed to trying to approximate a function and getting the derivative as a side-effect), though maybe someone else on here knows of something.

Instead of doing a global polynomial fit, you might also consider some kind of local smoothing/interpolation, e.g. a smoothing spline. Depends on how much you know about your function, I guess.
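For the univariate case, a smoothing spline is essentially a one-liner with e.g. Dierckx.jl. A sketch only, with made-up noisy data and an arbitrarily chosen smoothing parameter s, just to illustrate the idea (your data is multivariate, so this is purely illustrative):

using Dierckx   # wraps the classic FITPACK smoothing splines (univariate here)

x = collect(range(0, 1; length = 100))
y = sin.(5 .* x) .+ 0.05 .* randn(100)   # made-up noisy data

spl = Spline1D(x, y; k = 3, s = 0.1)     # s > 0 trades fidelity for smoothness
fq  = evaluate(spl, 0.5)                 # smoothed value at a query point
dfq = derivative(spl, 0.5)               # ... and its derivative there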

3 Likes