Differentiating through Chebyshev coefficient basis

I asked a similar question on Slack before, and now I have formulated it much better. A possibly related issue: chainrules for chebpoly constructors · Issue #10 · JuliaMath/FastChebInterp.jl · GitHub

Let’s say I have a function like relu, which I approximate via Chebyshev interpolation on [-1, 1]:

using FastChebInterp

relu(x) = max(x, zero(x))        # or get relu from NNlib/Flux

x = chebpoints(200, -1, 1)       # Chebyshev points on [-1, 1]
c = chebinterp(relu.(x), -1, 1)  # Chebyshev interpolant of relu

The problem is that I need the derivative with respect to the coefficients of the approximation, which is not what the package’s chebgradient(c, x) provides (that is the derivative with respect to x).

I tried making my own version as follows


# T_n(x) via the three-term recurrence (each call recurses into n-1 and n-2)
function mychebyshevt(n, x)
    if n == 0
        return 1.0
    elseif n == 1
        return x
    else
        return 2x * mychebyshevt(n - 1, x) - mychebyshevt(n - 2, x)
    end
end

# evaluate f(x) = Σ_n coeffs[n+1] * T_n(x)
function chebyshev(coeffs, x)
    ans = 0.0
    for (i, a) in enumerate(coeffs)
        n = i - 1
        ans += a * mychebyshevt(n, x)
    end
    return ans
end

using Polynomials: ChebyshevT   # for the sanity check below
using Zygote: gradient

t = randn(4)
x = rand()

@show ChebyshevT(t)(x) ≈ chebyshev(t, x)
dt, dx = gradient(chebyshev, t, x)
# dt is what I need

The problem is that this is awfully slow, and any tips to make it faster would be appreciated. I think the way I’m evaluating the polynomial is too naive and inefficient.

Generally with Chebyshev polynomials you want to use an FFT to convert between gridpoint values and polynomial coefficients, and to do differentiation as a map between sets of polynomial coefficients. Have a look at GitHub - JuliaApproximation/ApproxFun.jl: Julia package for function approximation
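
For illustration, here is a minimal sketch of the values→coefficients step using a type-I DCT, assuming FFTW.jl and samples at the Chebyshev extrema points \cos(\pi k / N) (the normalization below is specific to that grid; evalcheb is just a throwaway helper):

using FFTW

N  = 200
xk = cos.(π .* (0:N) ./ N)       # Chebyshev extrema points on [-1, 1]
fk = max.(xk, 0.0)               # relu sampled on that grid

y = FFTW.r2r(fk, FFTW.REDFT00)   # type-I DCT of the samples
a = y ./ N                       # a[n+1] is the coefficient of T_n
a[1]   /= 2                      # a_0 and a_N pick up an extra factor of 1/2
a[end] /= 2

# interpolation check: Σ_n a_n T_n(x_k) reproduces the sampled values
evalcheb(a, x) = sum(a[n+1] * cos(n * acos(x)) for n in 0:length(a)-1)
@show evalcheb(a, xk[5]) ≈ fk[5]

(relu has a kink at 0, so the coefficients only decay algebraically, but the interpolation property itself holds to roundoff.)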

My coefficients should be in the Chebyshev basis, though. I want

f(x) \approx \sum_{n} a_n T_n(x)

and then differentiate df / da

Isn’t \frac{\mathrm{d}f}{\mathrm{d}a_i}(x) just T_i(x)?
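
That is, the gradient with respect to the coefficients is just the vector (T_0(x), \ldots, T_m(x)), which you can build in a single pass of the recurrence instead of going through AD. A rough sketch (the chebbasis helper is just for illustration), checked against the dt from your Zygote example above:

# all of T_0(x), ..., T_m(x) in one pass of the three-term recurrence
function chebbasis(m, x)
    T = zeros(typeof(x), m + 1)
    T[1] = 1
    m ≥ 1 && (T[2] = x)
    for n in 2:m
        T[n+1] = 2x * T[n] - T[n-1]
    end
    return T
end

@show chebbasis(length(t) - 1, x) ≈ dt   # ∂f/∂a_i == T_i(x)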


If you just want the values T_n(x) for a given x and n = 0,1,\ldots,m, you can do it by calling

FastChebInterp.chebvandermonde([x], -1, +1, m)[1,:]

since chebvandermonde takes a length n array of -1 ≤ x[k] ≤ +1 points and returns the n \times (m+1) “Chebyshev–Vandermonde” matrix:

\begin{pmatrix} T_0(x[1]) & T_1(x[1]) & \cdots & T_m(x[1]) \\ T_0(x[2]) & T_1(x[2]) & \cdots & T_m(x[2]) \\ \vdots & \vdots & \ddots & \vdots \\ T_0(x[n]) & T_1(x[n]) & \cdots & T_m(x[n]) \\ \end{pmatrix}

(This is done by a recurrence, so it has linear \Theta(mn) cost. You can also pass lb, ub instead of -1, +1 and it will rescale the coordinates from [lb, ub] to [-1,+1] for you.)
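
For example (a sketch, assuming the signature above and reusing the t, x, and dt from the first post), the returned row matches the Zygote gradient:

row = FastChebInterp.chebvandermonde([x], -1, +1, length(t) - 1)[1, :]
@show row ≈ dt   # same T_n(x) values, no autodiff needed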
