Chebyshev Transform using FFTW and FastTransforms.jl Speed

erny123 · December 10, 2023, 9:13pm

I’m looking into spectral methods for some PDEs.

For the discrete Chebyshev Transform, it is basically a discrete cosine transform:

I’ve looked at FastTransforms.jl as they have a Chebyshev Transform which can be used with FastQuadrature.jl for many spectral methods. However, the FastQuadrature.jl doesn’t have Gauss-Lobatto points (which includes the end points of the domain). The Guass-lobatto points are also known as Chebyshev points of the Second Kind according to FastTransforms.jl. The regular Gauss points are Chebyshev of the First Kind.

The FastTransforms.jl is extremely slow in comparison to doing one’s own FFTW Chebyshev Transform. However, for the second kind Gauss-Lobatto points, the transforms seem to differ by a little bit and I’m not sure why?

Here’s what I got:

using FFTW
using FastTransforms.jl

#1st kind Gauss Points
#Gauss Points make Chebyshev Transform into a DCT Type-II
function FastChebyshevTransform1(g::Array{Float64}, plan)
    n = length(g)
    ftran = plan*g #FFTW with plan DCT Type-II 
    @. ftran = sqrt(2.0/n)*ftran; #normalization constant
    ftran[1] = ftran[1]/sqrt(2.0) #Normalization from definition for 1st element
    return ftran

end

#2nd kind gauss lobatto points
#Gauss-Lobatto points change Chebyshev Transform to DCT Type-I
function FastChebyshevTransform2(g::Array{Float64}, plan)
    ftran  = plan*g  #FFTW with plan DCT Type-I
    @. ftran = 1.0/n* ftran; #normalization constant
    ftran[1] = ftran[1]/2.0 #Normalization from definition for 1st element
    ftran[end] = ftran[end]/2.0 #Normalization from definition for last element
    return ftran

end

To test use function exp().

Gauss Points 1st Kind

p = FastTransforms.chebyshevpoints(Float64,n, Val(1)) #1st Guass Points
g =  exp.(p)
pcosp = FFTW.plan_dct(g, flags=FFTW.PATIENT); # DCT Type-II
pr0 = FFTW.plan_r2r(f, FFTW.REDFT00,flags=FFTW.MEASURE); #DCT Type-I

For the FastTransform.jl

@btime FastTransforms.chebyshevtransform(g, Val(1));

Output:
9.376 μs (23 allocations: 1.09 KiB)

And FFTW:
@btime FastChebyshevTransform1(g, pcosp);
Output:
605.680 ns (7 allocations: 464 bytes)

For the Gauss-Lobatto 2nd Kind:

p = FastTransforms.chebyshevpoints(Float64,n, Val(2)) #1st Guass Points
g =  exp.(p)

FastTransforms:
@btime chebyshevtransform(g, Val(2));
Output:
125.223 μs (168 allocations: 11.05 KiB)

FFTW:
@btime FastChebyshevTransform2(g, pr0);
Output:
16.896 μs (133 allocations: 9.73 KiB)

using FFTW seems to be much faster.

The 1st kind Gauss Points for both functions are identical. However, the 2nd kind Gauss-Lobatto Points Transforms are not exactly the same as shown below:

FastTransforms:

FastTransforms.chebyshevtransform(g, Val(2))
20-element Vector{Float64}:
  1.2660658777520084
  1.1303182079849698
  0.2714953395340765
  0.044336849848663804
  0.00547424044209372
  0.0005429263119139379
  4.497732295426474e-5
  3.1984364623956356e-6
  1.992124806361816e-7
  1.1036771676242671e-8
  5.505895624045487e-10
  2.4979566338996256e-11
  1.039185688854928e-12
  3.994426203361931e-14
  1.429092589880367e-15
  3.7981314000334304e-17
  1.0225738384705389e-16
  1.4023869784738819e-16
  9.349246523159212e-17
 -9.349246523159212e-17

FFTW function:

FastChebyshevTransform2(g, pr0)
20-element Vector{Float64}:
  1.2027625838644083
  1.0738022975857213
  0.2579205725573727
  0.042120007356230615
  0.005200528419989034
  0.000515779996318241
  4.2728456806551503e-5
  3.038514639275854e-6
  1.8925185660437253e-7
  1.0484933092430537e-8
  5.230600842843212e-10
  2.3730588022046446e-11
  9.872264044121816e-13
  3.794704893193834e-14
  1.3576379603863487e-15
  3.608224830031759e-17
  9.71445146547012e-17
  1.332267629550188e-16
  8.881784197001253e-17
 -8.881784197001253e-17

So not sure where the issue is between the two.

EDIT: Solution for the discrepancies in the Chebyshev Transforms with Gauss-Lobatto 2nd Kind is normalization factor which can be seen in the corrected function below:

#2nd kind gauss lobatto points
#Gauss-Lobatto points change Chebyshev Transform to DCT Type-I
function FastChebyshevTransform2(g::Array{Float64}, plan)
    ftran  = plan*g  #FFTW with plan DCT Type-I
    @. ftran = 1.0/(n-1.0)* ftran; #normalization constant
    ftran[1] = ftran[1]/2.0 #Normalization from definition for 1st element
    ftran[end] = ftran[end]/2.0 #Normalization from definition for last element
    return ftran

end

stevengj · December 10, 2023, 9:17pm

Usually it’s a matter of looking carefully at FFTW’s definition of the transform and comparing the normalization to what you want (it may differ in the first and last points).

See, for example, how FastChebInterp.jl computes the first-kind Chebyshev coefficients.

erny123 · December 10, 2023, 9:23pm

@stevengj Ah you’re right!
The normalization for DCT Type-I is:
1/(n-1)

I also didn’t know of FastChebInterp.jl. I see that it’s also using FFTW so I’m guessing it would be just as fast as the functions I wrote?

erny123 · December 10, 2023, 9:38pm

Tried FastChebInterp.jl

@btime FastChebInterp.chebcoefs(g);
Output:
120.330 μs (131 allocations: 10.02 KiB)

Which is similar to FastTransforms.jl

stevengj · December 10, 2023, 9:43pm

If you need to compute interpolants over and over, you can do better by precomputing the plan, preallocating, etc. (Also, FastChebInterp is only type I.)

It’s probably just the overhead of not precomputing the plan, unlike your code. I guess FastTransforms could use an API that precomputes the transform data, along with a mul! API to apply it in-place to precomputed output.

FastChebInterp is mainly designed for interpolation, in which case you usually compute the interpolant once and then re-use it many times, so a bit of (constant-factor) overhead in creating the interpolant is not a big concern.

jishnub · December 13, 2023, 11:14am

FastTransforms recomputes the plan for each transform if you call chebyshevtransform(g, Val(2)) directly. To compare the runtimes, you should precompute a Chebyshev plan and only measure the time taken by the transform, which should be comparable. Computing the plan can be expensive.

erny123 · December 16, 2023, 6:18pm

@jishnub I don’t see any documentation in FastTransforms.jl on pre-computing plans for transforms like the Chebyshev. How would one go about doing that?

Topic		Replies	Views
Error in FastTransforms.jl for cheb2leg General Usage fftw , math , fft	1	264	January 22, 2024
Compute fourrier transform in Julia Numerics question	21	3725	June 5, 2023
Replacing FFT from Numerical Recipes with FFTW Numerics question , fftw	8	1603	August 11, 2024
Fast Legendre Transform? Numerics question , package	3	769	November 6, 2021
Numerical computation of continuous Fourier transforms Numerics question , package , integral	3	1642	March 17, 2022

Chebyshev Transform using FFTW and FastTransforms.jl Speed

Related topics