I’m looking into spectral methods for some PDEs.
For the discrete Chebyshev Transform, it is basically a discrete cosine transform:
I’ve looked at FastTransforms.jl as they have a Chebyshev Transform which can be used with FastQuadrature.jl for many spectral methods. However, the FastQuadrature.jl doesn’t have Gauss-Lobatto points (which includes the end points of the domain). The Guass-lobatto points are also known as Chebyshev points of the Second Kind according to FastTransforms.jl. The regular Gauss points are Chebyshev of the First Kind.
The FastTransforms.jl is extremely slow in comparison to doing one’s own FFTW Chebyshev Transform. However, for the second kind Gauss-Lobatto points, the transforms seem to differ by a little bit and I’m not sure why?
Here’s what I got:
using FFTW
using FastTransforms.jl
#1st kind Gauss Points
#Gauss Points make Chebyshev Transform into a DCT Type-II
function FastChebyshevTransform1(g::Array{Float64}, plan)
n = length(g)
ftran = plan*g #FFTW with plan DCT Type-II
@. ftran = sqrt(2.0/n)*ftran; #normalization constant
ftran[1] = ftran[1]/sqrt(2.0) #Normalization from definition for 1st element
return ftran
end
#2nd kind gauss lobatto points
#Gauss-Lobatto points change Chebyshev Transform to DCT Type-I
function FastChebyshevTransform2(g::Array{Float64}, plan)
ftran = plan*g #FFTW with plan DCT Type-I
@. ftran = 1.0/n* ftran; #normalization constant
ftran[1] = ftran[1]/2.0 #Normalization from definition for 1st element
ftran[end] = ftran[end]/2.0 #Normalization from definition for last element
return ftran
end
To test use function exp().
Gauss Points 1st Kind
p = FastTransforms.chebyshevpoints(Float64,n, Val(1)) #1st Guass Points
g = exp.(p)
pcosp = FFTW.plan_dct(g, flags=FFTW.PATIENT); # DCT Type-II
pr0 = FFTW.plan_r2r(f, FFTW.REDFT00,flags=FFTW.MEASURE); #DCT Type-I
For the FastTransform.jl
@btime FastTransforms.chebyshevtransform(g, Val(1));
Output:
9.376 μs (23 allocations: 1.09 KiB)
And FFTW:
@btime FastChebyshevTransform1(g, pcosp);
Output:
605.680 ns (7 allocations: 464 bytes)
For the Gauss-Lobatto 2nd Kind:
p = FastTransforms.chebyshevpoints(Float64,n, Val(2)) #1st Guass Points
g = exp.(p)
FastTransforms:
@btime chebyshevtransform(g, Val(2));
Output:
125.223 μs (168 allocations: 11.05 KiB)
FFTW:
@btime FastChebyshevTransform2(g, pr0);
Output:
16.896 μs (133 allocations: 9.73 KiB)
using FFTW seems to be much faster.
The 1st kind Gauss Points for both functions are identical. However, the 2nd kind Gauss-Lobatto Points Transforms are not exactly the same as shown below:
FastTransforms:
FastTransforms.chebyshevtransform(g, Val(2))
20-element Vector{Float64}:
1.2660658777520084
1.1303182079849698
0.2714953395340765
0.044336849848663804
0.00547424044209372
0.0005429263119139379
4.497732295426474e-5
3.1984364623956356e-6
1.992124806361816e-7
1.1036771676242671e-8
5.505895624045487e-10
2.4979566338996256e-11
1.039185688854928e-12
3.994426203361931e-14
1.429092589880367e-15
3.7981314000334304e-17
1.0225738384705389e-16
1.4023869784738819e-16
9.349246523159212e-17
-9.349246523159212e-17
FFTW function:
FastChebyshevTransform2(g, pr0)
20-element Vector{Float64}:
1.2027625838644083
1.0738022975857213
0.2579205725573727
0.042120007356230615
0.005200528419989034
0.000515779996318241
4.2728456806551503e-5
3.038514639275854e-6
1.8925185660437253e-7
1.0484933092430537e-8
5.230600842843212e-10
2.3730588022046446e-11
9.872264044121816e-13
3.794704893193834e-14
1.3576379603863487e-15
3.608224830031759e-17
9.71445146547012e-17
1.332267629550188e-16
8.881784197001253e-17
-8.881784197001253e-17
So not sure where the issue is between the two.
EDIT: Solution for the discrepancies in the Chebyshev Transforms with Gauss-Lobatto 2nd Kind is normalization factor which can be seen in the corrected function below:
#2nd kind gauss lobatto points
#Gauss-Lobatto points change Chebyshev Transform to DCT Type-I
function FastChebyshevTransform2(g::Array{Float64}, plan)
ftran = plan*g #FFTW with plan DCT Type-I
@. ftran = 1.0/(n-1.0)* ftran; #normalization constant
ftran[1] = ftran[1]/2.0 #Normalization from definition for 1st element
ftran[end] = ftran[end]/2.0 #Normalization from definition for last element
return ftran
end