Precise use of FFTW

How can I call FFTW to map each row of an m × (2n+1) matrix, whose columns hold coefficients in the basis:

{ 1, sinθ, cosθ, sin2θ, cos2θ,…, sin nθ, cos nθ }

to the values of the corresponding trigonometric sum (each entry in the row multiplied by its basis function, then summed) at equispaced points in θ? Preferably the transform should be allocation-free.

A solution is available in FastTransforms.jl commit 67a6b, though I won’t claim it’s optimal in any way beyond meeting the above criteria.
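
For anyone who just wants to see the coefficient packing, here is a minimal sketch in NumPy rather than FFTW. It is not taken from the commit above and it does allocate. It assumes the sample points are θ_j = 2πj/(2n+1) for j = 0, …, 2n, and the function name `trig_rows_to_values` is made up for illustration. The idea is that each sin/cos pair becomes one complex coefficient (cos_k - i·sin_k)/2, and a real inverse FFT of length 2n+1 then produces the samples.

```python
import numpy as np

def trig_rows_to_values(coeffs):
    """Evaluate each row's trigonometric sum at theta_j = 2*pi*j/N, N = 2n+1.

    Columns are assumed to be ordered
    [1, sin t, cos t, sin 2t, cos 2t, ..., sin nt, cos nt].
    """
    coeffs = np.asarray(coeffs, dtype=float)
    m, N = coeffs.shape
    if N % 2 != 1:
        raise ValueError("expected an odd number of columns, 2n+1")
    n = (N - 1) // 2

    sin_c = coeffs[:, 1::2]   # columns 1, 3, ..., 2n-1
    cos_c = coeffs[:, 2::2]   # columns 2, 4, ..., 2n

    # Non-negative-frequency half of the spectrum. np.fft.irfft divides by N,
    # so the coefficients are pre-scaled by N to cancel that normalization.
    spec = np.empty((m, n + 1), dtype=complex)
    spec[:, 0] = N * coeffs[:, 0]
    spec[:, 1:] = 0.5 * N * (cos_c - 1j * sin_c)

    # n=N must be passed explicitly because the output length is odd.
    return np.fft.irfft(spec, n=N, axis=1)


# Usage: compare against direct summation on a small random matrix.
rng = np.random.default_rng(0)
m, n = 3, 4
N = 2 * n + 1
A = rng.standard_normal((m, N))
theta = 2 * np.pi * np.arange(N) / N
basis = np.ones((N, N))                  # basis[column, sample]
for k in range(1, n + 1):
    basis[2 * k - 1] = np.sin(k * theta)
    basis[2 * k] = np.cos(k * theta)
print(np.allclose(trig_rows_to_values(A), A @ basis))
```

In FFTW proper, the same packing should correspond to the halfcomplex layout r0, r1, …, rn, in, …, i1 that an `FFTW_HC2R` r2r plan consumes, with r_k = cos_k/2 and i_k = -sin_k/2; FFTW transforms are unnormalized, so the factor of N would drop out. An in-place r2r plan applied along the rows is one way to avoid allocations after a permute-and-scale pass, but I have not checked whether the linked commit does exactly this.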