Hello, I try to implemente a straight forwared discrete Fourier transform abut I do not see what variables are allocated in the intermediate steps. For my undertanding it is a generator expression and should get the result without allocating anything?
julia> function dft(A::AbstractArray{T,N}, CT::Type = Float64) where {T,N}
F = similar(A, Complex{CT})
@inbounds for k ∈ CartesianIndices(F)
F[k] = sum(exp(- im * 2π * sum((k[d] - 1) * (n[d] - 1) / size(A, d) for d ∈ 1:N)) * A[n] for n ∈ CartesianIndices(A))
end
return F
end
dft (generic function with 3 methods)
julia> A = rand(Float64, 20, 20);
julia> using BenchmarkTools
julia> @btime dft($A);
5.331 ms (2401 allocations: 100.13 KiB)
I didn’t quite track it down, but I suspect that the compiler has a hard time doing type-inference through those nested generator expressions?
Here is a variant that does no allocations (other than allocating F), and uses the element type of A to determine the element type of the output F:
function dft(A::AbstractArray{T,N}) where {T,N}
F = zeros(complex(float(T)), size(A)...)
@inbounds for k ∈ CartesianIndices(F), n ∈ CartesianIndices(A)
F[k] += cis(-2 * real(eltype(F))(π) * sum(d -> (k[d] - 1) * (n[d] - 1) / real(eltype(F))(size(A, d)), ntuple(identity, Val{N}()))) * A[n]
end
return F
end
Note that the 2π expression in your exponent will be computed in double precision by default, even if you want some other precision for F. To force it into the desired precision, use real(eltype(F))(π); you have to do the same thing for the division by size(A,n) since otherwise division of two integers will default to double precision. You can also use cis(x) instead of exp(im*x). I also used ntuple(identity, Val{N}()) instead of 1:N, which allows the compiler to unroll the inner loop by supplying the length at compile time.
Ok, but unfortunately it does not come up with no allocations when using BigFloat… Than it is a lot slower then the original code…
julia> function dft(A::AbstractArray{T,N}, CT::Type = Float64) where {T,N}
F = similar(A, Complex{CT})
PI = real(eltype(F))(π)
@inbounds for k ∈ CartesianIndices(F)
F[k] = sum(cis(-2 * PI * sum((k[d] - 1) * (n[d] - 1) / size(A, d) for d ∈ 1:N)::Float64) * A[n] for n ∈ CartesianIndices(A))
end
return F
end
dft (generic function with 2 methods)
julia> function dft_new(A::AbstractArray{T,N}) where {T,N}
F = zeros(complex(float(T)), size(A)...)
@inbounds for k ∈ CartesianIndices(F), n ∈ CartesianIndices(A)
F[k] += cis(-2 * real(eltype(F))(π) * sum(d -> (k[d] - 1) * (n[d] - 1) / real(eltype(F))(size(A, d)), ntuple(identity, Val{N}()))) * A[n]
end
return F
end
dft_new (generic function with 1 method)
julia> using BenchmarkTools
julia> A = rand(Float64, 20, 20);
julia> @btime dft($A);
4.997 ms (4801 allocations: 218.88 KiB)
julia> @btime dft_new($A);
3.416 ms (1 allocation: 6.38 KiB)
julia> B = rand(BigFloat, 20, 20);
julia> @btime dft($B, BigFloat);
1.706 s (6058760 allocations: 240.00 MiB)
julia> @btime dft_new($B);
1.807 s (7976768 allocations: 342.43 MiB)
Basically, I would like to perform a test on accuracy of the Float64 fft results and I wanted to have the most simple solution. I do not have the time to implement a fft on my own.
Probably I need DoubleFloat or Float128 instead of BigFloat…