Characteristic Functions --> pdf using fft?

I have to work on a problem that requires me to use characteristic functions to represent random variables, as these particular rv’s have no close form distribution. The last time I did something remotely similar was at least 4 years ago and it was on matlab, so my recollection of this is quite fuzzy. I am having particular trouble understanding the convention of the Fourier transform used and how I might need to take it into account.

First off, in the documentation at API · AbstractFFTs.jl, the convention used for the transform/inverse transform seems to indicate that the C.F. is supposed to be calculated with ifft, so to use fft to get the pdf from C.F. This is not consistent with the arguments being real valued arrays or complex arrays, so in my test below I will pretend I didn’t read these docs and will use ifft to go from C.F. to pdf, as my initial intuition would be.

Here’s a short attempt using the C.F. of the Gamma distribution (Gamma distribution - Wikipedia). I chose this one because it’s easy and short xto write.

using FFTW

n_grid = 2^13    # total number of grid points
N = n_grid / 2   # support for R- and R+
x_width = 2^3

Δx = x_width / n_grid    # in original space
x_grid = Δx .* vec(-N:1:(N-1))  # in original space

Δxi = 2*π / x_width        # in fourier space
xi_grid = Δxi .* vec(-N:1:(N-1))  # in fourier space


k = 9.0
θ = 0.5
# CF of the Gamma distribution
Φ(u) = (1 - 1im * u * θ) ^ (-k)

Φ_ongrid = map(Φ, xi_grid)

test_pdf = real(fftshift(ifft(Φ_ongrid)))
plot(x_grid, test_pdf)

I get some oscillating weird thing. I vaguely remember something about having to multiply the C.F. by its complex conjugate and something about the Plancherel theorem, but I can’t seem to find any references about it at the moment. Anyone used to doing this?

I went searching around but this was the only post that I found that was somewhat related, but since the author does the transformation and then the inverse, it doesn’t answer my questions, I believe
related post:

1 Like

I would recommend that you try to reproduce in Julia, this Matlab code from an expert in the field.

The transposition of the matlab code would look like that :

using FFTW, Distributions, Plots

function build_pdf(chf; xmin=-10,xmax=10,N=2^8)

    rng = xmax - xmin
    dx = rng/N
    k     = 0:(N-1)
    w     =  (0.5 - N/2 .+ k) * ( 2π / rng)
    
    ϕ = chf.(w[Int.((N/2+1)):end])
    ϕ = vcat(conj.(ϕ[end:-1:1]),ϕ)
    
    C   = (-1+0im).^((1-1/N)*(xmin/dx .+ k))/rng
    D   = (-1+0im).^(-2*(xmin/rng) .* k)
    return xmin .+ k * dx, real(C .* fft(D .* ϕ))
end


X = Gamma(9.0,0.5)
x, f = build_pdf(t -> cf(X,t); xmin = -10, xmax = 10,N = 2^8)
plot(x,f,label="reconstructed")
plot!(x, pdf.(X,x),label="true")

Which works. However more complicated characteristic functions do not work very well, as this code shows (a simple convolution of 4 gamma distributions)

Xs = (Gamma(rand(),rand()) for i in 1:4)
x, f = build_pdf(t -> prod(cf(X,t) for X in Xs); xmin = -10, xmax = 10,N = 2^8)
histogram(sum(rand(X,1000) for X in Xs),normalize=:pdf,bins=100)
plot!(x,f,label="reconstructed")
2 Likes

I had seen some books vaguely alluding to using quadratures and fft’ing the bins as well, but I assumed those were the author’s specific techniques for the application they were discussing.
Many thanks @lrnv , I was also in the process on doing that and reading the author’s own referenced paper, to understand the reasoning a bit better.

The x_min and x_max values seem crucial, as the types of distributions that require this kind of technique are most often than not fat tailed, and truncating the pdf doesn’t seem trivial to me. Additionally, maybe we can try to brute force for more precision by increasing the sampling points (I was initially trying with 2^13, for example).

I’ll try to play around with those parameters a bit to see if there’s any improvement as well.

The problem here seems to be to find a way of automatically calibrating those. The author suggest a max width of 6 sigma, but some of these distributions don’t even have defined 2nd moment.

Yes increasing the sampling points would help, but not that much for more complex distributions (such as the simple convolution of gammas i proposed).

X_min and x_max are just the zone of evaluation of the pdf : they would not really change the results, but of course if your distribution is fat_tailed, you might want x_max to be really big.

Another direction is to use a better integration technique than fft, but it of course depends on what you are trying to do. If you know the characteristic function (i mean, analytically), you might try Broomwitch integration directly, which usually gives better results with the right tools.

Another idea would be to concentrate on what you really want : why the pdf ? If the pdf is a step towards something else (e.g., maybe you want to sample the distribution ?) then there might be a better way of doing it than precomputing the pdf.

Yep, in that convolution case it does fail absolutely spectacularly, even with increased sampling.

In this case, I am working with (asymmetrical) tempered alpha stable distributions, which do have an analytical closed form for the characteristic function; the goal is to fit the tail tempering parameters. I am contemplating a few methods but they’re basically numerically optimizing something, like the MLE, ADS-test-statistics or QQplot L2 loss, and all of them involve either the pdf or the cdf.
To give an idea of references, Ranchev, Kim, Bianchi (2011) mention in section 6.2.1 using this type of integration to obtain the pdf and cdf. Other references refer to this one when vaguely mentioning parameter calibration techniques.

Regarding sampling, I have that semi covered, but I would like to avoid brute forcing this with Monte Carlo; trying to find an easier way first, the FFT approach seemed to be easier at first…

I had never heard of Broomwitch integration, I will check it, thanks for the tip.
EDIT: Ah, it’s the inverse laplace transform mentioned in the famous Arfen book!

Could you share the characteristic function ? I might have a trick that could help you.

Of course, it’s the characteristic function for the (classical) tempered alpha stable distribution; here it goes:

\Phi_{X}(u) = e^{\Psi_{X}(u)}

Where the characteristic exponent is given by

\Psi_{X}(u) = iu\Gamma(1-\alpha)[c_{+}\theta_{+}^{\alpha -1} - c_{-}\theta_{-}^{\alpha -1}] + \Gamma (-\alpha) \{c_{+}[(\theta_{+}-iu)^{\alpha} - \theta_{+}^{\alpha}] + c_{-}[(\theta_{-}+iu)^{\alpha} - \theta_{-}^{\alpha}]\}

It’s also mentioned here, in a rearranged form, just found it by a quick search: Financial models with long-tailed distributions and volatility clustering - Wikipedia.

This makes a new distribution for each t. Try making Xs a vector, to materialize the distributions once.

Although your example is actually fairly benign, your point holds - constructing distributions this way does sometimes require fancy quadrature schemes (and high-precision arithmetic).

3 Likes

Indeed, making it a vector is far superior as the distribution is not randomly picked at each t anymore:

Xs = [Gamma(10*rand(),rand()) for i in 1:10]
x, f = build_pdf(t -> prod(cf(X,t) for X in Xs); xmin = 0, xmax = 30,N = 2^12)
histogram(sum(rand(X,1000) for X in Xs),normalize=:pdf,bins=100)
plot!(x,f,label="reconstructed")

Which is behaving a lot more correctly :slight_smile:

2 Likes

Thanks @Ralph_Smith for the remark.
I will try applying this to my particular problem. I will also compare the quadrature methods I see mentioned by authors and see if I can somehow use the ones in QuadGK. I’ll see if I can also report some findings here.

So, I was reading the source suggested by @rafael.guerra a bit, and for my purposes, the initial author’s methodology seems to be useful, so I wrote a minimal interpretation of it.

For anyone who might find this useful: https://gitlab.com/tom.plaa/characteristicinvfourier.jl

I only implemented the FFT methodology, the author also applies some quadrature methods, but I will only try to apply those myself if the need arises.

2 Likes