Distribution of sum of random variables

As @mschauer said, it would be helpful to get more information.

But in lieu of that, I’ll assume we’re talking about continuous univariate random variables being scaled by real scalars and added. First, Distributions.jl can handle the scaling with no problems, e.g.

julia> using Distributions

julia> Normal() * 5
LocationScale{Float64, Continuous, Normal{Float64}}(
μ: 0.0
σ: 5.0
ρ: Normal{Float64}(μ=0.0, σ=1.0)
)

Addition of random variables is more complicated. The distribution of the sum of two random variables is their convolution. There’s Distributions.convolve, and if you’re very lucky, the distributions you want to convolve have a corresponding method in https://github.com/JuliaStats/Distributions.jl/blob/v0.25.17/src/convolution.jl. Probably not though, since there are no methods for LocationScale objects.

So now a general approach using the Fourier transform. The Fourier transform of a PDF is called the characteristic function (CF). The CF of a convolution is the product of the CFs of the convolved PDFs.

That is, if f = g * h, where aX = A \sim g, b Y = B \sim h, and * indicates convolution, then C = A + B \sim f. Now \hat{g}(w) = \mathbb{E}_{A \sim g}[e^{i w A}] and \hat{h}(w) = \mathbb{E}_{B \sim h}[e^{i w B}] are the CFs of f and g. Then \hat{f}(w) = \hat{g}(w) \hat{h}(w) is the CF of f, and f(C) = \frac{1}{2\pi} \int_\mathbb{R} \hat{f}(w) e^{-i w C} \mathrm{d}w is the PDF of f.

So great, we exchanged one integral (marginalization) for 3! However, we can efficiently (in \mathcal{O}(n \log n)) approximate the PDF of h on a grid with n points by sampling both A and B on a grid, computing the discrete Fourier transform using FFT, multiplying the coefficients, and computing the inverse discrete Fourier transform.

Here’s a worked example:

using Distributions, FFTW, StatsPlots

function convolvedists(x, ds...)
	Δx = step(x)
	# FFT never sees x and instead assumes scale is unit range, so rescale the density accordingly
	scale = Δx .^ (length(ds) - 1)
	p = reduce(hcat, pdf.(d, x) for d in ds)
	p_hats = rfft(ifftshift(p, 1), 1)
	f_hat = vec(prod(p_hats; dims=2))
	return x, fftshift(irfft(f_hat, length(x))) .* scale
end

g = Gamma(1, 10)
h = Normal(-10, 1) * 4

# x must be symmetric in this implementation
x = -200:0.01:200

_, fpdf = convolvedists(x, g, h)
f_samples = rand(g, 100_000) .+ rand(h, 100_000)

plot(; size=(700, 500))
density!(f_samples; label="sample KDE", xlabel="x", ylabel="density")
plot!(x, fpdf; label="fft", xlim=(-60, 30))

For me, converting from the theory using DFT to the practice using FFT is a little non-intuitive, so I’m sure someone can point out improvements to my code.

11 Likes