Distribution of sum of random variables

Hey Julia folks,

An equation I came across in my research requires evaluating the distribution of a linear combination of two random variables, aX + bY. Does the Distributions.jl package have a nice way to do this? More generally, is there a way to construct a distribution through operations involving random variables with known distributions?

This thread was almost helpful, but it looks like the solution in the end only works if you already know the distribution that results from the operations you’re performing (e.g., the sum of normal rv’s is normal).

An inelegant option (which I’m currently using) is to simulate aX + bY many times to get an empirical distribution. I was hoping Distributions (which is a great package!) might have some cleaner solution.
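
For concreteness, my current workaround looks roughly like this (X, Y, a, and b here are placeholders, not my actual problem):

using Distributions, StatsBase

a, b = 2.0, 3.0                 # placeholder constants
X, Y = Normal(), Gamma(2, 1)    # placeholder distributions

# simulate aX + bY many times and keep the empirical distribution
samples = a .* rand(X, 100_000) .+ b .* rand(Y, 100_000)
Fhat = ecdf(samples)            # Fhat(x) estimates P(aX + bY ≤ x)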

Thanks!

Can you be more precise about what you need?

1 Like

I have not used this, but it could be useful for you:
https://github.com/mmikhasenko/AlgebraPDF.jl

1 Like

As @mschauer said, it would be helpful to get more information.

But in lieu of that, I’ll assume we’re talking about continuous univariate random variables being scaled by real scalars and added. First, Distributions.jl can handle the scaling with no problems, e.g.

julia> using Distributions

julia> Normal() * 5
LocationScale{Float64, Continuous, Normal{Float64}}(
μ: 0.0
σ: 5.0
ρ: Normal{Float64}(μ=0.0, σ=1.0)
)

Addition of random variables is more complicated. The distribution of the sum of two random variables is their convolution. There’s Distributions.convolve, and if you’re very lucky, the distributions you want to convolve have a corresponding method in https://github.com/JuliaStats/Distributions.jl/blob/v0.25.17/src/convolution.jl. Probably not though, since there are no methods for LocationScale objects.
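
For a pair that is covered, e.g. two normals, it's a one-liner (a sketch; convolve is unexported, and I'm writing the output from memory):

julia> using Distributions

julia> Distributions.convolve(Normal(0.0, 1.0), Normal(1.0, 2.0))
Normal{Float64}(μ=1.0, σ=2.23606797749979)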

So now for a general approach using the Fourier transform. The Fourier transform of a PDF is called the characteristic function (CF), and the CF of a convolution is the product of the CFs of the convolved PDFs.

That is, if f = g * h, where aX = A \sim g, bY = B \sim h, and * indicates convolution, then C = A + B \sim f. Now \hat{g}(w) = \mathbb{E}_{A \sim g}[e^{i w A}] and \hat{h}(w) = \mathbb{E}_{B \sim h}[e^{i w B}] are the CFs of g and h. Then \hat{f}(w) = \hat{g}(w) \hat{h}(w) is the CF of f, and f(c) = \frac{1}{2\pi} \int_\mathbb{R} \hat{f}(w) e^{-i w c} \,\mathrm{d}w evaluates the PDF of C at any point c.
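
As a sanity check of this identity, you can invert the CF product numerically at a single point (a sketch assuming QuadGK.jl; cf is Distributions' characteristic function, and Normal(-40, 4) stands in for the Normal(-10, 1) * 4 used in the worked example below):

using Distributions, QuadGK

g = Gamma(1, 10)      # distribution of A
h = Normal(-40, 4)    # distribution of B, equivalent to Normal(-10, 1) * 4

# the CF of the sum is the product of the component CFs
fhat(w) = cf(g, w) * cf(h, w)

# invert the Fourier transform to evaluate the PDF of C = A + B at a point c
pdf_at(c) = real(quadgk(w -> fhat(w) * exp(-im * w * c), -Inf, Inf)[1]) / 2π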

So great, we exchanged one integral (marginalization) for three! However, we can efficiently (in \mathcal{O}(n \log n)) approximate the PDF of f on a grid with n points by sampling the PDFs of A and B on that grid, computing their discrete Fourier transforms using the FFT, multiplying the coefficients, and computing the inverse discrete Fourier transform.

Here’s a worked example:

using Distributions, FFTW, StatsPlots

function convolvedists(x, ds...)
	Δx = step(x)
	# FFT never sees x and instead assumes a unit sample spacing, so rescale the density accordingly
	scale = Δx^(length(ds) - 1)
	# sample each PDF on the grid, one column per distribution
	p = reduce(hcat, pdf.(d, x) for d in ds)
	# transform, multiply the coefficients, and invert
	p_hats = rfft(ifftshift(p, 1), 1)
	f_hat = vec(prod(p_hats; dims=2))
	return x, fftshift(irfft(f_hat, length(x))) .* scale
end

g = Gamma(1, 10)
h = Normal(-10, 1) * 4

# x must be symmetric in this implementation
x = -200:0.01:200

_, fpdf = convolvedists(x, g, h)
f_samples = rand(g, 100_000) .+ rand(h, 100_000)

plot(; size=(700, 500))
density!(f_samples; label="sample KDE", xlabel="x", ylabel="density")
plot!(x, fpdf; label="fft", xlim=(-60, 30))

For me, converting from the theory using DFT to the practice using FFT is a little non-intuitive, so I’m sure someone can point out improvements to my code.

10 Likes

Apropos, @sethaxen you might enjoy this gist: https://gist.github.com/mschauer/d6a9814a7adf48131ad8f81f569a57f9

1 Like

Thanks so much for the helpful and quick replies.

To be more precise about what I need – the theorem I’m using requires evaluating P(W ≥ x), where W = aW1 + bW2; W1, W2 are independent standard Gumbel random variables; and a,b are positive real constants.

Distributions.jl’s mixture models almost do the job, but it seems they would only be appropriate if W were W1 with probability a or W2 with probability b (unless I’m misunderstanding).
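
For concreteness, here's roughly the computation I think I need, reusing convolvedists from @sethaxen's example above (a sketch; the constants are placeholders, and it assumes the affine Gumbel() * a syntax works):

using Distributions

a, b = 2.0, 3.0       # placeholder positive constants
dA = Gumbel() * a     # distribution of a*W1
dB = Gumbel() * b     # distribution of b*W2

x = -100:0.01:100     # symmetric grid, as convolvedists requires
_, fpdf = convolvedists(x, dA, dB)

# P(W ≥ w): sum the grid PDF from w upward
Pge(w) = sum(fpdf[x .>= w]) * step(x)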

@sethaxen using characteristic functions is pretty sleek! Thanks for the worked example, I think I can make that work. One very basic question though – I keep getting an error when I try “Normal() * 5” or similar. Is there an update or something about the syntax I’m missing?

No problem! I don’t know why Normal() * 5 would error. Can you share a minimal example, the result of Pkg.status(), and a stacktrace?

You know, this would make a really nice Pluto notebook with sliders. :wink:

Yes! Here’s the minimal example and stacktrace:

julia> using Distributions
julia> Normal() * 5
ERROR: MethodError: no method matching *(::Normal{Float64}, ::Int64)
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at operators.jl:560
  *(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:88
  *(::StridedArray{P, N} where N, ::Real) where P<:Dates.Period at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/deprecated.jl:44
  ...
Stacktrace:
 [1] top-level scope
   @ none:1

and here’s Pkg.status():

julia> Pkg.status()
      Status `~/.julia/environments/v1.6/Project.toml`
  [537997a7] AbstractPlotting v0.12.18
  [cbdf2221] AlgebraOfGraphics v0.2.3
  [c9ce4bd3] ArchGDAL v0.6.0
  [c52e3926] Atom v0.12.35
  [c0cd4b16] BAT v2.0.5
  [336ed68f] CSV v0.8.5
  [159f3aea] Cairo v1.0.5
  [13f3f980] CairoMakie v0.3.4
  [aaaa29a8] Clustering v0.14.2
  [35d6a980] ColorSchemes v3.15.0
  [5ae59095] Colors v0.12.8
  [d38c429a] Contour v0.5.7
  [a93c6f00] DataFrames v1.2.2
  [864edb3b] DataStructures v0.18.10
  [b4f34e82] Distances v0.10.4
  [31c24e10] Distributions v0.23.8
  [5789e2e9] FileIO v1.11.1
  [186bb1d3] Fontconfig v0.4.0
  [38e38edf] GLM v1.5.1
  [e9467ef8] GLMakie v0.1.13
  [5752ebe1] GMT v0.37.0
  [28b8d3ca] GR v0.59.0
  [c91e804a] Gadfly v1.3.3
  [4b11ee91] Gaston v1.0.4
  [9b6fcbb8] GeoData v0.4.6
  [09f84164] HypothesisTests v0.10.4
  [7073ff75] IJulia v1.23.2
  [6218d12a] ImageMagick v1.2.1
  [916415d5] Images v0.24.1
  [d0351b0e] InspectDR v0.4.3
  [a98d9a8b] Interpolations v0.12.10
  [e5e0dc1b] Juno v0.8.4
  [ee78f7c6] Makie v0.11.1
  [ff71e718] MixedModels v4.4.0
  [85f8d34a] NCDatasets v0.11.7
  [8314cec4] PGFPlotsX v1.4.1
  [61ac1d4c] Pingouin v0.2.3
  [58dd65bb] Plotly v0.4.0
  [91a5bcdd] Plots v1.22.3
  [e409e4f3] PoissonRandom v0.4.0
  [647866c9] PolygonOps v0.1.1
  [d330b81b] PyPlot v2.10.0
  [1fd47b50] QuadGK v2.4.2
  [6f49c342] RCall v0.13.12
  [f2b01f46] Roots v1.3.5
  [2913bbd2] StatsBase v0.33.10
  [65254759] StatsMakie v0.2.3
  [f3b207a7] StatsPlots v0.14.28
  [fd094767] Suppressor v0.2.0
  [b8865327] UnicodePlots v2.4.6
  [112f6efa] VegaLite v2.6.0
  [8ba89e20] Distributed
  [9a3f8284] Random
  [10745b16] Statistics

Looks like your version of Distributions is 2 minor versions old. Try updating to 0.25.

Yes, that fixed it. Thanks!

Apparently Distributions 0.25 is not compatible with Gadfly 1.3.3 (my favorite plotting package) and so I’ve been stuck at the older version. That’s a separate issue I’ll have to worry about later I guess…

2 Likes

It’s raining and the family is making chestnut-match-animals and Pluto notebooks: https://mschauer.github.io/nonparbayes/

2 Likes

Thanks for sharing this nice code.

Small observation: all the ifftshifts and the fftshift may be eliminated for an odd number of random variables, or reduced to a single fftshift otherwise. This follows from the FFT circular-convolution property.

function convolvedists(x, ds...)
	Δx = step(x)
	n = length(ds)
	scale = Δx^(n - 1)
	p = reduce(hcat, pdf.(d, x) for d in ds)
	p_hats = rfft(p, 1)
	f_hat = vec(prod(p_hats; dims=2))
	yf = irfft(f_hat, length(x)) .* scale
	iseven(n) ? (x, fftshift(yf)) : (x, yf)
end

Hi, thanks for the nice code. It was really helpful for me.

However, I want to use a convolution (say, of a Gamma and a Normal) as a fit function, i.e., I need to evaluate the convolution at specific points x (the points where I have data). How would I modify the code from @sethaxen to get the value of the convolved pdf at a specific point instead of at the points of the given range? My attempts were not really successful.

I suppose an interpolation should work. Or even evaluating at the nearest grid point if the grid is dense enough.

Have a look at e.g. Interpolations.jl (fast, continuous interpolation of discrete datasets in Julia): https://github.com/JuliaMath/Interpolations.jl
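
For instance, a minimal sketch assuming the (x, fpdf) grid from convolvedists above (linear_interpolation is the current constructor; older releases spell it LinearInterpolation):

using Interpolations

# x, fpdf: the grid and convolved PDF from convolvedists above
itp = linear_interpolation(x, fpdf)

itp(-12.34)   # PDF of the sum at an arbitrary point inside the grid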

1 Like

Hi, thanks for this suggestion. That is actually something I already tried and it does work quite nicely indeed.
I just thought there might be a more direct way to get the value at a certain point.

You can invert the Fourier transform manually. I leave it up to you to determine whether it’s a good idea in practice. I’d think an interpolation is the more sensible way.

# borrowed from sethaxen above
# do not apply the inverse FFT; return the Fourier coefficients themselves.
function fftconvolvedists(x, ds...)
	Δx = step(x)
	# FFT never sees x and instead assumes scale is unit range, so rescale the density accordingly
	scale = Δx .^ (length(ds) - 1)
	p = reduce(hcat, pdf.(d, x) for d in ds)
	p_hats = fft(ifftshift(p, 1), 1)
	f_hat = vec(prod(p_hats; dims=2))
	return x, f_hat .* scale
end

function make_pdf(x, ds...)
	_, fpdf = fftconvolvedists(x, ds...)
	let x = x, fpdf = fpdf
		# step(x) * length(x) is the total length of the sampling interval
		return f(t) = real(sum(exp(2π * im * (m - 1) * t / (step(x) * length(x))) * fpdf[m] for m in eachindex(fpdf))) / length(x)
	end
end

Summing over all components for every point is computationally quite expensive though. You can probably save some time by using rfft, but I found this easier to read (and write).

julia> fsample = rand(g, 100_000) + rand(h, 100_000);

julia> p = make_pdf(x, h, g)
(::var"#f#54"{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Vector{ComplexF64}}) (generic function with 1 method)

julia> xs = sort(rand(-60:0.01:25, 200));

julia> fs = p.(xs);

julia> density(fsample, label="")

julia> scatter!(xs, fs, ms=2)

(plot: KDE of fsample with the convolved PDF evaluated at the scattered xs points)

2 Likes

Are you trying to deconvolve two pieces, e.g. a gamma and a normal one, from an empirical dataset? If yes, there are tools using the cumulant generating function (CGF) that could help you:

  • Build the empirical cumulant generating function K(t) = \ln\left(\frac{1}{N} \sum_{i=1}^N e^{t x_i}\right) from your dataset x_1, \dots, x_N.

  • Fit K(t) to -\alpha \ln(1 - s t) + \mu t + \frac{\sigma^2 t^2}{2}, the CGF of a Gamma(\alpha, s) plus an independent Normal(\mu, \sigma), e.g. via least squares on a set of sampled values t_1, \dots, t_n with negative real part; see the sketch after this list.

This should work.
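
A minimal sketch of that recipe, restricting to real negative t and assuming LsqFit.jl for the least-squares step (the synthetic data and starting values are placeholders):

using Distributions, LsqFit, Statistics

# synthetic stand-in data: Gamma(α, s) plus an independent Normal(μ, σ)
data = rand(Gamma(2.0, 3.0), 10_000) .+ rand(Normal(1.0, 0.5), 10_000)

# empirical cumulant generating function
K_emp(t) = log(mean(exp.(t .* data)))

# model CGF: -α ln(1 - s t) + μ t + σ² t² / 2, parameters p = [α, s, μ, σ]
model(t, p) = @. -p[1] * log(1 - p[2] * t) + p[3] * t + p[4]^2 * t^2 / 2

ts = collect(range(-1.0, -0.01; length=50))   # negative t keeps both CGFs finite
fit = curve_fit(model, ts, K_emp.(ts), [1.0, 1.0, 0.0, 1.0])
α, s, μ, σ = fit.param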

Thanks a lot for the nice answer and code snippets.
Indeed, it seems that interpolation is computationally the better approach.