Approximating a Numerically Computed Distribution

marcpabst · March 16, 2022, 2:10pm

I have a very simple, numerically computed posterior (note that I’m not randomly sampling from the posterior but actually calculate it’s (log-)densities over a given interval) that I know converges to a Normal distribution for high sample sizes. For a somewhat simpler case, I actually have a closed-form approximation, but I now want to generalize my solution (and also possible check any other closed-form approximations).

So I was wondering if there is an elegant way to approximate a pdf, e.g. using the Distributions.jl package? My naive idea would be to use ApproxFun.jl but I was kinda hoping for a more straightforward solution. I could also sample from my posterior and then use fit(Normal, ...) but that can’t be the way to do this…

tbeason · March 16, 2022, 3:10pm

Do you mean that you have computed it over a grid of values? In that case, maybe what you want is to interpolate between grid points?

marcpabst · March 16, 2022, 3:17pm

I’m actually dealing with a dimension case here, so I’m not sure if a “grid” is a fair description, but yeah, that is basically the idea. But I actually want to fit a Gaussian, i.e. estimating the two parameters μ and σ.

JeffreySarnoff · March 16, 2022, 7:32pm

take a look at Distributions.jl/stable/fit and
how to fit a normal distribution

marcpabst · March 17, 2022, 1:24pm

But this assumes that I have observations sampled from some distribution, doesn’t it? Actually, I have a numerically computed pdf over a grid / number of evenly spaced points.

Of course, I can just sample from my pdf and then take the data’s moments (mean, variance) or use fit() to get my normal distribution. It just seems a bit unnecessary to go through this… I guess I was justing hoping for a more elegant way to do this.

(I usually would just go with MCMC and then just just fit() on the samples, but for the simple 1-d case, Turing is order of magnitude slower than just computing the posterior over a grid)

tbeason · March 17, 2022, 2:13pm

I still don’t quite understand what you want. My latest interpretation is that you have computed some pdf values over a (1D) grid, now you want to compute the mean and standard deviation of that pdf. Is that right? Depending on how much of the support you have covered with your grid points, you could interpolate to create a “filled in” pdf and then just compute the mean and standard deviation using QuadGK with the integral formulas of those statistics. This is much more general than fitting a Gaussian.

Alternatively, if you for some reason don’t have enough points for that, you could do a least squares fit to the closest Normal (or some other) distribution. You could minimize the distance between your pdf(x) values and pdf(Normal(m,s),x) where you are optimizing over m and s. There are a couple of loss functions you can use to do this.

Topic		Replies	Views
Using Numerical Integration and ApproxFun in Turing.jl Probabilistic programming turing , quadgk , approxfun	12	670	July 21, 2023
How to approximate a distribution function from an arbitrary list? General Usage question , package	5	424	November 18, 2022
[ANN] NumericalDistributions.jl: user-defined distributions Statistics package , announcement	1	251	April 13, 2025
Estimate most probable value from data Statistics question	7	2968	June 19, 2019
Parameter Fitting in Turing Probabilistic programming turing	15	1429	October 12, 2022

Approximating a Numerically Computed Distribution

Related topics