(approximate) PDFs/CDFs suitable for automatic differentiation?

I’m working on a project where, given some observations of transformed random variables, we would like to estimate the parameters of the distribution those observations came from.

As a simple example, suppose Y = min(X_1, X_2) where the X_i are iid Gamma(a, b). We see Y, and we’d like to estimate a and b via maximum likelihood estimation.

We can write down the [log]pdf of Y, but it involves the CDF of a gamma random variable. The gamma CDF available in the JuliaStats ecosystem is an Rmath call, and thus won’t accept dual numbers as input. Therefore ForwardDiff cannot compute a derivative, and I expect the same would be true for any other differentiation package, since the CDF is not computed in Julia.
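For concreteness, here is a minimal sketch of the kind of log-likelihood I mean (made-up observations; the density of the minimum of two iid variables is f_Y(y) = 2 f(y) (1 - F(y)), so the CDF enters through logccdf):

using Distributions, ForwardDiff

# observations of Y = min(X_1, X_2) with X_i iid Gamma(a, b) (made-up numbers)
ys = [0.8, 1.4, 2.1]

# f_Y(y) = 2 * pdf(y) * (1 - cdf(y)), so the log-likelihood needs the gamma CDF
function loglik(θ, ys)
    a, b = θ
    d = Gamma(a, b)
    return sum(log(2) + logpdf(d, y) + logccdf(d, y) for y in ys)
end

loglik([2.0, 3.0], ys)   # fine with plain floats

# but this errors, because dual numbers end up in the Rmath ccall inside logccdf:
# ForwardDiff.gradient(θ -> loglik(θ, ys), [2.0, 3.0])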

Is there a pure Julia implementation of either CDFs of common distributions or the special functions needed which would support automatic differentiation? Or am I better off just fitting the maximum likelihood parameters via a derivative-free method?

Given that the derivative of the CDF is the PDF, which is also available, you can implement a method that works on dual numbers.
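Something along these lines (rough sketch, and note it only handles dual numbers in the x argument, not in the parameters):

using Distributions, ForwardDiff

# Gamma(a, b) CDF with a hand-written dual-number method, using dF/dx = f(x).
# Only x may carry derivatives here; a and b are plain reals.
gamma_cdf(a::Real, b::Real, x::Real) = cdf(Gamma(a, b), x)
function gamma_cdf(a::Real, b::Real, x::ForwardDiff.Dual{T}) where {T}
    d  = Gamma(a, b)
    xv = ForwardDiff.value(x)
    return ForwardDiff.Dual{T}(cdf(d, xv), pdf(d, xv) * ForwardDiff.partials(x))
end

# sanity check: the derivative in x recovers the pdf
ForwardDiff.derivative(x -> gamma_cdf(2.0, 3.0, x), 1.5) ≈ pdf(Gamma(2.0, 3.0), 1.5)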

That’s a great point for the example given. But unless I’m mistaken (quite possible), sometimes you do need the CDF itself. Modify the example so that Y = min(X_1, X_2, X_3) where the X_i are again iid. Now the pdf of Y is f_Y(y) = 3 f(y) (1 - F(y))^2 (F the CDF of X), so it has an F^2 term in it, and the derivative of the likelihood still has an F term, right?

But if I understand correctly, you have both the CDF and the PDF, so you can still define a method for it. Also, if you just define a method for F(::ForwardDiff.Dual), the chain rule will take care of F(...)^2 etc. Have a look at the ForwardDiff manual.

On further reflection, I must not understand what you mean here. Denote by f(a, x) the PDF with parameters a at point x, and likewise F(a, x) for the CDF. Then f(a, x) is the derivative of F(a, x) with respect to x. But for this MLE problem, we need the derivative with respect to a, the parameters.

Am I missing something obvious?

Sure, but if I understand your problem correctly, the only tricky part of the CDF is the (incomplete) gamma function, which has recursively defined derivatives, so a method can be defined for ForwardDiff.Dual using existing building blocks.
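For instance, a quick-and-dirty way to get a dual-friendly gamma CDF out of existing building blocks (rough sketch, surely slower and less accurate than a dedicated implementation) is to write the regularized incomplete gamma as a plain integral and let ForwardDiff differentiate through the quadrature:

using QuadGK, SpecialFunctions, ForwardDiff

# regularized lower incomplete gamma P(a, x) written as an explicit integral, so
# dual numbers flow through; gamma() has a derivative rule, and quadgk never
# evaluates the integrand at the (possibly singular) endpoint t = 0
reg_inc_gamma(a, x) = quadgk(t -> t^(a - 1) * exp(-t), 0, x)[1] / gamma(a)

# Gamma(shape a, scale b) CDF in terms of it
gamma_cdf_ad(a, b, x) = reg_inc_gamma(a, x / b)

# derivative of the CDF with respect to the shape parameter a
ForwardDiff.derivative(a -> gamma_cdf_ad(a, 2.0, 1.5), 3.0)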

I needed to differentiate a definite integral of the pdf of a gamma for my (work-in-progress) package Survival.jl, and I think I have a reasonable solution. First one needs to realize that:
∇ ∫ pdf(Gamma(a,b), x) dx = ∫ pdf(Gamma(a,b), x) ∇log(pdf(Gamma(a,b), x)) dx

where ∇ denotes the gradient with respect to the parameters (a, b).

Then I’ve implemented an efficient way to compute ∫ f(x) pdf(dist, x) dx (for one-dimensional functions, so you should maybe try one component of the gradlog at a time). First you get a vector of coefficients you’ll need for the actual function, and then you compute it. Let’s say dist is your distribution object and f your function (gradlog in your case):

Pkg.clone("https://github.com/piever/Survival.jl.git")

using Distributions, Survival

const coefs = Survival.clenshaw_coefs(dist, f)
# value of the indefinite integral at t is
indefinite_integral_at_t = Survival.clenshaw_asin(cdf(dist, t), coefs)

Once you have computed the coefficients, evaluating the indefinite integral should be pretty fast (ideally not much slower than computing the cdf). There is a bit of a speed/accuracy trade-off, handled by an optional argument. You can also call

const coefs = Survival.clenshaw_coefs(dist, f, Val{N}())

where larger N gives more accuracy but takes longer; the default is 50. The code is in the Survival.jl repo linked above. Sorry for the scarce documentation, hope it helps!
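For the gamma question above, I’d wire it up roughly like this (untested sketch; a, b and t are placeholder values, and I get one component of gradlog by running ForwardDiff on logpdf, which is pure Julia):

using Distributions, ForwardDiff, Survival

a, b, t = 2.0, 3.0, 1.5          # placeholder parameters and endpoint
dist = Gamma(a, b)

# one component of gradlog at a time: here d/da log(pdf(Gamma(a, b), x))
dlogpdf_da(x) = ForwardDiff.derivative(α -> logpdf(Gamma(α, b), x), a)

coefs = Survival.clenshaw_coefs(dist, dlogpdf_da)

# by the identity above, this should give d/da of the gamma CDF at t
dcdf_da = Survival.clenshaw_asin(cdf(dist, t), coefs)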
