 # Differentiation of incomplete Beta function

First, to give some context: my use case is equivalent to the following contrived example, in which I would like to define a (`Turing.jl`) probabilistic program that uses a discretized Beta mixture model.

```julia
@model themodel(X, K) = begin
    a ~ Exponential()
    b ~ Exponential()
    p = discretize(Beta(a, b), K)
    X ~ MixtureModel([Bernoulli(p[i]) for i = 1:K])
end

function discretize(d, K)
    qstart = 1.0 / 2K
    qend = 1.0 - 1.0 / 2K
    xs = quantile.(d, qstart:(1/K):qend)
    xs * (mean(d) * K / sum(xs))  # rescale by factor mean(d)/mean(xs)
end
```

Now, this does not work, because the quantile function for the Beta distribution (the inverse of the regularized incomplete Beta function, `beta_inc_inv` in `SpecialFunctions.jl`) currently does not work with AD. (This brings me to a first question: why do many methods in `SpecialFunctions` only accept `Float64` arguments?) I experimented (lazily) with relaxing the argument types of the relevant `SpecialFunctions` methods and using AD to get the gradient of interest, but this turned out too slow.

Then I implemented an algorithm for the gradient of the CDF of the Beta distribution (i.e. `beta_inc`) with respect to the `a` and `b` parameters (which I found here). This gives me the gradient of `beta_inc(a, b, x)` w.r.t. `a` and `b` quickly. I would expect this to enable faster computation of the `beta_inc_inv` gradient (which, if I understand correctly, mainly calls `beta_inc` in a numerical inversion routine).
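For what it's worth, once the partials of `beta_inc` are available, the gradient of `beta_inc_inv` follows from the implicit function theorem: holding `p = I_x(a, b)` fixed, `∂x/∂a = -(∂I/∂a)/(∂I/∂x)`, where `∂I/∂x` is just the Beta density. A minimal sketch, using the special case `b = 1` (where `I_x(a, 1) = x^a` has closed-form partials, so the result can be checked by hand; the function names here are made up for illustration):

```julia
# Sketch: gradient of the Beta quantile via the implicit function theorem,
# checked on the closed-form case Beta(a, 1), where I_x(a, 1) = x^a.

# Partials of the incomplete Beta function I_x(a, 1) = x^a:
dI_da(a, x) = x^a * log(x)       # ∂I/∂a
dI_dx(a, x) = a * x^(a - 1)      # ∂I/∂x (the Beta(a, 1) density)

# Implicit function theorem: with p = I_x(a, b) held fixed,
# ∂x/∂a = -(∂I/∂a) / (∂I/∂x).
dquantile_da(a, p) = (x = p^(1 / a); -dI_da(a, x) / dI_dx(a, x))

# Check against the analytic derivative of the quantile q(a) = p^(1/a):
# dq/da = -p^(1/a) * log(p) / a^2.
a, p = 2.5, 0.3
analytic = -p^(1 / a) * log(p) / a^2
@assert isapprox(dquantile_da(a, p), analytic; rtol = 1e-10)
println(dquantile_da(a, p), " ≈ ", analytic)
```

The same formula applies for general `(a, b)` with your `beta_inc_grad` supplying `∂I/∂a` and `∂I/∂b`, and the Beta pdf supplying `∂I/∂x`, so no extra numerical inversion is needed for the derivatives once `x = beta_inc_inv(a, b, p)` itself has been computed.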

However, now I’m not sure how to proceed (I am no analysis guru, and all the stuff in `JuliaDiff` is a bit daunting). It seems to me I should implement some diffrule for `beta_inc`? Say my function `beta_inc_grad(a, b, x)` returns a tuple with `beta_inc(a, b, x)` and its partial derivatives w.r.t. `a` and `b`; is

```julia
@define_diffrule SpecialFunctions.beta_inc(a, b, x) =
    :(beta_inc_grad($a, $b, $x)), :(beta_inc_grad($a, $b, $x)), :NaN
```

what I should implement to make this work with AD? (It doesn’t seem to be enough to make `beta_inc` play with ForwardDiff.) (Also, would this call `beta_inc_grad` twice?)
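For reference, `DiffRules.@define_diffrule` expects one expression per argument, each evaluating to that single partial derivative (not one expression returning a tuple of all of them), so a rule would look more like the sketch below. The helper names `beta_inc_grad_a` / `beta_inc_grad_b` are hypothetical stand-ins for functions that each return just the respective partial; the partial w.r.t. `x` is simply the Beta density:

```julia
# Sketch only: assumes hypothetical helpers beta_inc_grad_a / beta_inc_grad_b
# that each return a single partial derivative of beta_inc w.r.t. a and b.
using DiffRules, SpecialFunctions

DiffRules.@define_diffrule SpecialFunctions.beta_inc(a, b, x) =
    :(beta_inc_grad_a($a, $b, $x)),   # ∂/∂a
    :(beta_inc_grad_b($a, $b, $x)),   # ∂/∂b
    # ∂/∂x is the Beta(a, b) density x^(a-1) (1-x)^(b-1) / B(a, b):
    :(exp(($a - 1) * log($x) + ($b - 1) * log1p(-$x) - logbeta($a, $b)))
```

One caveat: `beta_inc` in `SpecialFunctions.jl` returns a tuple `(p, q)` rather than a scalar, which `DiffRules` cannot handle directly, so in practice you may need to define a scalar wrapper (e.g. one returning only `p`) and register the rule for that wrapper instead.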


Would you be willing to share your `beta_inc_grad` code?

Hi Benjamin, I haven’t followed up on this, but I put what I had implemented back when I asked this question here: GitHub - arzwa/IncBetaDer. The `beta_inc_grad` code is there; I hope it can be useful. I’m not at all a numerical algorithms guy, so it might not be very good or correct currently, although I do think I had some tests. (Needless to say, I haven’t prepared this code for release, so comments etc. should be read as notes to myself.)

I’m not a numerical algorithms guy either, but I also need these derivatives (in my case because I want to use `TDistribution` for some Bayesian stuff in Turing, but it doesn’t work with AD because of the `beta_inc`).
