I need to use the cdf (and quantile) function of the T distribution inside a function that I want to use Automatic Differentiation on. This works fine with almost any univariate distribution from the Distributions.jl package, but unfortunately not with the T distribution. Here is an MWE:
using ForwardDiff, ReverseDiff
using Distributions
function mytargetfunction(data::AbstractVector)
function obtaingradient(ν::AbstractVector{R}) where {R<:Real}
dist = Distributions.TDist(ν[1])
data_uniform = cdf.(dist, data)
return sum( logpdf(dist, data_uniform[i]) for i in eachindex(data_uniform) )
end
end
#working
ν = [3.0]
data = randn(1000)
target = mytargetfunction(data)
target(ν)
#not working
ForwardDiff.gradient(target, ν) #MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag
ReverseDiff.gradient(target, ν) #ArgumentError: Converting an instance of ReverseDiff.TrackedReal{Float64, Float64, ReverseDiff.TrackedArray{Float64, Float64, 1, Vector{Float64}, Vector{Float64}}} to Float64 is not defined. Please use `ReverseDiff.value` instead.
If I swap the T distribution in the closure with, e.g., a Normal, everything works fine. Are there any alternative packages for Julia that allow me to use a AutoDiff friendly T distribution? Or does anyone know a better trick to solve that problem? I can write down the cdf/quantile function analytically but only for a few selected cases unfortunately.
Perhaps using HypergeometricFunctions.jl would be acceptable? I doubt this is as fast or robust as some other options, but in easy tests it seems to be fine:
using StatsFuns, SpecialFunctions, HypergeometricFunctions, ForwardDiff, FiniteDifferences
# not thoughtfully written at all
function tcdf(v, x)
1/2 + x*gamma((v+1)/2)*_₂F₁(1/2, (v+1)/2, 3/2, -x*x/v)/(sqrt(pi*v)*gamma(v/2))
end
# CDF check:
@assert tcdf(1.1, 3.1) ≈ StatsFuns.tdistcdf(1.1, 3.1)
# Forward auto check:
@assert isapprox(ForwardDiff.derivative(_v->tcdf(_v, 3.1), 1.1),
central_fdm(10,1)(_v->tcdf(_v, 3.1), 1.1))
Obviously it’s preferable to have rules than to hope that ForwardDiff will pass through every branch of a function like that and give you the correct output—I had this fight with besselk a while ago, and it was an enormous pain to sort out…so I would guess the issue is even worse for a much more complicated function like 2F1.
For the quantile function, you could use a root finder on this CDF and ImplicitDifferentiation.jl to get the derivatives.
Again, though, it’s not obvious at all that a code implementation of 2F1 will give correct auto-derivatives of 2F1, so you should probably write a lot of tests that try to cover your uses cases and domain region and stuff.
I checked the manual implementation of the cdf with HypergeometricFunctions and it does work for my example! Unfortunately, it seems to segfault for quite a wide range of parameter values, making it difficult to work with.
I adjusted my expectations a bit and will try to define a custom Chainrules rule for my case to work.