Hey all. Just wanted to share a project that I am just finishing up with some collaborators: BesselK.jl. It is a package for evaluating the second-kind modified Bessel function \mathcal{K}_\nu(x) that has been specifically designed to give accurate ForwardDiff.jl-based derivatives \tfrac{\partial}{\partial \nu} \mathcal{K}_\nu(x). While those derivatives do in some sense exist in closed form, the expressions for them are challenging enough to implement (and would be slow enough to evaluate) that they don't really seem like a viable option to us. So we have taken a different approach here.
In order to avoid naming conflicts with `SpecialFunctions.besselk`, we export `adbesselk` and `adbesselkxv`, where `adbesselkxv` gives x^\nu \mathcal{K}_\nu(x), a specific scaling that comes up in the Matérn covariance function, which was the primary motivation for this work. Here is a very unexciting demo:
```julia
using BesselK, ForwardDiff

(v, x) = (1.1, 1.1)
ForwardDiff.derivative(_v -> adbesselk(_v, x), v) # accurate to at least atols of 1e-10
```
This was enough of a project that we ended up writing a paper about it. Without getting into too many details here, suffice it to say that a lot of expressions for \mathcal{K}_\nu come after limits have been taken, and for some involved reasons those limiting forms are not usable in an implementation you want to pass AD through. So there was a lot of work to be done for the cases when \nu + \tfrac{1}{2} \in \mathbb{Z} or \nu \in \mathbb{Z}, for example.
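To make that a bit more concrete, here is a quick sanity check (my own sketch, not from the package docs) comparing the ForwardDiff derivative in \nu at a few of those awkward orders against a crude central finite difference of `SpecialFunctions.besselk`:

```julia
using BesselK, ForwardDiff, SpecialFunctions

x = 2.1
for v in (1.0, 1.5, 2.0)  # integer and half-integer orders, where the naive expansions degenerate
  ad = ForwardDiff.derivative(_v -> adbesselk(_v, x), v)
  h  = 1e-6                # crude central difference of SpecialFunctions.besselk as a yardstick
  fd = (besselk(v + h, x) - besselk(v - h, x)) / (2h)
  @show v, ad, fd          # these should agree to several digits
end
```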
As an added bonus, if you're okay with giving up the last couple trailing digits, derivatives with respect to the argument x are also excellent and are SO much faster than the two calls to `SpecialFunctions.besselk` that give the exact derivative (there is a quick sketch of this comparison after the list below). Here is a general summary of the package:
- Fast direct evaluations, sometimes at the cost of the last few digits: the exported function `adbesselk` actually gives `SpecialFunctions.besselk` if `v isa AbstractFloat`, so by default you will always get the most accurate result. But `BesselK._besselk(v, x)` is significantly faster than `SpecialFunctions.besselk(v, x)` almost everywhere in the domain, and is never slower. So if you're willing to give up a couple digits, you can see some very significant speed gains.
- Accurate first and second derivatives with `ForwardDiff`: this is the whole point of the package, really. The entire implementation of `BesselK._besselk`, our direct implementation, was designed for this to work well. Higher derivatives, though, are not guaranteed to work, so use at your own risk.
- Fast derivatives: if you use adaptive finite difference derivatives like the ones from the beautiful `FiniteDifferences.jl`, you'll get something comparably accurate, but it allocates and also takes 5+ function calls to `SpecialFunctions.besselk`, which means more than an order of magnitude more runtime cost than `ForwardDiff` + `BesselK`, for almost literally no accuracy gain in most cases. All derivatives with `ForwardDiff.jl` + `BesselK.jl` are also non-allocating, so they will work a little bit better with many threads and stuff.
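Here is the sketch of the argument-derivative comparison promised above (just an illustration with arbitrary point values; I go through `BesselK._besselk` directly so that we are definitely differentiating the package's own implementation rather than the `SpecialFunctions.besselk` fallback):

```julia
using BesselK, ForwardDiff, SpecialFunctions

(v, x) = (1.1, 2.3)

# AD derivative in the argument x, through the package's direct implementation:
dx_ad = ForwardDiff.derivative(_x -> BesselK._besselk(v, _x), x)

# the exact derivative via the standard recurrence, which costs two besselk calls:
dx_exact = -(besselk(v - 1, x) + besselk(v + 1, x)) / 2

# and a second derivative in the order, via nested ForwardDiff calls:
d2v = ForwardDiff.derivative(_v -> ForwardDiff.derivative(w -> adbesselk(w, x), _v), v)

@show dx_ad dx_exact d2v
```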
As a closer, here is an example implementation of the Matérn covariance function that is now fully AD-compatible in all 3 arguments:
```julia
using LinearAlgebra, SpecialFunctions, BesselK  # norm, gamma, and adbesselkxv

function matern(x, y, params)
  (sg, rho, nu) = params
  dist = norm(x - y)
  iszero(dist) && return sg*sg
  arg = sqrt(2*nu)*dist/rho
  (sg*sg*(2^(1 - nu))/gamma(nu))*adbesselkxv(nu, arg)
end
```
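And a quick check (arbitrary points and parameter values, just to show the shape of the call) that you can push a gradient through all three covariance parameters at once:

```julia
using ForwardDiff

pt1, pt2 = [0.0, 0.0], [0.4, 0.3]
ForwardDiff.gradient(p -> matern(pt1, pt2, p), [1.0, 0.5, 1.25])  # ∂/∂(sg, rho, nu)
```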
For plenty of people reading this who don't particularly care about implementation details, the real point is that you can add this to your dependency tree, forget about it, and start fitting smoothness parameters in a way that is composable with whatever fancy method you use to fit Gaussian processes.
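For instance, here is roughly what that looks like (a minimal sketch with made-up data, a small jitter on the diagonal, and a plain Cholesky-based Gaussian negative log-likelihood; it is only meant to show that the smoothness parameter \nu participates in the gradient like everything else):

```julia
using LinearAlgebra, ForwardDiff

pts = [rand(2) for _ in 1:50]   # made-up locations and observations
obs = randn(50)

function nll(params)
  K = [matern(pts[i], pts[j], params) for i in 1:length(pts), j in 1:length(pts)] + 1e-8*I
  C = cholesky(Symmetric(K))
  0.5*(logdet(C) + dot(obs, C \ obs))
end

ForwardDiff.gradient(nll, [1.0, 0.5, 1.25])  # gradient in (sg, rho, nu), smoothness included
```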
If you do use this project for something, though, please cite the paper. This was really a lot of work, and the reason it hasn't been done already is that it is a huge pain. I know it seems silly to cite a paper that exported one little function that is sort of an afterthought in your dependency tree compared to whatever fancy Bayesian package you use to fit stuff or whatever, but it really took some R&D and grinding to actually put it together and make it so trivial to drop in and forget about.