Do any of the auto differentiation packages support specifying known derivatives by hand?

Some of the functions I’m trying to do AD over involve integrals, and rather than let the AD fight through my quadrature functions, I want to just use the fundamental theorem of calculus to tell the AD that, essentially,

d/dx quadgk(f, xmin, x) = f(x)

as this shows up in several places throughout the derivation. Is this type of thing supported in any of the AD packages?

Sure, most, if not all, reverse-mode AD tools allow this. In Flux it's done with the @grad macro, if I'm not mistaken.
http://fluxml.ai/Flux.jl/stable/internals/tracker.html#Custom-Gradients-1

Wow, that was awesomely easy. I think I did it right, although this is my first time using Flux:

using QuadGK
using Flux
using Flux.Tracker: TrackedReal, @grad, data, track, gradient

# untracked endpoints: just do the quadrature
myquad(f, a,              b)              = quadgk(f, a, b)[1]
# tracked endpoints: hand off to Tracker so the custom gradient below is used
myquad(f, a::TrackedReal, b::TrackedReal) = track(myquad, f, a, b)
myquad(f, a             , b::TrackedReal) = track(myquad, f, a, b)
myquad(f, a::TrackedReal, b             ) = track(myquad, f, a, b)
# adjoint from the fundamental theorem of calculus: ∂/∂a ∫_a^b f = -f(a), ∂/∂b ∫_a^b f = f(b)
@grad myquad(f, a, b) = myquad(f, data(a), data(b)), Δ -> (nothing, -Δ*f(a), Δ*f(b))

f(x) = 2*myquad(x->x^2,x,0)
f′(x) = gradient(f, x)[1]
f′′(x) = gradient(f′, x)[1]

f(1) # -0.6666666666666666
f′(1.) # -2.0 (tracked)
f′′(1.) # -4.0 (tracked)

I wonder whether type stability is possible, but I should probably just go read the Flux docs…

I am curious whether the other tools have something like this; I was not able to find it in ForwardDiff.jl/ReverseDiff.jl.

I started something like this in https://github.com/JuliaDiff/ForwardDiff.jl/pull/165 but it got a bit stale.

This should be dead easy with ForwardDiff too. You essentially need to do this:

using DualNumbers

f(x) = x^3
df(x) = 3x^2   # hand-written derivative

# chain rule: value from the real part, known derivative scaled by the incoming dual part
f(x::Dual) = dual(f(realpart(x)), df(realpart(x)) * dualpart(x))

f(dual(1, 1))  # value 1, derivative 3

but IIRC it's a little more tricky, as ForwardDiff allows propagating multiple ɛs at once.
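
For concreteness, here's a minimal sketch of how that might look with ForwardDiff itself, overloading ForwardDiff.Dual directly so that all ɛs (partials) are propagated at once (mycube/dmycube are made-up names, and this skips the tag/nested-dual subtleties):

using ForwardDiff
using ForwardDiff: Dual, value, partials

mycube(x::Real)  = x^3    # primal function
dmycube(x::Real) = 3x^2   # its hand-written derivative

# chain rule: evaluate at the primal value and scale *all* carried partials at once
mycube(d::Dual{T}) where {T} = Dual{T}(mycube(value(d)), dmycube(value(d)) * partials(d))

ForwardDiff.derivative(mycube, 2.0)  # 12.0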

With AutoGrad.jl it's a macro, a bit like Flux's @grad:

@primitive f(x),dy,y  dy .* fgrad(value(x))
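
To make that snippet self-contained, something like the following should work (my own sketch with placeholder names f/fgrad, untested against the current AutoGrad.jl):

using AutoGrad

f(x) = x^3        # plain method used for the forward value
fgrad(x) = 3x^2   # hand-written derivative

# register f as a primitive with the given gradient expression
@primitive f(x),dy,y  dy .* fgrad(value(x))

gradf = grad(f)   # grad returns the gradient w.r.t. the first argument
gradf(2.0)        # 12.0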

It is indeed quite easy for derivatives, but for Jacobians and gradients, you need to do a bit more work (to propagate the partials correctly).
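
To illustrate what "propagating the partials correctly" means for a gradient, here is a sketch extending the one-argument ForwardDiff overload above to a function of two variables (again my own illustration with made-up names; a complete version would also need the mixed Dual/Real methods):

using ForwardDiff
using ForwardDiff: Dual, value, partials

g(x, y)    = x^2 * y    # primal function
dgdx(x, y) = 2x * y     # hand-written partial derivatives
dgdy(x, y) = x^2

# chain rule: combine the partials carried by *both* arguments
function g(x::Dual{T}, y::Dual{T}) where {T}
    vx, vy = value(x), value(y)
    Dual{T}(g(vx, vy), dgdx(vx, vy) * partials(x) + dgdy(vx, vy) * partials(y))
end

ForwardDiff.gradient(v -> g(v[1], v[2]), [2.0, 3.0])  # [12.0, 4.0]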