Mixed-mode automatic differentiation using ForwardDiff and ReverseDiff

Generally, if ReverseDiff (or Zygote) doesn’t handle a portion of a calculation (separated into some function), you should just use ChainRulesCore.jl to define a custom “pullback” (vector–Jacobian product) rule for that function, either with manual differentiation (typically by an adjoint method) or by using some other AD package (though forward-mode AD is not that efficient for pullbacks).
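As a rough sketch of what such a rule can look like: below, `inner` is a hypothetical stand-in for the piece that the reverse-mode package can't handle, and its pullback is built from a Jacobian computed with ForwardDiff. The names `inner` and `loss` are made up for illustration, and Zygote is used as the reverse-mode driver because it picks up ChainRulesCore rules automatically.

```julia
using ChainRulesCore, ForwardDiff, Zygote

# Hypothetical inner function, standing in for the part of the calculation
# that the reverse-mode package cannot differentiate on its own.
inner(x::AbstractVector) = [sum(abs2, x), prod(x)]

# Custom reverse-mode rule: the pullback returns the vector-Jacobian product
# J' * ȳ, where the Jacobian J is computed by forward-mode AD.
function ChainRulesCore.rrule(::typeof(inner), x::AbstractVector)
    y = inner(x)
    J = ForwardDiff.jacobian(inner, x)   # costs roughly one forward pass per input
    inner_pullback(ȳ) = (NoTangent(), J' * unthunk(ȳ))
    return y, inner_pullback
end

# Scalar-valued outer function, differentiated in reverse mode.
loss(x) = sum(inner(x) .^ 2)

x = [1.0, 2.0, 3.0]
g = Zygote.gradient(loss, x)[1]
```

Forming the full Jacobian is only reasonable when `inner` has few inputs, which is the inefficiency mentioned above. For ReverseDiff itself, ChainRules rules are not picked up automatically; as far as I know it provides a `ReverseDiff.@grad_from_chainrules` macro for importing them, but check its documentation for the exact usage.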

In general, reverse-mode differentiation (a.k.a. backpropagation or adjoint methods) is much faster than forward-mode when you are computing gradients (i.e. the derivative of one output with respect to many inputs). See also our matrix-calculus course notes.
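To spell out the cost comparison above (a standard way to summarize it; the symbols $f$, $J$, $v$, $w$, $n$, $m$ are notation introduced here, not from the course notes):

```latex
\begin{aligned}
&f : \mathbb{R}^n \to \mathbb{R}^m, \qquad J = \frac{\partial f}{\partial x} \in \mathbb{R}^{m \times n} \\
&\text{forward mode, one pass per input direction } v \in \mathbb{R}^n: \quad v \mapsto J v \\
&\text{reverse mode, one pass per output direction } w \in \mathbb{R}^m: \quad w \mapsto J^{T} w \\
&m = 1: \quad \nabla f = J^{T} \cdot 1 \ \text{ takes one reverse pass, versus } n \text{ forward passes } (v = e_1, \dots, e_n)
\end{aligned}
```

So for a scalar output the reverse-mode cost is roughly independent of the number of inputs, while the forward-mode cost grows in proportion to it.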
