Good question!
There are plenty of libraries out there, which can make it hard to wrap your head around them all. That’s why the page was updated in January 2024 to reflect the current state of affairs. You can find a summary below, restricted to “automatic differentiation” in the most common sense, leaving finite differences and symbolic approaches aside.
Forward mode
Relevant when you have few inputs and many outputs, rather easy to implement, can handle a vast subset of Julia.
The main packages are ForwardDiff.jl (or PolyesterForwardDiff.jl for a multithreading speed boost) and Enzyme.jl.
Diffractor.jl is still experimental, and I would say not yet suited for general use.
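To make this concrete, here is a minimal forward-mode sketch with ForwardDiff.jl (the function and input values are toy examples I made up):

```julia
using ForwardDiff

# few inputs, several outputs: the sweet spot for forward mode
f(x) = [sin(x[1]) + x[2]^2, x[1] * x[2]]

x = [1.0, 2.0]
ForwardDiff.jacobian(f, x)                   # 2×2 Jacobian, computed with dual numbers
ForwardDiff.derivative(sin, 1.0)             # scalar-to-scalar derivative
ForwardDiff.gradient(v -> sum(abs2, v), x)   # gradient of a scalar-valued function
```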
Reverse mode
Relevant when you have few outputs and many inputs (typically in optimization), much harder to implement, can handle a narrower subset of Julia.
The main packages are Zygote.jl and Enzyme.jl (see the sketch after this list):
- Deep learning (e.g. Flux.jl, Lux.jl) tends to use Zygote.jl for its good support of vectorized code and BLAS. Restrictions: no mutation allowed, scalar indexing is slow.
- Scientific machine learning (e.g. SciML) tends to use Enzyme.jl for its good support of mutation and scalar indexing. Restrictions: your code had better be type-stable, and the entry cost is slightly higher (but the devs are extremely helpful, shoutout to @wsmoses).
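To give you a feel for both, here is a minimal reverse-mode sketch (toy function of my own choosing; Enzyme’s API in particular has evolved across versions, so treat this as a rough guide rather than gospel):

```julia
using Zygote, Enzyme

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]

# Zygote: non-mutating, returns one gradient per function argument
grad, = Zygote.gradient(f, x)

# Enzyme: pass a shadow array which gets filled with the gradient
dx = zero(x)
Enzyme.autodiff(Reverse, f, Active, Duplicated(x, dx))
# dx now holds the gradient of f at x
```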
So how do you choose?
Picking the right tool for the job is a tricky endeavor.
Inspired by a past unification attempt (AbstractDifferentiation.jl), @hill and I have been working hard on DifferentiationInterface.jl, which provides a common syntax for every possible AD backend (all 13 of them).
It is still in active development (expect registration next week), but it already has most of what you need to make an informed comparison between backends, notably thanks to the DifferentiationInterfaceTest.jl subpackage.
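To give you an idea of what it looks like, here is a rough sketch of the syntax (the package is still moving fast, so details may change by the time you read this):

```julia
using DifferentiationInterface
import ForwardDiff, Zygote  # load whichever backends you want to compare

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]

# same operator, different backend objects: switching AD systems is a one-word change
gradient(f, AutoForwardDiff(), x)
gradient(f, AutoZygote(), x)
```

The backend objects (`AutoForwardDiff()`, `AutoZygote()`, ...) come from ADTypes.jl.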
We’re eager for beta testers!