XDiff.jl has its roots in ReverseDiffSource.jl, hence the similar names and semantics. However, there are several important differences:
- ReverseDiffSource can’t handle nested functions. While XDiff isn’t perfect in this respect either, it can handle all scalar and most element-wise and broadcasting functions. For example, only the derivative of `log(x)` (single argument) is defined in the package, but if you call `rdiff` on `log(b, x)` (two arguments), it will extract the code for this method, infer a new differentiation rule for it and add it to the cache. In practice, this means that you don’t need to define your cost function as a single long function, but can easily decompose it into a number of smaller ones as you would in normal code (see the sketch after this list).
- XDiff supports differentiation of vector-valued functions. In the context of ML this is mostly valuable for analysis, since cost functions are normally scalar-valued, but I’ve already encountered a couple of use cases (e.g. in finance) where both the input and the output of a function are vectors (also covered in the sketch below).
- The final step of differentiation is code generation. Currently there are only two formats available, vectorized and Einstein notation (illustrated below), but the final goal is to add pluggable code generators. Imagine that with a single argument you could produce vectorized code, code with fused loops, BLAS-optimized code, or code for the GPU. This is mostly part of the Espresso.jl vision, though.
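To make the first two points concrete, here is a minimal sketch of differentiating a decomposed cost function and a vector-valued function. The `logistic` helper is hypothetical, and the exact keyword-argument form of `rdiff` is an assumption for illustration, not a verbatim description of the XDiff API:

```julia
using XDiff

# a cost function decomposed into smaller helpers, as in normal code;
# `logistic` is a hypothetical helper, not something XDiff provides
logistic(x) = 1.0 / (1.0 + exp(-x))
cost(w, x) = sum(logistic.(w .* x))

# `rdiff` should extract the code of `logistic`, infer a differentiation
# rule for it and cache the result; the keyword arguments are example
# values (this exact calling convention is an assumption)
dexs = rdiff(cost; w=rand(3), x=rand(3))

# vector-valued functions are supported as well: here every pair of
# output and input gets its own derivative expression
f(W, x) = W * x
dexs_vec = rdiff(f; W=rand(2, 3), x=rand(3))
```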
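As for the two currently available formats, here is the same computation written in both notations; this is only a sketch of the notations themselves, not literal generated output:

```julia
W, x, b = rand(2, 3), rand(3), rand(2)

# vectorized notation: whole-array operations
y = W * x .+ b

# the same computation in Einstein (indexed) notation, where a repeated
# index (here `j`) implies summation; Espresso.jl-style expressions are
# written as indexed assignments like:
# y[i] = W[i,j] * x[j] + b[i]
```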