XDiff.jl has its roots in ReverseDiffSource.jl, hence the similar names and semantics. However, there are several important differences:
- ReverseDiffSource can't handle nested functions. While XDiff isn't perfect in this respect either, it can handle all scalar functions and most element-wise and broadcasting functions. For example, only the derivative of `log(x)` (single argument) is defined in the package, but if you call `rdiff` on `log(b, x)` (two arguments), it will extract the code for that method, infer a new differentiation rule for it and add it to the cache (see the first sketch after this list). In practice, this means that you don't need to define your cost function as a single long function; you can easily decompose it into a number of smaller ones, just as you would in normal code.
- XDiff supports differentiation of vector-valued functions. In the context of ML this is mostly valuable for analysis, since cost functions are normally scalar-valued, but I've already encountered a couple of use cases (e.g. in finance) where both the input and the output of a function are vectors (see the second sketch after this list).
- The final step of differentiation is code generation. Currently, only two formats are available - vectorized and Einstein notation - but the final goal is to add pluggable code generators: imagine that with a single argument you could produce vectorized code, code with fused loops, BLAS-optimized code, or code for a GPU (see the third sketch after this list). This, though, is mostly part of the Espresso.jl vision.
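To make the first point concrete, here is a minimal sketch of differentiating a function that calls the two-argument `log`. I'm assuming that `rdiff` accepts a function plus example inputs as keyword arguments; the exact signature may differ, so treat this as an illustration rather than the definitive API:

```julia
using XDiff

# Only log(x) (single argument) has a predefined derivative, but calling
# rdiff on a function that uses the two-argument method log(b, x) makes
# XDiff extract that method's code, infer a rule for it and cache it.
f(b, x) = log(b, x) + log(x)

# Sketch only: rdiff is assumed to take a function plus example inputs
# as keyword arguments; check the package docs for the exact signature.
df = rdiff(f; b=2.0, x=3.0)
```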
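The second point, vector-valued differentiation, might look like this (same assumed calling convention as above; `g` is a made-up example whose derivative is a Jacobian rather than a plain gradient):

```julia
# A vector-valued function: both the input and the output are vectors,
# so the derivative is a Jacobian rather than a gradient.
g(x) = x .* x .+ 2 .* x

# Same assumed calling convention as in the previous sketch.
dg = rdiff(g; x=rand(3))
```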
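Finally, a sketch of what pluggable code generators could look like. The `codegen` keyword and the `VectorCodeGen`/`EinCodeGen` names are assumptions made up for illustration, not a confirmed API; the point is only that the output format is a separate, swappable step:

```julia
# Hypothetical format selection: the codegen keyword and the
# VectorCodeGen / EinCodeGen names are assumptions, not a confirmed API.
df_vec = rdiff(f; codegen=VectorCodeGen(), b=2.0, x=3.0)  # vectorized code
df_ein = rdiff(f; codegen=EinCodeGen(), b=2.0, x=3.0)     # Einstein notation
```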