ChainRules: Replacing DiffRules in the Julia AD world

Perhaps DistributionalDerivative or GeneralizedDerivative would be clearer than ImpulseTrain. If I saw ImpulseTrain(f, x), I would probably be confused, wondering why there is a function argument and how this object relates to f and x.

Also, one thing I just started to work on is adding support for mutation in AD, and I’d be interested in any thinking you’ve done on handling that. For the most part you know when mutation is happening (e.g. setindex!, mul!), but there are also cases (e.g. getindex) where you want to be generic across array types but also take advantage of mutation where possible.

Of course, this might be a special-enough case that it can just be handled by individual AD frameworks, but it seemed worth raising.
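To make the getindex case concrete, here is a rough sketch of a mutation-using reverse rule, assuming ChainRulesCore's rrule/NoTangent interface. This is just an illustration, not the actual ChainRules implementation (ChainRules ships its own getindex rule, so defining this for real would overwrite it):

using ChainRulesCore

function ChainRulesCore.rrule(::typeof(getindex), A::AbstractArray, i::Int...)
    y = A[i...]
    function getindex_pullback(ȳ)
        Ā = zero(A)           # dense zero cotangent for A
        Ā[i...] += ȳ          # mutation: cheap when setindex! is supported
        return (NoTangent(), Ā, map(_ -> NoTangent(), i)...)
    end
    return y, getindex_pullback
end

For array types that don't support setindex!, the rule would need a non-mutating fallback (e.g. some one-hot or structured tangent type), which is exactly the genericity problem described above.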


I like the name GeneralizedDerivative. This would also imply that F is not necessarily piecewise constant.

One could have a syntax like

@weakrule(abs(x), sign(x))
@weakrule(sign(x), 0)

to signify that abs and sign only have derivatives in a weak sense.

Edit: Or, maybe

@rule(abs(x), @weak(sign(x)))
@rule(sign(x), @weak(0))

since functions might be differentiable for some arguments but not others.
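For illustration, here is a purely hypothetical sketch of what such a macro could record; none of these names (@weakrule, WEAK_RULES) exist in DiffRules or ChainRules, the point is only that the rule carries a "weak" flag alongside the derivative expression:

# Hypothetical registry mapping a function name to its derivative
# expression plus a flag saying it only holds in the weak sense.
const WEAK_RULES = Dict{Symbol,Any}()

macro weakrule(call, deriv)
    fname = call.args[1]    # e.g. :abs from abs(x)
    quote
        WEAK_RULES[$(QuoteNode(fname))] = (deriv = $(QuoteNode(deriv)), weak = true)
    end
end

@weakrule(abs(x), sign(x))
@weakrule(sign(x), 0)

WEAK_RULES[:abs]    # (deriv = :(sign(x)), weak = true)

An AD system could then check the weak flag and decide whether to error, warn, or silently use the generalized derivative at the kink.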

Would it be possible to get some simple examples for calculating complex gradients for functions $f: \mathbb{C}^n \rightarrow \mathbb{C}$ via forward-mode AD? I’m a bit lost at the moment and the above example doesn’t work for me with the most recent version. Any help would be much appreciated!