Zygote's gradient seems to return an odd result for a constant function. That is,
f1 = (x) -> 0.0
gradient(f1,0) = (nothing,)
f2 = (x) -> x^2
gradient(f2, 0) = (0,)

Why does the first one return “nothing” instead of 0, which is what I expected?

Zygote uses nothing as a kind of strong zero: the derivative of f1 is zero for every x, while that of f2 depends on x and only happens to be zero at this particular value.
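
To make the difference concrete, here is a small sketch (assuming Zygote is installed; `dense_grad` is a made-up helper name, not part of Zygote). It evaluates both gradients at a few points, and then uses Base's `something` to turn a `nothing` gradient into an ordinary zero for cases where you need a number:

```julia
using Zygote

f1 = x -> 0.0
f2 = x -> x^2

# f1 ignores x, so the result should not depend on where we evaluate it.
for x in (0.0, 1.0, -3.0)
    @show gradient(f1, x)
end

# The derivative of f2 is 2x, which is zero only at x = 0.
@show gradient(f2, 0.0)
@show gradient(f2, 1.0)

# Hypothetical helper: replace a `nothing` gradient with a numeric zero.
dense_grad(f, x) = something(gradient(f, x)[1], zero(x))

@show dense_grad(f1, 1.0)
@show dense_grad(f2, 1.0)
```

Whether you want this conversion depends on the caller: keeping `nothing` preserves the information that the output cannot depend on the input at all.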

The reason to make this distinction is that if you give it f1(g(x,y)), it knows never to bother working out the gradient of g at all, as it will never be needed. But for f2(g(x,y)), the gradient function which gets compiled must allow for g(x,y) being nonzero.
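
A rough sketch of the composed case, assuming Zygote is installed. The inner function `g(x, y) = x * y` is only an illustrative choice, not something from the question. With f1 on the outside, both arguments should come back as `nothing`; with f2 on the outside, you get ordinary numbers, even at a point where g(x,y) happens to be zero:

```julia
using Zygote

g(x, y) = x * y          # illustrative inner function (an assumption)
f1 = x -> 0.0
f2 = x -> x^2

# The output of f1 never depends on g, so neither x nor y gets a gradient.
@show gradient((x, y) -> f1(g(x, y)), 1.0, 2.0)

# f2 does depend on g, so the chain rule through g is actually evaluated:
# d/dx (x*y)^2 = 2*x*y^2 and d/dy (x*y)^2 = 2*x^2*y.
@show gradient((x, y) -> f2(g(x, y)), 1.0, 2.0)   # (8.0, 4.0)

# Here g(x, y) == 0, so the result is a numeric zero rather than `nothing`.
@show gradient((x, y) -> f2(g(x, y)), 0.0, 2.0)   # (0.0, 0.0)
```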

