Zygote's `gradient` seems to return an odd result for a constant function. That is,

f1 = (x) -> 0.0

gradient(f1,0) = (nothing,)

but

f2 = (x) -> x^2

gradient(f2, 0) = (0,)

Why does the first one return `nothing` instead of 0, which is what I expected?

Zygote uses `nothing` as a kind of strong zero, since the derivative of `f1` is always zero, while that of `f2` depends on `x` and is only zero for this particular value.

The reason to make this distinction is that if you give it `f1(g(x,y))`, it knows never to bother working out the gradient of `g` at all, as it will never be needed. But for `f2(g(x,y))`, the gradient function which gets compiled must allow for `g(x,y)` being nonzero.
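
If you want an ordinary numeric zero rather than `nothing`, one option is Base's `something`, which returns its first argument that is not `nothing`. Below is a minimal sketch, assuming Zygote is installed; `grad_or_zero` is a hypothetical helper name chosen for illustration, not part of Zygote:

```julia
using Zygote

f1 = x -> 0.0   # constant: the output never depends on x
f2 = x -> x^2   # derivative is 2x, which merely happens to be zero at x = 0

# Hypothetical helper (not part of Zygote): replace the "strong zero"
# `nothing` with an ordinary zero of the same type as the input.
grad_or_zero(f, x) = something(gradient(f, x)[1], zero(x))

grad_or_zero(f1, 0.0)  # gradient(f1, 0.0) is (nothing,), so this falls back to zero(0.0)
grad_or_zero(f2, 3.0)  # the gradient is a number here, so it is passed through unchanged
```

This only handles a single scalar argument; for functions of several arguments you would apply the same fallback to each element of the returned tuple.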
