Zygote's gradient seems to return an odd result for a constant function. That is,
f1 = (x) -> 0.0
gradient(f1, 0) = (nothing,)
but
f2 = (x) -> x^2
gradient(f2, 0) = (0,)
Why does the first one return "nothing" instead of 0, which is what I expected?
Zygote uses nothing as a kind of strong zero, since the derivative of f1 is always zero, while that of f2 depends on x and is only zero for this particular value.
The reason to make this distinction is that if you give it f1(g(x,y)), it knows never to bother working out the gradient of g at all, as it will never be needed. But for f2(g(x,y)), the gradient function which gets compiled must allow for g(x,y) being nonzero.
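If a numeric zero is needed downstream, one option is to convert the result yourself. The following is only a minimal sketch of the behaviour described above: denothing is a hypothetical helper name chosen for illustration, not something Zygote provides, and Float64 inputs are assumed.

```julia
using Zygote

f1 = x -> 0.0   # constant function: derivative is zero for every x
f2 = x -> x^2   # derivative is 2x: zero only at x = 0

g1 = gradient(f1, 0.0)   # (nothing,)  the "strong zero"
g2 = gradient(f2, 0.0)   # (0.0,)      an ordinary numeric zero

# Hypothetical helper (not part of Zygote): replace a `nothing` gradient
# with a zero of the same type as the input it belongs to.
denothing(dx, x) = dx === nothing ? zero(x) : dx

d1 = denothing(g1[1], 0.0)   # 0.0
d2 = denothing(g2[1], 0.0)   # 0.0, passed through unchanged
```

Passing the input along lets zero(x) pick a matching zero, so the same helper should also work for array arguments.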
Thank you.