It would be nice to have some functions that take care of everything automatically. … Something with the convenience of gradient/params, but for FiniteDiff, and which does the check automatically.
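In the meantime, the manual comparison is short enough to write inline. Here is a minimal sketch of what such a check might look like, assuming FiniteDiff.jl's `finite_difference_gradient`; the function `f` and the tolerance are just placeholders:

```julia
using Zygote, FiniteDiff, LinearAlgebra

f(x) = sum(abs2, x)   # stand-in for whatever function you are differentiating
x = randn(10)

g_ad = gradient(f, x)[1]                            # reverse-mode gradient from Zygote
g_fd = FiniteDiff.finite_difference_gradient(f, x)  # numerical gradient for comparison

@assert norm(g_ad - g_fd) / norm(g_fd) < 1e-6       # relative agreement check
```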
Not sure if there are any official ways to check Zygote gradients, but I usually check them like this:
gs = gradient(...)  # whatever gradient you get
# check each parameter `p` and its corresponding gradient `g`
for (p, g) in pairs(gs)
    @info(p)  # print out the parameter
    @info(g)  # print out the corresponding gradient
end
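For a self-contained version of that pattern, here is a sketch using Zygote's implicit `Params` style; the `W`, `b`, and `loss` are purely illustrative:

```julia
using Zygote

W, b = randn(3, 2), randn(3)        # toy parameters, purely illustrative
loss(x) = sum(tanh.(W * x .+ b))

gs = gradient(() -> loss(randn(2)), Params([W, b]))  # a Zygote.Grads object

for (p, g) in pairs(gs)
    @info(p)  # the parameter array itself
    @info(g)  # its gradient
end
```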
Alright, maybe we’re talking about different things; I thought you were trying to check gradients by eye. I also just found another useful utility, `Zygote.@showgrad`, which can help debug gradients.
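For reference, `@showgrad` works much like `@show`, but prints the gradient about to accumulate to its argument during the backward pass. A small sketch:

```julia
using Zygote

# @showgrad prints the gradient flowing into `a` during the pullback
gradient(2, 3) do a, b
    Zygote.@showgrad(a) * b
end
# prints something like: ∂(a) = 3
# returns: (3, 2)
```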
If you’re just wanting to confirm your numerical gradients are performing correctly, IMO you should always be performing the gradient test. If you have your function $f(x) : \mathbb{R}^n \mapsto \mathbb{R}$ evaluated at a random point $x \in \mathbb{R}^n$, given the (computed) gradient $\nabla f(x) \in \mathbb{R}^n$ and a random direction $\Delta \in \mathbb{R}^n$, the following should hold as $h \to 0$:

$$f(x + h\Delta) - f(x) - h\,\langle \nabla f(x), \Delta \rangle = O(h^2),$$

whereas $f(x + h\Delta) - f(x)$ is only $O(h)$.
In Julia code, for your matrix case, this would be something along the lines of
using LinearAlgebra, Statistics, Zygote

n = 1000
x = randn(n, n)
delta = randn(n, n)
f = my_custom_matrix_func
f0 = f(x)
gradf = gradient(f, x)[1]  # computed from Zygote, or wherever
df = dot(gradf, delta)     # directional derivative ⟨∇f(x), Δ⟩
h = 10 .^ (-6.0:0.0)       # step sizes from 1e-6 up to 1
err_zeroth_order = zeros(length(h))
err_first_order = zeros(length(h))
for (i, hi) in enumerate(h)
    f1 = f(x + hi*delta)
    err_zeroth_order[i] = abs(f1 - f0)
    err_first_order[i] = abs(f1 - f0 - hi*df)
end
# Slopes of the errors on a log-log scale (log10 steps in h are 1 apart):
h0 = median(diff(log10.(err_zeroth_order)))  # should be ~1
h1 = median(diff(log10.(err_first_order)))   # should be ~2 if your gradient is computed correctly
The $O(h^2)$ behaviour won’t hold exactly for very small $h$, as round-off error dominates the truncation error. There are lots of edge-case considerations for this one, but I hope this gets the basic idea across. It’s a good test to add to a test suite!
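Along those lines, here is one way the test might be packaged for a test suite. The helper name `gradient_order_test` and the 1.5 threshold are illustrative choices, not an established API:

```julia
using Test, LinearAlgebra, Statistics, Zygote

# Hypothetical helper: returns the empirical convergence order of the
# first-order Taylor error of f at x along a random direction.
function gradient_order_test(f, x; hs = 10 .^ (-6.0:0.0))
    delta = randn(size(x))
    f0 = f(x)
    df = dot(gradient(f, x)[1], delta)
    errs = [abs(f(x .+ h .* delta) - f0 - h * df) for h in hs]
    return median(diff(log10.(errs)))  # log10(h) steps are 1 apart, so this is the slope
end

# Expect a slope near 2 for a correctly computed gradient
@test gradient_order_test(x -> sum(abs2, x), randn(5, 5)) > 1.5
```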