Is there any good way to check gradient calculated by Zygote.jl

Richard-Li · April 22, 2021, 7:08am

Hi there,
I’d like to know whether there is a good way to check the gradients that Zygote.jl computes for custom matrix functions.

For example:

function my_custom_matrix_func(m)
    return sum(m * m')
end

I know about FiniteDifferences.jl, but I am wondering whether it can apply finite differences to a custom matrix function and return the gradient.

Thanks for any reply.

e3c6 · April 22, 2021, 9:14am

It would be nice to have some functions which took care of everything automatically. … Something with the convenience of gradient/params but for FiniteDiff and which does the check automatically.

If this exists, I would like to know.

marius311 · April 22, 2021, 6:14pm

Maybe I’m missing something but doesn’t FiniteDifferences already do this? E.g.:

julia> using FiniteDifferences

julia> grad(central_fdm(3,1), my_custom_matrix_func, [1. 2; 3 4])[1]
2×2 Matrix{Float64}:
 8.0  12.0
 8.0  12.0
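These values can also be confirmed by hand, which gives an exact reference that does not depend on any step size. A short derivation, writing $\mathbf{1}$ for the all-ones vector (notation introduced here for illustration, not taken from the posts above):

$$
f(M) = \sum_{i,j} (M M^\top)_{ij} = \mathbf{1}^\top M M^\top \mathbf{1} = \lVert M^\top \mathbf{1} \rVert^2,
\qquad
\nabla f(M) = 2\,\mathbf{1}\mathbf{1}^\top M .
$$

Every row of the gradient is therefore twice the vector of column sums of $M$. For the matrix [1 2; 3 4] the column sums are 4 and 6, so each row is (8, 12), which agrees with the finite-difference output.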

burmecia · August 15, 2021, 11:59pm

Not sure if there are any official ways to check a Zygote gradient, but I usually check it like this:

gs = gradient(...)  # whatever gradient you get

# check parameter `p` and corresponding gradient `g` in gradient
for (p, g) in pairs(gs)
  @info(p)  # print out parameter
  @info(g)  # print out corresponding gradient
end

e3c6 · August 21, 2021, 12:43pm

That’s not checking the numerical values.

e3c6 · August 21, 2021, 12:43pm

It would be nice to have something that automatically works with the params interface.
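One way such a check could look is sketched below. It is only a sketch under stated assumptions: `check_param_grads` is a hypothetical helper (it is not part of Zygote or FiniteDifferences), the loss is a zero-argument closure over mutable arrays, and every parameter actually contributes to the loss (otherwise `gs[p]` would be `nothing`). Each parameter is overwritten in place with the perturbed values supplied by FiniteDifferences and restored afterwards.

```julia
using Zygote, FiniteDifferences

# Hypothetical helper: compares the Zygote gradient of a zero-argument `loss`
# with a finite-difference estimate, one parameter array at a time.
# Assumes every array in `ps` is used by `loss`.
function check_param_grads(loss, ps; fdm = central_fdm(5, 1), rtol = 1e-6)
    gs = Zygote.gradient(loss, ps)
    results = Bool[]
    for p in ps
        saved = copy(p)
        # Overwrite `p` in place with the perturbed values, then evaluate the loss.
        perturbed_loss = function (v)
            copyto!(p, v)
            return loss()
        end
        fd = grad(fdm, perturbed_loss, saved)[1]
        copyto!(p, saved)  # restore the original parameter values
        push!(results, isapprox(gs[p], fd; rtol = rtol))
    end
    return results  # one Bool per parameter, in the order of `ps`
end

# Usage with a small linear model (names chosen for illustration only)
W, b, x = randn(2, 3), randn(2), randn(3)
loss() = sum(abs2, W * x .+ b)
checks = check_param_grads(loss, Zygote.Params([W, b]))
```

The helper returns one flag per parameter rather than a single Boolean so that a failing parameter is easy to locate.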

burmecia · August 23, 2021, 11:21pm

Alright, maybe we’re talking about different things. I thought you were trying to check gradients by eye. I also just found another useful utility, @showgrad, which can help debug gradients.
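A minimal sketch of how `@showgrad` can be used, assuming a recent Zygote version: the macro leaves the forward value unchanged and prints the gradient that flows into the wrapped expression during the backward pass. The function and inputs here are made up for illustration.

```julia
using Zygote

# `@showgrad(a)` returns `a` unchanged in the forward pass and prints the
# gradient with respect to `a` when the backward pass runs.
g = gradient(2.0, 3.0) do a, b
    Zygote.@showgrad(a) * b
end
```

This only displays intermediate gradients; it does not verify that they are numerically correct.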

curtd · August 26, 2021, 4:34pm

If you just want to confirm that your computed gradients are correct, IMO you should always perform the gradient test. Given a function $f : \mathbb{R}^n \to \mathbb{R}$, a random point $x \in \mathbb{R}^n$, the (computed) gradient $\nabla f(x) \in \mathbb{R}^n$, and a random direction $\Delta \in \mathbb{R}^n$, the following should hold as $h \to 0$:

$$
\begin{aligned}
|f(x+h\Delta) - f(x)| &= O(h) \\
|f(x+h\Delta) - f(x) - h \langle \nabla f(x), \Delta \rangle| &= O(h^2)
\end{aligned}
$$

In Julia code, for your matrix case, this would be something along the lines of

using LinearAlgebra, Statistics  # dot, median
using Zygote

n = 1000
x = randn(n, n)
delta = randn(n, n)
f = my_custom_matrix_func
f0 = f(x)
gradf = Zygote.gradient(f, x)[1]  # computed from Zygote, or wherever
df = dot(gradf, delta)  # directional derivative along delta
h = 10 .^ (-6.0:0.0)
err_zeroth_order = zeros(length(h))
err_first_order = zeros(length(h))
for (i, hi) in enumerate(h)
    f1 = f(x + hi * delta)
    err_zeroth_order[i] = abs(f1 - f0)
    err_first_order[i] = abs(f1 - f0 - hi * df)
end
# h grows by a factor of 10 per step, so diff(log10.(err)) is the observed order
h0 = median(diff(log10.(err_zeroth_order))) # Should be ~1
h1 = median(diff(log10.(err_first_order))) # Should be ~2 if your gradient is computed correctly

The O(h^2) behaviour won’t exactly hold for very small h as numerical imprecision errors dominate the convergence error. Lots of edge case considerations for this one but I hope this gets the basic idea across. It’s a good test to add to a test suite!
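For a test suite, the same idea can be wrapped in a small function. The sketch below uses only the standard library; `taylor_order` is a hypothetical helper name, and the reference gradient `2 .* (ones(5, 5) * x)` is a hand-derived gradient of `sum(m * m')` used here in place of an AD result. A deliberately wrong gradient is included to show how the observed order drops.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper: estimates the order p in
# |f(x + h*delta) - f(x) - h*dot(g, delta)| = O(h^p) along a random direction.
function taylor_order(f, g, x; hs = 10.0 .^ (-5:-1), rng = Random.default_rng())
    delta = randn(rng, size(x)...)
    f0 = f(x)
    df = dot(g, delta)
    errs = [abs(f(x + h * delta) - f0 - h * df) for h in hs]
    # Slope of log(error) against log(h) between consecutive step sizes.
    return median(diff(log10.(errs)) ./ diff(log10.(hs)))
end

f(m) = sum(m * m')
rng = MersenneTwister(0)
x = randn(rng, 5, 5)
g_right = 2 .* (ones(5, 5) * x)  # hand-derived gradient of sum(m * m')
g_wrong = ones(5, 5) * x         # deliberately missing the factor 2

order_right = taylor_order(f, g_right, x; rng = rng)  # expected to be close to 2
order_wrong = taylor_order(f, g_wrong, x; rng = rng)  # expected to be close to 1
```

In a test file, a check such as `@test order_right ≈ 2 atol = 0.2` would then fail whenever the supplied gradient is wrong. The step sizes and tolerance are illustrative and may need tuning for other functions.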

Tamas_Papp · August 30, 2021, 8:58am

In 99% of cases you don’t want to implement your own FD code for testing, but use something robust like

https://github.com/JuliaDiff/FiniteDifferences.jl

with higher order algorithms and stepsize adaptation.
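As a minimal sketch of that approach, the Zygote gradient of the function from the original question can be compared directly against a finite-difference estimate. The 5-point central method and the tolerance are illustrative choices, not recommendations from the posts above.

```julia
using Zygote, FiniteDifferences, Test

my_custom_matrix_func(m) = sum(m * m')

x = randn(4, 4)
g_ad = Zygote.gradient(my_custom_matrix_func, x)[1]          # reverse-mode AD
g_fd = grad(central_fdm(5, 1), my_custom_matrix_func, x)[1]  # 5-point central FD

@test g_ad ≈ g_fd rtol = 1e-8
```

Testing at a random point rather than a hand-picked one reduces the chance that a bug is hidden by special structure in the input.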
