I need to optimize a function that depends on the gradient of another function. When I run the following code:
```julia
using Zygote

data = rand(1, 10)
f(p, x) = [x * p[1]; x * p[2] + p[3]]   # model: scalar input, two outputs
truef(x) = [x + 1; 2 * x]               # the derivative values f's derivative should match

function loss(p)
    dndts = [truef(x) for x in data]
    # derivative of f with respect to x, computed in forward mode
    dfdt(x) = Zygote.forwarddiff(x -> f(p, x), x)
    ℓ = sum(abs2, reduce(hcat, [dfdt(data[i]) .- dndts[i] for i in 1:10]))
    return ℓ
end

println(Zygote.gradient(loss, rand(3)))
```
I get `(nothing,)`.
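My guess is that `Zygote.forwarddiff` treats whatever the closure captures (here `p`) as a constant, so nothing flows back to `p`, but I'm not sure that's the actual cause. One variant I've been considering (just a sketch, and it pulls in the `ForwardDiff` package, which my MWE doesn't use) is to compute the inner derivative with `ForwardDiff.derivative` and let Zygote differentiate through the dual numbers:

```julia
using Zygote, ForwardDiff

data = rand(1, 10)
f(p, x) = [x * p[1]; x * p[2] + p[3]]
truef(x) = [x + 1; 2 * x]

function loss(p)
    dndts = [truef(x) for x in data]
    # forward mode for df/dx; Zygote then runs reverse mode over p,
    # differentiating through ForwardDiff's dual-number arithmetic
    dfdt(x) = ForwardDiff.derivative(x -> f(p, x), x)
    return sum(abs2, reduce(hcat, [dfdt(data[i]) .- dndts[i] for i in 1:10]))
end

println(Zygote.gradient(loss, rand(3)))
```

I don't know whether this reverse-over-forward composition is the recommended pattern, or whether it reliably avoids the `nothing`.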
If I take the gradient out of the inner function entirely and just evaluate `f`, like this:
```julia
using Zygote

data = rand(1, 10)
f(p, x) = [x * p[1]; x * p[2] + p[3]]
truef(x) = [x + 1; 2 * x]

function loss(p)
    dndts = [truef(x) for x in data]
    dfdt(x) = f(p, x)   # no inner derivative, just evaluate f
    ℓ = sum(abs2, reduce(hcat, [dfdt(data[i]) .- dndts[i] for i in 1:10]))
    return ℓ
end

println(Zygote.gradient(loss, rand(3)))
```
it works correctly. I know second derivatives are complicated for reverse-mode AD, so I was wondering what an appropriate solution for this would be. A few details, if they’re of any use:
- I’m trying to do this where my function `f` is a neural network, using either the Lux or Flux ecosystem (see the sketch after this list).
- My function, just like in this MWE, takes a single input and returns several outputs.