Newbie: Gradient of a gradient performance in Zygote


I am new to Julia and am trying to implement the code below. Is there a way to improve the performance of taking the gradient of a gradient?

Note: for simplicity, x and y are the same array here; in my real use case they are of course different.


using Zygote

Sigmoid(x) = 1/(1+exp(-x))
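# weighted binary cross-entropy: weight 1.5 on the positive class, 1 on the negative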
WeightedLoss(x, y) = -1.5 * y * log(Sigmoid(x)) - (1 - y) * log(1 - Sigmoid(x))

# Zygote reverse-mode AD: f is the first derivative w.r.t. x,
# g the second derivative (reverse-over-reverse)
f(x, y) = Zygote.gradient(WeightedLoss, x, y)[1]
g(x, y) = Zygote.gradient(f, x, y)[1]

nb = 1000
x = collect(Float64, 1:nb)

@time Grad = f.(x, x)
@time Hess = g.(x, x)

0.088668 seconds (248.25 k allocations: 12.296 MiB)
1.245433 seconds (4.04 M allocations: 169.133 MiB, 5.11% gc time)

Hello and welcome to the community!
You can try the built-in Zygote.hessian function: it uses forward-mode differentiation for the outer derivative, which is often much more performant than reverse-over-reverse AD. A sketch is below.
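For example, closing over y so that the differentiated function takes a single argument (a minimal sketch; the helper name h is my own, and it assumes a recent Zygote version whose hessian accepts a real-number input):

using Zygote

# forward-over-reverse second derivative w.r.t. x, holding y fixed
h(x, y) = Zygote.hessian(t -> WeightedLoss(t, y), x)

@time Hess = h.(x, x)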
Also, try BenchmarkTools.jl for accurate timings, as in the snippet below; @time on a first call includes compilation.
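For instance, interpolating the globals with $ so the benchmark measures the call itself rather than global-variable overhead:

using BenchmarkTools

@btime f.($x, $x);
@btime h.($x, $x);  # h as defined in the sketch above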

Thanks, this worked. For anyone else running into this, I found the links below useful