# How to get second order gradient of a neural network?

Hi,

I am trying to plot the second-order derivative of a single-input, single-output neural network. The first-order derivative works fine, but the second-order derivative computed with the Tracker package comes out as 0 everywhere, even though it should not be. Could this be a vanishing gradient issue? To reproduce, you can quickly train a similar network with this code:

``````
using Flux, Tracker, CuArrays

net = Chain(Dense(1, 256, relu),
            Dense(256, 256, relu),
            Dense(256, 1, relu)) |> gpu

S = 70
s = 15
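# Target function: S - x up to the threshold s, 0 afterwards.
function opt(x)
    if x <= s
        return S - x
    else
        return zero(x)  # same type as the other branch
    end
end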

function loss(x, y)
    ŷ = sum(net(x), dims = 1)
    Flux.mse(ŷ, y)
end

x = hcat(collect(-50f0:0.1f0:100f0)...)
y = opt.(x)
x = cu(x)
y = hcat(y...) |> gpu
op = ADAM()  # assumption: the optimiser was not defined in the original post
Flux.train!(loss, Flux.params(net), Iterators.repeated((x, y), 1000), op)
``````

Plot the estimated function and its derivative:

``````
act(x) = sum(net(x))
act(cu([15f0]))  # spot check; the evaluation point x = 15 is assumed
pact(x) = act(cu([x])).data
plot(pact,0,20)

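dact(x) = gradient(act, x; nest = true)[1]  # gradient returns a tuple; take the entry for x
dact(cu([15f0]))  # spot check; the evaluation point x = 15 is assumed
pdact(x) = sum(dact(cu([x])).data)  # reduce the 1-element array to a scalar for plotting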
plot(pdact, 0, 20)
``````

As you can see on the chart, the derivative is mostly flat except for a sharp negative peak around 15. The second-order derivative should therefore be large in magnitude in this area, but its plot is flat at 0:

``````
# dact returns a 1-element array, so sum over it to get a scalar.
d2act(x) = gradient(x -> sum(dact(x)), x; nest = true)[1]
d2act(cu([15f0]))  # should be a large value (evaluation point x = 15 is assumed)
pd2act(x) = sum(d2act(cu([x])).data)
plot(pd2act, 0, 20)
``````

If you replace the network with a small, untrained one that ends in a sigmoid, say `Chain(Dense(1,20,relu), Dense(20,1,sigmoid))`, then the second-order derivative is non-zero and plots as expected.


Never mind. A neural network with relu activation functions is a piecewise-linear approximation (with up to an exponential number of pieces), so it makes sense that the second-order derivative is 0 everywhere except at the kinks between pieces, where it is undefined.
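
For anyone who wants to check this without Flux or a GPU, here is a minimal sketch in plain Julia. The tiny network, its hand-picked weights, and the helper names (`relu_net`, `smooth_net`, `second_diff`) are all made up for illustration and are not the trained network from above. It estimates the second derivative with a central finite difference, once with relu and once with a sigmoid in the hidden layer:

```julia
# Hypothetical 1-input, 1-output network with hand-picked weights
# (an illustration only, not the trained network from the post).
relu(z) = max(z, zero(z))
sigmoid(z) = 1 / (1 + exp(-z))

W1 = [1.0, -2.0, 0.5];  b1 = [0.0, 1.0, -3.0]   # hidden layer, 3 units
W2 = [2.0, -1.0, 4.0];  b2 = 0.5                 # output layer

relu_net(x)   = sum(W2 .* relu.(W1 .* x .+ b1)) + b2
smooth_net(x) = sum(W2 .* sigmoid.(W1 .* x .+ b1)) + b2   # same weights, smooth activation

# Central finite-difference estimate of the second derivative.
second_diff(f, x; h = 1e-3) = (f(x + h) - 2f(x) + f(x - h)) / h^2

# The kinks of relu_net sit where W1[i]*x + b1[i] == 0, i.e. at x = 0, 0.5 and 6.
# The sample points below stay away from them.
for x in (-1.0, 0.25, 2.0, 10.0)
    println("x = ", x,
            "  relu: ", second_diff(relu_net, x),
            "  sigmoid: ", second_diff(smooth_net, x))
end
```

Away from the kinks the relu version should print values that are zero up to floating-point noise, while the sigmoid version gives clearly non-zero curvature. All of the curvature of a relu network is concentrated at the kinks themselves, which automatic differentiation does not report.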
