# Gradient of NN not changing with different inputs

While playing around with some neural networks (NN), I noticed the following:

```
using Flux
using Random
using Zygote
using ForwardDiff

Random.seed!(9120)

n = 1       # output dimension
m = 5       # input dimension
hidden = 10

x, y = rand(m), rand(n) # some data
model = Flux.Chain(Flux.Dense(m, hidden), Flux.Dense(hidden, n))

# ForwardDiff.gradient needs a scalar-valued function, so reduce the
# 1-element output with sum
g = z -> ForwardDiff.gradient(w -> sum(model(w)), z)

# Getting the weights of the model as a flat array
ps, re = Flux.destructure(model)

display(g(x))       # Checking with original data
display(g(ps[1:m])) # Checking with the first m weights as input

gs = Zygote.gradient(w -> sum(model(w)), rand(m)) # Notice a different random vector
display(gs)
gs = Zygote.gradient(w -> sum(model(w)), zeros(m)) # Now with zeros
display(gs)
```

Now, the gradient is always the same, no matter which input I pass:

```
5-element Array{Float64,1}:
-0.7097914769304869
-0.14294694831147323
-0.04312831631528913
0.2866390831645096
-0.4046597463981584
5-element Array{Float32,1}:
-0.7097915
-0.14294694
-0.043128345
0.2866391
-0.40465972
(Float32[-0.7097915, -0.14294693, -0.04312831, 0.28663906, -0.40465975],)
(Float32[-0.7097915, -0.14294693, -0.04312831, 0.28663906, -0.40465975],)
```

Is there a reason why this is the case? I'm inclined to believe the input is
not being evaluated at all.
Does this mean that the gradient is taken with respect to the weights of the model instead?


To get a nonlinear model you need to pass an activation function to `Dense` as the third argument.
Otherwise it defaults to `identity`, and your chain is the affine map `W2*(W1*x .+ b1) .+ b2`, whose Jacobian with respect to the input is the constant matrix `W2*W1`. The input *is* being evaluated; the derivative just doesn't depend on it, so you see the same gradient everywhere.
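A minimal way to see this without Flux or AD at all — plain Julia with hand-rolled random matrices standing in for the `Dense` weights, and central finite differences in place of `ForwardDiff`/`Zygote` (all names here are made up for the sketch):

```
using Random
Random.seed!(1)

m, hidden = 5, 10
W1 = randn(hidden, m); b1 = randn(hidden)
W2 = randn(1, hidden); b2 = randn(1)

linear(x)    = sum(W2 * (W1 * x .+ b1) .+ b2)       # no activation: affine
nonlinear(x) = sum(W2 * tanh.(W1 * x .+ b1) .+ b2)  # with a tanh activation

# Central finite-difference gradient of a scalar function f at x
function fdgrad(f, x; h = 1e-6)
    g = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2h)
    end
    return g
end

x1, x2 = rand(m), rand(m)
# Linear model: the gradient is vec(W2*W1) at every point, so these match.
@show fdgrad(linear, x1) ≈ fdgrad(linear, x2)
# Nonlinear model: the gradient depends on where it is evaluated.
@show fdgrad(nonlinear, x1) ≈ fdgrad(nonlinear, x2)
```

The first comparison comes out `true` and the second (generically) `false`, which is exactly the difference an activation function makes.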