I’m attempting to obtain the gradient of a Flux model with respect to its weights.
I first want to show what I mean (or rather, what I want to achieve).
Suppose I have a very simple linear model

f(x) = W x + b.

Now, using this model, I wish to obtain the gradient with respect to the weights,

\frac{\partial f}{\partial W} = x^\top.
Further, in the case of a nonlinear model, given a nonlinear activation function \sigma(x), I have the nonlinear model

f(x) = \sigma(W x + b),

and I wish to obtain the following:

\frac{\partial f}{\partial W} = \sigma'(W x + b) \, x^\top.
I hope I haven’t made a mistake somewhere in my computations. Please do correct me if there is a mistake.
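To convince myself that the formulas above are right, I put together a small finite-difference check (all names and values below are my own, chosen just for illustration):

```julia
# Finite-difference sanity check of ∂f/∂W for f(x) = σ(W x + b), with σ = tanh.
σ(z) = tanh(z)
W = [0.3 -0.5 0.8]      # 1×3 weight matrix
b = [0.1]
x = [1.0, 2.0, -1.0]

f(W) = σ.(W * x .+ b)[1]

# Analytic gradient: ∂f/∂W = σ'(W x + b) xᵀ, with σ'(z) = 1 - tanh(z)^2
z = (W * x .+ b)[1]
analytic = (1 - tanh(z)^2) .* x'

# Central finite differences, one weight at a time
h = 1e-6
numeric = similar(W)
for j in 1:3
    Wp = copy(W); Wp[1, j] += h
    Wm = copy(W); Wm[1, j] -= h
    numeric[1, j] = (f(Wp) - f(Wm)) / (2h)
end

@show maximum(abs.(numeric .- analytic))  # should be very small
```

The two gradients agree to within finite-difference error, so the formula seems right.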
Anyway, both the linear and the nonlinear model can be readily implemented using the `Dense` layer from Flux.
What I currently have is the following MWE:
```julia
using Flux
using Random

Random.seed!(8129)

# Create a very simple model
# Note that this is a *linear* model !!
model = Flux.Dense(3, 1)
baseline = Flux.params(model)
display(baseline)

# Compute the gradient w.r.t. the weights
# We should be able to obtain the same parameters as before!
some_input = rand(3, 3)
some_output = model(some_input)
display(some_output)

# Gradient evaluated at the inputs
grad = Flux.gradient(x -> model(x), some_input)
display(grad)
```
but the output is just an error:

```
ERROR: LoadError: Output should be scalar; gradients are not defined for output Float32[0.28850588 0.8061751 0.3004352]
```
How can I obtain the derivatives I’m looking for?
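For completeness, here is one workaround I have been considering, though I’m not sure it’s the idiomatic way: since the error complains about a non-scalar output, reduce the output to a scalar before differentiating. For a single sample and a one-output model, `sum(model(x))` is just the scalar output itself, so this shouldn’t change the gradient. (This sketch is my own attempt, not something I have confirmed against the docs.)

```julia
using Flux
using Random

Random.seed!(8129)

model = Flux.Dense(3, 1)
x = rand(Float32, 3)

# Reduce the output to a scalar before differentiating, so that the AD
# sees a scalar objective. For a 1-output model and a single sample,
# sum(model(x)) is just the scalar output itself.
grad_x = Flux.gradient(x -> sum(model(x)), x)   # gradient w.r.t. the input
display(grad_x)

# Gradient w.r.t. the weights via the implicit-parameters API
ps = Flux.params(model)
grad_w = Flux.gradient(() -> sum(model(x)), ps)
for p in ps
    display(grad_w[p])
end
```

If my formulas above are correct, the gradient of the weight matrix should come out equal to x^\top, but I would appreciate confirmation that this is the intended way to do it.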