Hello everyone.

I’m attempting to obtain the gradient of a Flux model with respect to its weights.

I first want to show what I mean (or want to achieve).

Suppose I have a very simple linear model such that

$$f(x) = W x + b$$

Now, using this model I wish to obtain the following

$$\frac{\partial f}{\partial W} = x^\top$$

and

$$\frac{\partial f}{\partial b} = 1$$

Further, in the case of a nonlinear model, given a nonlinear activation function $\sigma(x)$, I have the nonlinear model

$$g(x) = \sigma(W x + b)$$

and I wish to obtain the following

$$\frac{\partial g}{\partial W} = \sigma'(W x + b)\, x^\top$$

and

$$\frac{\partial g}{\partial b} = \sigma'(W x + b)$$

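As a sanity check on these formulas, here is a small finite-difference sketch in plain Julia (with `tanh` as an arbitrary illustrative choice of $\sigma$, and a scalar output so the derivatives are easy to compare entrywise):

```julia
# Finite-difference sanity check of the gradient formulas above.
# tanh is an arbitrary smooth activation chosen for illustration.
σ(z) = tanh(z)
dσ(z) = 1 - tanh(z)^2            # derivative of tanh

W = randn(1, 3); b = randn(1); x = randn(3)
f(W, b) = σ.(W * x .+ b)[1]      # scalar output of the nonlinear model

# Analytic gradients from the formulas above
z = (W * x .+ b)[1]
dW = dσ(z) .* x'                 # ∂g/∂W = σ′(Wx + b) xᵀ  (1×3)
db = dσ(z)                       # ∂g/∂b = σ′(Wx + b)

# Perturb one weight entry and compare against the analytic value
ϵ = 1e-6
Wp = copy(W); Wp[1, 2] += ϵ
@assert abs((f(Wp, b) - f(W, b)) / ϵ - dW[1, 2]) < 1e-3
```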
I hope I haven’t made a mistake somewhere in my computations. Please do correct me if there is one.

Anyway, both the linear and nonlinear models can be readily implemented using the `Dense` layer from Flux.

What I currently have is the following MWE:

```
using Flux
using Random
Random.seed!(8129)
# Create a very simple model
# Note that this is a *linear* model !!
model = Flux.Dense(3, 1)
baseline = Flux.params(model)
display(baseline)
# Compute the gradient with respect to the weights
# We should be able to obtain the same parameters as before!
some_input = rand(3, 3)
some_output = model(some_input)
display(some_output)
grad = Flux.gradient(x -> model(x), some_input) # Gradient evaluated at the inputs
display(grad)
```

but the output is just an error:

```
ERROR: LoadError: Output should be scalar; gradients are not defined for output Float32[0.28850588 0.8061751 0.3004352]
```
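For what it’s worth, the call does go through if the output is first reduced to a scalar, and `Flux.params` lets me differentiate with respect to the weights rather than the inputs. But this gives the gradient of the *sum* of the outputs, not the per-element derivatives I wrote above (a sketch, not a definitive answer):

```julia
using Flux, Random
Random.seed!(8129)

model = Flux.Dense(3, 1)
some_input = rand(3, 3)

# Scalar-valued wrapper: gradient w.r.t. the *inputs* now works
grad_x = Flux.gradient(x -> sum(model(x)), some_input)

# Gradient w.r.t. the *weights*, via implicit parameters
ps = Flux.params(model)
grad_w = Flux.gradient(() -> sum(model(some_input)), ps)
```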

How can I obtain the derivatives I’m looking for?