I’m trying to dig through the model zoo examples for Flux to learn.

One of these is FizzBuzz:

GitHub source here

The model is defined as

`m = Chain(Dense(3, 10), Dense(10, 4), NNlib.softmax)`

To me, this looks like 2 Dense layers (i.e. nodes with trainable weights) plus one softmax layer (i.e. a layer with no trainable parameters).

However, when I call `params(m)`, I get 4 sets of weights:

- a 10x3 (which I expected)
- a 10x1 (which I did not expect)
- a 4x10 (which I expected)
- a 4x1 (which I did not expect)
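
For reference, this is roughly how the sizes above can be listed. It is only a minimal sketch, assuming a Flux version in which the `Dense(3, 10)` constructor form and `params` from the model zoo example are still available; newer releases may warn about or change both. I use the `softmax` exported by Flux here, which I assume is the same function as `NNlib.softmax`.

```julia
using Flux

# Same model as in the FizzBuzz example
m = Chain(Dense(3, 10), Dense(10, 4), softmax)

# Print the size of every parameter array that Flux reports for the model
for p in params(m)
    println(size(p))
end
```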

Can anyone please explain what the interpretation of the nx1 arrays is? Why are they there?