I’m working through the Flux model zoo examples to learn the library.
One of these is FizzBuzz:
GitHub source here
The model is defined as
m = Chain(Dense(3, 10), Dense(10, 4), NNlib.softmax)
To me, this looks like 2 Dense layers (i.e. nodes with trainable weights) plus one softmax layer (i.e. a layer with no trainable parameters).
However, when I call
params(m) I get 4 sets of weights:
- a 10x3 (which I expected)
- a 10x1 (which I did not expect)
- a 4x10 (which I expected)
- a 4x1 (which I did not expect)
Can anyone please explain what the interpretation of the nx1 arrays is? Why are they there?
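One possible reading, as a minimal sketch: assuming each Dense layer computes W*x .+ b (which is how the Flux documentation describes Dense), the nx1 arrays would be the bias vectors b, with one entry per output unit. The code below is plain Julia, not Flux itself; ToyDense, toy_softmax, and toy_params are hypothetical names made up for illustration only.

```julia
# Hypothetical stand-in for a dense layer; this is NOT Flux's actual type.
struct ToyDense
    W::Matrix{Float64}   # weight matrix, size (nout, nin)
    b::Vector{Float64}   # bias vector, length nout
end

# Build a layer mapping `nin` inputs to `nout` outputs.
ToyDense(nin::Integer, nout::Integer) = ToyDense(randn(nout, nin), zeros(nout))

# Forward pass: affine map with no activation function.
(d::ToyDense)(x) = d.W * x .+ d.b

# Numerically stable softmax; it holds no trainable arrays.
toy_softmax(z) = (e = exp.(z .- maximum(z)); e ./ sum(e))

# Mirror of Chain(Dense(3, 10), Dense(10, 4), softmax) under the assumption above.
layers = [ToyDense(3, 10), ToyDense(10, 4)]

# Collect every trainable array in layer order: W, b, W, b.
toy_params = [p for d in layers for p in (d.W, d.b)]
param_sizes = map(size, toy_params)
println(param_sizes)

# Run one random 3-element input through both layers and the softmax.
y = toy_softmax(layers[2](layers[1](rand(3))))
println(sum(y))
```

Under these assumptions the collected arrays have sizes (10, 3), (10,), (4, 10), and (4,), which lines up with the four sets of weights listed above, and the softmax step contributes none.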