produces an entirely different result than m = Chain(x -> Dense(5, 2)(x), Dense(2, 2), Dense(2, 5))
Are anonymous functions somehow not supported in Chain?
Also, by "different" I do not mean that the randomly initialized values differ. I mean that when you seed the random number generator and train one versus the other, the anonymous-function version not only fails to produce the same output, it also fails to fit the data in any meaningful way, even after more training.
For example, try replacing a Dense layer in any model with a layer written as x -> Dense(…)(x) and see whether your model works as it did before.
params can’t find the parameters inside the Dense layer used in the anonymous function. You need to manually add those parameters to the set of parameters being trained.
How exactly would you do that? Are you saying I need to store params(layer1) and have that as an input to the train! function if I were to use an anonymous function?
Any trainable parameters closed over by an anonymous function must be manually extracted and added to the params that train! receives. If you don’t want to do this, define a callable struct and run @treelike on it; this is how Dense is defined.
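A minimal sketch of both options, assuming a Flux release from the era of this thread, where params and Flux.@treelike are available (later releases replaced @treelike with @functor, so this may not run unchanged on a current install). The names layer1 and Wrapped are hypothetical. Note that the layer is built once, outside the closure: writing x -> Dense(5, 2)(x) constructs a new, freshly initialized layer on every call, so there would be no persistent weights to train at all.

```julia
using Flux

# Option 1: build the layer once and close over it.
layer1 = Dense(5, 2)
m = Chain(x -> layer1(x), Dense(2, 2), Dense(2, 5))

# params(m) alone is expected to see only the two plain Dense layers,
# so the closed-over layer is passed explicitly as a second argument.
ps = params(m, layer1)

# Option 2: a callable struct that Flux knows how to traverse.
# `Wrapped` is a hypothetical name used only for this illustration.
struct Wrapped{L}
    layer::L
end
(w::Wrapped)(x) = w.layer(x)
Flux.@treelike Wrapped

m2 = Chain(Wrapped(Dense(5, 2)), Dense(2, 2), Dense(2, 5))
ps2 = params(m2)   # now includes the wrapped layer's weight and bias
```

With the second option nothing has to be remembered at the call site, which is probably why the built-in layers are written that way.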
One thing: it looks like, in order to concatenate the two, I would need to call collect() to get at the underlying Array, since params() returns a Params object. Once I call collect(), I can vcat() the arrays, but I’m not sure how to turn the result back into Params. Is there a way to do this?
Meaning, the layer that params() does not understand is the anonymous function, but I guess you do not substitute the anonymous function for layer, as in params((m, x -> Dense(5, 2)(x)))?
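On recombining the parameter sets, here is a sketch of one way it could be done without collect() and vcat(), under the same assumption of a Flux version from around the time of this thread, in which params accepts several arguments, Params supports push!, and train! takes the parameters as its second argument. What gets passed is the layer object itself (the hypothetical layer1 below), not the anonymous function that wraps it. The loss, data, and optimiser are placeholders chosen only so the snippet is complete.

```julia
using Flux

layer1 = Dense(5, 2)                       # built once, then closed over
m = Chain(x -> layer1(x), Dense(2, 2), Dense(2, 5))

# Either pass both objects to params in a single call ...
ps = params(m, layer1)

# ... or start from params(m) and push the missing arrays in by hand.
ps_manual = params(m)
for p in params(layer1)
    push!(ps_manual, p)
end

# Placeholder training step: one (input, target) pair and plain gradient descent.
loss(x, y) = Flux.mse(m(x), y)
data = [(rand(Float32, 5), rand(Float32, 5))]
opt = Descent(0.1)
Flux.train!(loss, ps, data, opt)
```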