Is it okay for the layers used in a Chain to be Chains themselves, such as Chain(Chain(A, B), C)? None of the examples I have seen so far nest Chain recursively, so I am thinking this may not be kosher. Does this mean that in order to build a sub-structure such as a ResNet block I need to build a custom layer through a struct and @treelike?
Simple chaining like this does not appear to work:
function ResNetBlock(nfilter=64)
    return
    Chain(
        SkipConnection(
            Chain(
                Conv((3,3), nfilter=>nfilter, pad=1),
                BatchNorm(nfilter, relu),
                Conv((3,3), nfilter=>nfilter, pad=1),
                BatchNorm(nfilter)),
            +),
        x -> relu.(x))
end
This was inspired by @jonathan-laurent's AlphaZero, but it does not work for me: running params on the returned chain gives an empty parameter list. I believe he defines his own helper functions to interface with Flux/GPU.
The weird thing is that if I run the code interactively, it works:
nfilter = 2
nn = Chain(
    SkipConnection(
        Chain(
            Conv((3,3), nfilter=>nfilter, pad=1),
            BatchNorm(nfilter, relu),
            Conv((3,3), nfilter=>nfilter, pad=1),
            BatchNorm(nfilter)),
        +),
    x -> relu.(x))
params(nn)
After some experimentation it appears I must not have understood return correctly. After I got rid of the return keyword, the function wrapper also works. I am new to Julia so I don't know if this is a bug or a feature of Julia.
Duh, return on its own line returns nothing (an empty object), and the Chain expression on the following lines is never reached! So it was a newbie syntax error.
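To illustrate the return pitfall described above, here is a minimal sketch in plain Julia (no Flux needed). The function names are made up for the demonstration. A bare return ends the function immediately and yields nothing, so an expression on the next line is never evaluated.

```julia
# Hypothetical helper names, used only for illustration.

function bare_return()
    return          # the statement ends at the line break, so this returns `nothing`
    (1, 2, 3)       # never reached
end

function same_line_return()
    return (1, 2, 3)    # the returned expression starts on the same line as `return`
end

function implicit_return()
    (1, 2, 3)           # the last expression of a function is its return value
end

println(bare_return() === nothing)   # true
println(same_line_return())          # (1, 2, 3)
println(implicit_return())           # (1, 2, 3)
```

So in the ResNetBlock function, either drop return or start the Chain( call on the same line as return.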
Okay, chains of chains do appear to work! Great!
Don’t forget to call Flux.@functor for custom-defined layers.
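As a rough sketch of what that looks like, assuming a Flux version where Flux.@functor is available (newer releases, as far as I know, recommend Flux.@layer for the same purpose), a custom residual layer could be written like this. ResidualBlock is a hypothetical name made up for this example, not something Flux provides.

```julia
using Flux

# Hypothetical custom layer: wraps an inner model and adds a skip connection.
struct ResidualBlock{T}
    inner::T
end

# Registers the struct with Flux so the parameters inside `inner` can be
# collected and moved to the GPU along with the rest of the model.
Flux.@functor ResidualBlock

# Forward pass: inner(x) + x, followed by relu.
(m::ResidualBlock)(x) = relu.(m.inner(x) .+ x)

nfilter = 2
block = ResidualBlock(Chain(
    Conv((3, 3), nfilter => nfilter, pad=1),
    BatchNorm(nfilter, relu),
    Conv((3, 3), nfilter => nfilter, pad=1),
    BatchNorm(nfilter)))

# Width x height x channels x batch input; pad=1 keeps the spatial size.
x = rand(Float32, 8, 8, nfilter, 1)
y = block(x)
println(size(y))
```

For a plain skip connection like this one, the built-in SkipConnection used earlier in the thread already does the job, so a custom struct is only needed when the layer does something the built-in layers do not cover.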