Hey everyone. For some experiments, I want to set up a network where some of the weights are not trained, but instead are given as functions of other (trained) weights in my network.
For a simple example, take the NN in the following image: I would like to be able to force W3 = 2*W1 and W4 = 3*W2, and then train W1 and W2 normally.
Is this possible within the SciML environment? I’m quite new to Julia and SciML as a whole, so I honestly wouldn’t even know how to begin. The Flux page on Custom Layers, opaque as I find it to be, doesn’t seem to consider this possibility.
While that could possibly work, it does not seem to generalize to cases where the mapping is not just the identity, which is what actually interests me. I’ll edit the question to better reflect that!
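(For reference, my reading of the earlier suggestion is plain weight tying: reuse the same parameter array in two layers, which indeed only covers the identity mapping W3 = W1. A minimal sketch, with names of my own choosing:)

using Flux
W1 = randn32(10, 10)                    # one shared array
tied = Chain(Dense(W1, zeros32(10)),    # first use of W1
             Dense(W1, zeros32(10)))    # second use is literally the same array, so W3 == W1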
using Flux

struct Diamond{T}            # store the two trainable matrices
    W1::T
    W2::T
end

Flux.@functor Diamond        # let Flux collect W1 and W2 as trainable parameters

function (d::Diamond)(A)     # write out the forward pass; the tied weights are built on the fly
    B  = d.W1 * A
    C  = d.W2 * A
    D1 = 2 * d.W1 * B        # W3 = 2*W1, never stored as a separate parameter
    D2 = 3 * d.W2 * C        # W4 = 3*W2, matching the question
    D1 + D2                  # assume D is the sum of the two inputs
end

m = Chain(Dense(10 => 10, relu), Diamond(randn32(10, 10), randn32(10, 10)))
m(rand32(10))                # it runs
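Because D1 and B are both built from the same d.W1 field, Zygote differentiates through both uses and adds up the contributions automatically. A quick sanity check (my own sketch, assuming explicit-mode Flux.gradient and the m defined above):

x = rand32(10)
grads = Flux.gradient(model -> sum(model(x)), m)
grads[1].layers[2].W1        # gradient w.r.t. W1, accumulated over both of its uses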
It would be fine to have, say, D1 = 2 * (d.W1 .^ 2) * B, i.e. to apply some other function to W1 before using it a second time.
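For example, a sketch of that variant (only the forward pass changes; the squared reuse of W1 is still differentiated through both appearances):

function (d::Diamond)(A)
    B  = d.W1 * A
    C  = d.W2 * A
    D1 = 2 * (d.W1 .^ 2) * B    # reuse an elementwise function of W1 instead of W1 itself
    D2 = 3 * d.W2 * C
    D1 + D2
end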