(Flux/Lux) Custom Layers as Functions of Other Layers

Hey everyone. For some experiments, I want to set up a network where some of the weights are not trained, but instead are given as functions of other (trained) weights in my network.

For a simple example, take the NN in the image below: I would like to be able to force W3 = 2*W1 and W4 = 3*W2, and then train W1 and W2 normally.

[image: diagram of a small network with weights W1, W2, W3, W4]

Is this possible within the SciML environment? I’m quite new to Julia and SciML as a whole, so I honestly wouldn’t even know how to begin. The Flux page on Custom Layers, opaque as I find it to be, doesn’t seem to consider this possibility.

I think that in this case re-using the same layer objects (each one appearing twice) could work?
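Something like this, I mean (a minimal sketch; the sizes and activations are just placeholders). Flux collects parameters by object identity, so the two occurrences of shared below refer to the very same weight matrix:

using Flux

shared = Dense(10 => 10, relu)                    # one layer object
m = Chain(shared, Dense(10 => 10, tanh), shared)  # the same object appears twice

m(rand32(10))  # runs; gradients for shared's weights accumulate across both uses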


While that could possibly work, it does not seem to generalize to cases where the mapping is not just the identity, which is what actually interests me. I’ll edit the question to better reflect that!

Is your diagram doing something like this?

using Flux

struct Diamond{T}  # store two weight matrices
    W1::T
    W2::T
end

Flux.@functor Diamond  # make sure Flux can see them

function (d::Diamond)(A)  # write out the forward pass
    B = d.W1 * A
    C = d.W2 * A
    D1 = 2 * d.W1 * B  # re-uses W1 scaled by 2, i.e. W3 = 2*W1 is never stored separately
    D2 = 2 * d.W2 * C  # likewise for W2
    D1 + D2  # assume D is the sum of the two branches
end

m = Chain(Dense(10=>10, relu), Diamond(randn32(10,10), randn32(10,10)))

m(rand32(10))  # it runs

It would be fine to have, say, D1 = 2 * (d.W1 .^ 2) * B, or some other function of W1, before using it a second time.
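Continuing from the snippet above, a quick sketch of how you might check that only W1 and W2 carry gradients (the scaled copies are re-derived on every forward pass); the loss here is just an arbitrary placeholder:

x = rand32(10)
grads = Flux.gradient(m) do model
    sum(model(x))  # placeholder loss, just to get a scalar
end
grads[1].layers[2].W1  # gradient w.r.t. W1, accumulated over both of its uses in the forward pass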


I’m not sure what it would look like in Flux, but in Lux you could just pass identical or modified versions of the parameter objects to both layers; see the sketch below.
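Roughly like this, assuming a recent Lux version; the layer sizes, the factor of 2, and the loss are placeholders I made up. Since Lux keeps parameters in an explicit NamedTuple, you can rebuild that tuple inside the loss so that the second layer’s weight is a function of the first layer’s, and gradients then flow back only to the stored parameters:

using Lux, Random, Zygote

rng = Random.default_rng()
model = Chain(Dense(10 => 10, relu), Dense(10 => 10))
ps, st = Lux.setup(rng, model)  # ps is a plain NamedTuple: (layer_1 = ..., layer_2 = ...)
x = rand(Float32, 10)

function loss(ps_free)
    # layer_2's weight is derived from layer_1's on the fly; it is never stored or trained itself
    tied = (layer_1 = ps_free.layer_1,
            layer_2 = (weight = 2 .* ps_free.layer_1.weight, bias = ps_free.layer_2.bias))
    y, _ = model(x, tied, st)
    return sum(abs2, y)  # placeholder loss
end

grads = Zygote.gradient(loss, ps)[1]  # grads.layer_1.weight accumulates both uses (direct and through the 2*W1-style tie)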