Reusing the exact same layer and parameters

I want to reuse the exact same layer in a network, but I can’t figure out whether my naive approach will do that. My toy architecture is:

using Flux
D1 = Dense(2,2)
D2 = Dense(2,2)
NaiveReuse = Chain(D1, Parallel(vcat, Chain(D2, Parallel(vcat, D1, identity)), identity))

The output of params(NaiveReuse) is

Params([Float32[-1.1226765 0.9502689; 0.6875402 0.4517343], Float32[0.0, 0.0], Float32[-0.18272986 -0.16167739; 0.46781456 1.2025808], Float32[0.0, 0.0]])

but I’m having trouble interpreting that. It looks like only two matrices are being stored. Am I correct to assume that the parameters for D1 will be reused and properly updated by Zygote during training? If not, how would I go about that?

The Dense and other layer constructors use Random.GLOBAL_RNG for initialisation by default, so your naive approach won’t work. There is a user-facing way to specify the RNG. It may be in the docs somewhere, but see the pull request “Expose RNG in initializers” by findmyway (#1292 in FluxML/Flux.jl on GitHub).

Or, you can just do D2 = deepcopy(D1), unless you want to avoid deep copies for some memory/performance reason.
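To make the deepcopy option concrete, here is a minimal sketch of what it gives you: a second layer that starts with the same values but has its own storage, so the two layers are not coupled afterwards. It assumes `Flux.params` returns the weight matrix first (as in the output above) and keeps the `Dense(2, 2)` constructor form from the question; newer Flux releases prefer `Dense(2 => 2)`. The names `w1`, `w2` and the three flags are just for illustration.

```julia
using Flux

D1 = Dense(2, 2)
D2 = deepcopy(D1)              # same initial values, separate arrays

w1 = first(Flux.params(D1))    # weight matrix of D1
w2 = first(Flux.params(D2))    # weight matrix of D2

same_values  = (w1 == w2)      # elementwise equal right after the copy
same_storage = (w1 === w2)     # but not the same array object

w2 .+= 1f0                     # mutate the copy in place
still_equal = (w1 == w2)       # D1 is unaffected, so the values now differ
```

So deepcopy is the right tool if you want identical starting points that then train independently, and the wrong one if you want the weights tied.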

Or perhaps I misunderstood. Do you want the weights to be coupled?


Yes, I want the weights to be coupled, as in $\sigma(D1 + \sigma(D1 + D2))$ instead of $\sigma(D1 + \sigma(D1^\prime + D2))$ if D1 and D2 were variables, i.e., I wouldn’t want the former to be implicitly changed to the latter.
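
One way to check the coupling on the toy model is to walk down to the inner occurrence of D1 and compare its weight array with the outer one by identity. This is only a sketch: it assumes `Chain` and `Parallel` both expose their sub-layers through a `layers` field and that `Flux.params` is available in your Flux version; `inner`, `shared`, `n_arrays` and `y` are names made up for the example.

```julia
using Flux

D1 = Dense(2, 2)
D2 = Dense(2, 2)
NaiveReuse = Chain(D1, Parallel(vcat, Chain(D2, Parallel(vcat, D1, identity)), identity))

# Outer Parallel -> its first branch (a Chain) -> inner Parallel -> first branch (D1 again)
inner = NaiveReuse.layers[2].layers[1].layers[2].layers[1]

# `===` on arrays tests object identity, i.e. whether both uses share storage
shared = first(Flux.params(inner)) === first(Flux.params(D1))

# params lists each distinct array once: weight and bias for D1 and for D2
n_arrays = length(Flux.params(NaiveReuse))   # 4, matching the output in the question

# Forward pass: 2 (identity branch) + 2 (inner D1) + 2 (inner identity) = 6 outputs
y = NaiveReuse(rand(Float32, 2))
```

If `shared` is true, both positions in the network point at the same arrays, so an update to D1 shows up in both places, and gradient contributions from the two uses should land on that one array rather than on a hidden copy.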