Thanks for your help! This has been confusing to think about; for now I’ll just create custom structs that mirror the Flux models and use `Optimisers.destructure()` to reconstruct them from hypernet outputs. To do that correctly, the parameter batch `θs = H(z)` generated by the hypernetwork needs to be reshaped so that each module in the primary network gets the slice `θs[module_inds, :]`. Over an arbitrary `Flux.Chain` this would have to be done recursively, maybe using `fmap`? I’ve tried to rework something from `destructure.jl` in FluxML/Optimisers.jl on GitHub, but I think I’m out of my depth here.
Let’s say we have a primary network
```julia
p = Chain(
    Parallel(
        vcat,
        Chain(
            Dense(32, 64),       # mapped to HyDense later
            LayerNorm(64, elu),
        ),
        Chain(
            Dense(32, 64),
            LayerNorm(64, elu),
        ),
    ),
    Dense(128, 64),
    LayerNorm(64, elu),
    Dense(64, 10, bias=false),
)
```
Using destructure:
```julia
julia> θ, re = Flux.destructure(p);

julia> offs = re.offsets
(layers = ((connection = (), layers = ((layers = ((weight = 0, bias = 2048, σ = ()), (λ = (), diag = (scale = 2112, bias = 2176, σ = ()), ϵ = (), size = ((),), affine = ())),), (layers = ((weight = 2240, bias = 4288, σ = ()), (λ = (), diag = (scale = 4352, bias = 4416, σ = ()), ϵ = (), size = ((),), affine = ())),))), (weight = 4480, bias = 12672, σ = ()), (λ = (), diag = (scale = 12736, bias = 12800, σ = ()), ϵ = (), size = ((),), affine = ()), (weight = 12864, bias = (), σ = ())),)
```
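As a sanity check (my arithmetic, not from the docs): each offset is the 0-based start of one flat parameter array, so the total length of `θ` should equal the last offset plus the size of the last array:

```julia
# Hand-counted parameter total for the Chain above:
#   2 branches of (Dense(32,64) + LayerNorm(64)) = 2 * (32*64 + 64 + 2*64)
#   + Dense(128,64) + LayerNorm(64)              = 128*64 + 64 + 2*64
#   + Dense(64,10; bias=false)                   = 64*10
total = 2 * (32*64 + 64 + 2*64) + (128*64 + 64 + 2*64) + 64*10

# This matches the last offset (12864) plus the final 64×10 weight:
total == 12864 + 64*10   # true (both are 13504)
```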
(I think) all that needs to be done now is to slice `θs = H(z)` according to the indices in `offs`; each offset is the 0-based start of that parameter array within the flat vector. E.g. the HyDense that replaces the first `Dense` in the `Parallel` module would get weights `w = θs[1:2048, :]` and bias `b = θs[2049:2112, :]`, etc. How can you gather all the indices in an arbitrarily nested tuple?
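One way to gather them (a sketch; `gather_offsets` is a hypothetical helper, not part of Optimisers.jl): recurse over the nested `Tuple`/`NamedTuple` and keep every integer leaf, since empty tuples mark non-parameter fields like activation functions:

```julia
# Depth-first, left-to-right walk over `re.offsets`, collecting the
# 0-based start position of every flat parameter array.
gather_offsets(x::Integer) = Int[x]
gather_offsets(x::Union{Tuple,NamedTuple}) =
    reduce(vcat, (gather_offsets(v) for v in values(x)); init = Int[])
gather_offsets(_) = Int[]   # any other leaf carries no parameters

offs_list = gather_offsets(offs)
# For the example Chain this should give
# [0, 2048, 2112, 2176, 2240, 4288, 4352, 4416, 4480, 12672, 12736, 12800, 12864]
```

Since the walk is depth-first in the same order as `destructure` flattens, consecutive entries (plus the total length of `θ` at the end) delimit each slice `θs[start+1:stop, :]`.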
Also, I’m not sure how to incorporate activity-normalization layers like `LayerNorm` and `BatchNorm` into a restructured `Chain`, since I don’t think(?) their parameters should be produced by the hypernet `H`. Adding them into the `Chain` post hoc, e.g. `LayerNorm(64, elu) |> gpu`, makes Zygote unhappy.
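One possible workaround, as a sketch only: my understanding (worth verifying against the Optimisers.jl docs) is that `destructure` flattens only what `Optimisers.trainable` exposes, so hiding the normalization layers’ fields would keep them out of `θ` and hence out of the hypernet’s output entirely:

```julia
using Flux, Optimisers

# Caveat: overloading `trainable` for a type you don't own is type piracy;
# a thin wrapper struct around LayerNorm would be cleaner in real code.
# With this override, destructure should skip LayerNorm's affine scale/bias,
# so θ (and the hypernet that produces it) covers only the Dense layers.
Optimisers.trainable(::Flux.LayerNorm) = (;)

θ, re = Flux.destructure(p)   # `p` is the primary Chain from above
m = re(θ)                     # LayerNorm arrays are restored from `re` itself
```

Note this also freezes the LayerNorm scale/bias with respect to gradients through `θ`, so they would need their own optimizer path if you want them trained.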
^ Looks like this issue is still up for debate