Convex Neural Network Using Skip Layers in Flux.jl

I’m trying to code up the following net, where y is the input, f(y) is the output, W^z_k and W^y_k are matrices, and b_k are vectors:

  • f(y) = z_k
  • z_k = softplus(W^z_k z_{k-1} + W^y_k y + b_k)
  • z_1 = softplus(W^y_0 y + b_0)
As you can see, it’s basically a fully connected net where each layer has access to the input. Furthermore, the W^z_k need to be positive (elementwise), and I was thinking of enforcing this by applying, say, an exponential: defining W^z_k = \exp(W_k), where W_k is an unconstrained matrix.

Somehow, though, I cannot find a way for each layer in a chain to keep access to the chain’s input. Any idea how to code this chain in Flux? I’m sure I could define a single layer that does the whole thing, but that wouldn’t be very practical.
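For the positivity constraint alone, the exponential reparameterization mentioned above can be sketched as a standalone layer. This is only an illustration, not Flux API: `PosLayer` is a hypothetical name, and `softplus` is defined inline so the snippet runs without Flux (in practice you’d use Flux’s `softplus` and mark the struct with `Flux.@functor` so `Wk`, `Wy`, and `b` become trainable):

```julia
softplus(x) = log1p(exp(x))  # stand-in for Flux's softplus

# Hypothetical layer computing z_k = softplus(W^z_k z + W^y_k y + b_k),
# where W^z_k = exp.(Wk), so the effective z-weights are elementwise positive:
struct PosLayer{M,V}
  Wk::M   # unconstrained matrix; exp.(Wk) is the positive W^z_k
  Wy::M
  b::V
end

(l::PosLayer)(z, y) = softplus.(exp.(l.Wk) * z .+ l.Wy * y .+ l.b)

l = PosLayer(randn(2, 2), randn(2, 2), zeros(2))
out = l(ones(2), ones(2))  # 2-element vector, strictly positive (softplus > 0)
```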

If you want to use built-in layers for this, see SkipConnection and Parallel: Model Reference · Flux

I’ve seen these, but how would you use them to create the architecture I mentioned? It’s not clear to me how to do it, since after one skip layer the original input is lost.

I think this does what you write:

using Flux

struct Adder{T<:Tuple}
  layers::T
end
Adder(layers...) = Adder(layers)
Flux.@functor Adder

function (a::Adder)(y)
  d1 = a.layers[1]
  z = d1(y)
  for d in a.layers[2:end]
    z = d(z + y)   # every later layer sees z_{k-1} plus the original input y
  end
  return z
end

m = Adder(Dense(2 => 2, softplus), Dense(2 => 2, softplus))
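As a quick sanity check of the recursion, the same forward pass can be exercised with plain functions in place of Dense (renamed `Adder2` here so it doesn’t clash with the definition above, and needing no Flux at all):

```julia
struct Adder2{T<:Tuple}   # same pattern as Adder above, Flux-free for illustration
  layers::T
end
Adder2(layers...) = Adder2(layers)

function (a::Adder2)(y)
  z = a.layers[1](y)      # z_1 = layer_1(y)
  for d in a.layers[2:end]
    z = d(z + y)          # z_k = layer_k(z_{k-1} + y): every step sees y again
  end
  return z
end

m2 = Adder2(y -> 2y, x -> 3x)
m2(1)  # z1 = 2*1 = 2, then 3*(2 + 1) = 9
```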
demo_layer1(k) = y -> begin
  @info "just y" k y
  k * y
end

demo_layer2(k) = (z, y) -> begin
  @info "y & z" k y z
  k * (y + z)
end
# Easier to see the structure if we build up the model iteratively:
model = demo_layer1(1) # z1
model = SkipConnection(model, demo_layer2(2)) # feeding into z2
model = SkipConnection(model, demo_layer2(3)) # etc...

julia> model(1) # y = 1
┌ Info: just y
│   k = 1
└   y = 1
┌ Info: y & z
│   k = 2
│   y = 1
└   z = 1
┌ Info: y & z
│   k = 3
│   y = 1
└   z = 4
15

Note how the calculations for z_{2+} are done in the SkipConnection’s “connection” and not as the (non-identity) branch, while z_1 is computed as part of that branch. There are also ways of formulating your layer’s forward pass so that a single layer type covers both equations, but that’s an aside.
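One way that aside could look, sketched with hypothetical names (`ICNNLayer` is not a Flux type) and an inline `softplus` so it runs without Flux: give the layer two call methods, one for z_1 taking only y, and one for later z_k taking (z, y), so a single struct covers both equations and slots straight into the SkipConnection pattern above. The exponential trick from the question is folded in to keep W^z_k positive:

```julia
softplus(x) = log1p(exp(x))  # stand-in for Flux's softplus

struct ICNNLayer          # hypothetical name; fields left untyped for the sketch
  Wz   # unconstrained matrix, or `nothing` for the first layer (no z input yet)
  Wy
  b
end

# z_1 = softplus(W^y_0 y + b_0): called with y alone
(l::ICNNLayer)(y) = softplus.(l.Wy * y .+ l.b)

# z_k = softplus(exp(W_k) z + W^y_k y + b_k): called with (z, y), as in a
# SkipConnection's `connection`; exp.() keeps the z-weights elementwise positive
(l::ICNNLayer)(z, y) = softplus.(exp.(l.Wz) * z .+ l.Wy * y .+ l.b)

l1 = ICNNLayer(nothing, randn(2, 2), zeros(2))
l2 = ICNNLayer(randn(2, 2), randn(2, 2), zeros(2))
y = ones(2)
z2 = l2(l1(y), y)  # one skip step: z_2 computed from (z_1, y)
```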


Thanks, I’ll give both solutions a try 🙂