Flux. Pooling followed by Dense

I am at a loss about the following. I am building a small 2D network where the last two layers are a MaxPool followed by a Dense. The parameters used to create the MaxPool do not depend on the size of the input, but the Dense layer just after it does.

I have tried many ways to make this work, but without success. (A Flux.flatten is in between.) Hoping for guidance elsewhere, I went through the source code of Flux and NNlib. Every single example either hard-codes the input size of the Dense layer or cheats by keeping the stride of the MaxPool at 1.

I am sure I am missing something stupid.

What exactly is the question?

How do I append a Dense layer after a pooling layer when I don't know the size coming out of the MaxPool, and Dense requires knowing it?

You can do something like this if you don’t want to calculate it manually. Otherwise it shouldn’t be too hard to figure out an expression for the output shape based on the stride and padding of the maxpool in combination with the size of the output from the convolution (which might also need some calculation).

julia> xs = rand(Float32, 100, 100, 3, 50);

julia> layer1 = Conv((5,5), 3 => 7, relu; bias = false)
Conv((5, 5), 3 => 7, relu, bias=false)  # 525 parameters

julia> layer2 = MaxPool((5, 5), pad=SamePad())
MaxPool((5, 5), pad=2)

julia> layer3 = flatten
flatten (generic function with 1 method)

julia> tmp = Chain(layer1, layer2, layer3)
Chain(
  Conv((5, 5), 3 => 7, relu, bias=false),  # 525 parameters
  MaxPool((5, 5), pad=2),
  flatten,
)

julia> size(tmp(xs))
(2800, 50)

julia> layer4 = Dense(size(tmp(xs), 1), 3)
Dense(2800, 3)      # 8_403 parameters

julia> model = Chain(layer1, layer2, layer3, layer4)
Chain(
  Conv((5, 5), 3 => 7, relu, bias=false),  # 525 parameters
  MaxPool((5, 5), pad=2),
  flatten,
  Dense(2800, 3),                       # 8_403 parameters
)                   # Total: 3 arrays, 8_928 parameters, 35.414 KiB.
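
If you would rather compute the size by hand, the expression mentioned above is the standard floor((in + 2·pad − window) / stride) + 1 rule, applied once per layer. A minimal sketch (the helper name `conv_out` is mine, not a Flux function):

```julia
# Hypothetical helper: spatial output size of one conv/pool dimension.
conv_out(n, window; stride=1, pad=0) = fld(n + 2pad - window, stride) + 1

h = conv_out(100, 5)                  # Conv((5,5)), no pad: 96
h = conv_out(h, 5; stride=5, pad=2)   # MaxPool((5,5), pad=2, stride 5): 20
# Flattened length feeding Dense: 20 * 20 * 7 channels = 2800
```

Note that a pooling layer's stride defaults to its window size, which is why the MaxPool step above uses stride 5.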

Thanks for putting the time into this answer.

I have tried all of that.
I defined a generic struct where you stop in the constructor to extract the size… It doesn't work because there is no information about the input at that point of the definition.
I then did this within the "forwarding" function. It creates a model fine, but applying it to an actual input always bombs out (including Zygote trying to differentiate the size).
The forward function is where it has to happen, but this is where I cannot find a way to extract the size of the output from MaxPool.

I have a helper function to generate the sizes and iterators of convolutions. It works everywhere in the code except there; it never gives me a consistent, correct result.

I might end up writing a replica of MaxPool or getting rid of it altogether, but that is not really a satisfying answer.

P.S.: Don't get me started on Zygote complaining about mutating arrays. That has been biting me over and over.

Hmm, I’m not sure I understand.

But what do you know about your input? Do you know it will be RGB images of the same dimensions?

So you create a function that does the forward pass, and halfway through you try to check what the current size of the data is and generate a dense layer corresponding to that size? That seems like it would be problematic in many ways.

This is what I was thinking would be the nicer solution. What is the problem, can you share the function and a case where it gives the wrong answer?

@Emmanuel-R8 maybe you are looking for Flux.outputsize? It is still not clear to me what you are trying to achieve; a code example would help.
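
In case it helps, here is a minimal sketch of that approach, reusing the layer sizes from the example above (the 100×100×3 input size is an assumption):

```julia
using Flux

front = Chain(Conv((5, 5), 3 => 7, relu), MaxPool((5, 5), pad=SamePad()), Flux.flatten)

# Flux.outputsize propagates shapes without running a real forward pass,
# so we can read off the flattened feature length for a given input size.
n = Flux.outputsize(front, (100, 100, 3, 1))[1]   # (features, batch) -> features
model = Chain(front, Dense(n, 3))
```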


Let me flip the question around and ask: how would you do this with Python libraries? Because if you can articulate that, then the Flux solution will be a pretty direct translation.

For example, with TF/PyTorch your options are:

  1. Fix the input size and derive the pre-dense output size from that. If you want help from something like tf.Keras’ shape inference, check out Flux.outputsize as @CarloLucibello mentioned.
  2. Use adaptive or global pooling. Unlike normal pooling layers, these generate a fixed size output for variable-sized inputs. Global pooling in particular is bread-and-butter for most vision models, but Flux has layers for both types.
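
For option 2, a minimal sketch with Flux's GlobalMeanPool (the channel count of 7 is carried over from the earlier example): global pooling collapses the spatial dimensions to 1×1, so the Dense input size depends only on the number of channels, never on the input height or width.

```julia
using Flux

model = Chain(
    Conv((5, 5), 3 => 7, relu),
    GlobalMeanPool(),   # 1×1×7×batch regardless of input height/width
    Flux.flatten,       # -> 7×batch
    Dense(7, 3),
)

size(model(rand(Float32, 100, 100, 3, 2)))   # (3, 2)
size(model(rand(Float32, 64, 64, 3, 2)))     # also (3, 2)
```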

Thanks. I went with the adaptive pooling option.

I uploaded the code to https://github.com/Alba-Intelligence/SharpenedCorrelation.

The idea is to implement a variant of the Sharpened Cosine Similarity. See https://www.rpisoni.dev/posts/cossim-convolution/ for a description, and https://e2eml.school/scs.html for various implementations.

Now, I reached the dreaded Zygote complaints about mutating arrays.

As always, this is where at a bare minimum a full stacktrace would be necessary, if not an MWE as well.

Of course. I wrote that with a lot of negative feelings, dreading the incoming hair-pulling.

(any questions about it would warrant a separate thread anyway)
