Flux: multiple input of unequal dimensions

Assume I have data as follows

n  = 1000
y  = rand(n)
x1 = rand(10, n)
x2 = rand(20,20,1,n)

and would like to apply one (or more) Conv layers to x2 before it is flattened and concatenated with x1 as input to a Dense layer. How would you implement this in Flux? Thanks.

You can probably do this with a Chain, but it could be a little awkward, and there is a chance you can’t make use of the functor convenience anyway.

I’m on my phone so I can’t create a working example, but this thread has a couple of options and might provide some inspiration: Splitting and joining Flux model chains

I had a look at the other thread. Although the problem is different, it inspired me to write a function which seems to work:

n = 1000
y = rand(n)
x = (ones(10,n), rand(20,20,1,n))
m = function(x)
    x2 = Chain(Conv((2,2), 1 => 1), flatten)(x[2])
    return cat(x[1], x2, dims = 1)
end
size(m(x)) == (371, n)
m2 = Chain(m, Dense(371, 10))
size(m2(x)) == (10, n)

The next problem is to train models like this. Any suggestions on how to proceed? Would it be possible to use Flux.train!?

Yes, that is the general idea, but there are a few tweaks needed.

To use Flux.train! you need to supply the parameters you want to update. This is conveniently done using Flux.params, which works for all layers in Flux as well as for Chains which only have “raw” layers in them.

Unfortunately, when layers are wrapped in functions like your m above, they are no longer accessible in this way. Another issue is that m creates a new Conv every time it is called, so whatever parameters the layer has are discarded the next time the function is called.

Here is one way to work around this:
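To make the second issue concrete, here is a small sketch (not from the original post). It assumes the Conv weights are randomly initialised by default and uses Float32 inputs and a batch of 4 purely for illustration; m_fixed is a hypothetical name.

```julia
using Flux

x = (ones(Float32, 10, 4), rand(Float32, 20, 20, 1, 4))

# As in the question: a fresh Conv (with new random weights) is built on every call.
m = function(x)
    x2 = Chain(Conv((2,2), 1 => 1), Flux.flatten)(x[2])
    return cat(x[1], x2, dims = 1)
end

out1 = m(x)
out2 = m(x)
# out1 and out2 generally differ, because each call used a different Conv.

# Building the layer once, outside the function, keeps its weights between calls.
conv = Conv((2,2), 1 => 1)
m_fixed = x -> cat(x[1], Flux.flatten(conv(x[2])), dims = 1)

out3 = m_fixed(x)
out4 = m_fixed(x)
# out3 and out4 come from the same weights, so they agree.
```

Any update an optimiser applied to the Conv inside m would be lost in the same way, since that layer is thrown away when the call returns.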

julia> function createmodel(convlayer, denselayer)
           return function(x)
               x2 = convlayer(x[2]) |> flatten
               x1 = cat(x[1], x2, dims=1)
               return denselayer(x1)
           end, params(convlayer, denselayer)
       end
createmodel (generic function with 1 method)

julia> m, ps = createmodel(Conv((2,2), 1=>1), Dense(371, 10));

julia> Flux.train!((x, y) -> Flux.mse(m(x), y), ps, [(((ones(10,2), rand(20,20,1,2))), ones(10, 2))], Descent())

As you can see, createmodel returns not just the function to be optimized but also its parameters. You could just as well drop that part of createmodel and create the layers first, so that you have a reference to them outside of m.

A third option is to just create a hacky functor for m, as described here: Writing complex Flux Models
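The alternative of creating the layers first could look roughly like this. It is only a sketch in the same implicit-parameter style as above, and it assumes a Flux version in which Flux.params is available; the batch size of 2 and the Float32 inputs are arbitrary choices.

```julia
using Flux

# Create the layers up front so they can be referenced outside of m.
convlayer  = Conv((2,2), 1 => 1)
denselayer = Dense(371, 10)

# m closes over the existing layers instead of constructing new ones.
m = function(x)
    x2 = Flux.flatten(convlayer(x[2]))
    return denselayer(cat(x[1], x2, dims = 1))
end

# Collect the parameters directly from the layers.
ps = Flux.params(convlayer, denselayer)

x = (ones(Float32, 10, 2), rand(Float32, 20, 20, 1, 2))
size(m(x))
```

Here m and ps play the same roles as the two values returned by createmodel.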
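A rough sketch of what a functor-based version might look like is shown below. This is not taken from the linked thread: TwoInput and its field names are made up for illustration, and it assumes a Flux version that provides the Flux.@functor macro.

```julia
using Flux

# Hypothetical container holding both layers; the name and fields are illustrative.
struct TwoInput{C,D}
    conv::C
    dense::D
end

# Tell Flux to look inside the struct for layers and their parameters.
Flux.@functor TwoInput

# Forward pass: x is a tuple (x1, x2), as in the examples above.
function (m::TwoInput)(x)
    x2 = Flux.flatten(m.conv(x[2]))
    return m.dense(cat(x[1], x2, dims = 1))
end

model = TwoInput(Conv((2,2), 1 => 1), Dense(371, 10))

x = (ones(Float32, 10, 2), rand(Float32, 20, 20, 1, 2))
size(model(x))
```

With the layers stored as fields of a functor, the model can be treated like any built-in layer, for example when collecting its parameters or placing it inside a Chain.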


Thank you @DrChainsaw. I will hopefully be able to try this on real data in a few weeks’ time, and on similar problems as well with some modifications.

What would be the best way to create mini-batches with these data for Flux.train!? So far I have been using the Flux.Data.DataLoader (since it was introduced), but I am unsure whether it supports this kind of data structure.