Convolutional GRU/LSTM in Flux?

Is there any implementation of (2D) convolutional GRU, LSTM or similar based on Flux? A similar question was raised in Sept. 2020.


Not a complete answer, but one major change since that post is that the recurrent layers in Flux now accept 3D arrays (features × batch × time) as input. Seems like that was the issue in the previous OP’s code? Maybe try that and see if it helps?
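For example, a minimal sketch of what that looks like (assuming a recent Flux version, where recurrent layers fold over the last dimension of a 3D input):

```julia
using Flux

# Assumption: recent Flux treats a 3D input to a recurrent layer as
# (features, batch, time) and returns (hidden, batch, time).
m = LSTM(8 => 16)              # 8 input features, 16 hidden units
x = rand(Float32, 8, 4, 10)    # batch of 4 sequences, 10 time steps
y = m(x)                       # one hidden state per time step
size(y)                        # should be (16, 4, 10)
```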

Will that somehow be able to handle data where a single input sample is 4D (2 spatial dimensions, time, variable/channel)?

Not yet. We are still iterating on some of the changes that need to happen to Recur to make this possible. One viable strategy, I think, would be to override the behaviour of Recur{ConvGRU} by specializing:

function (m::Flux.Recur{ConvGRU})(x)
    # special behaviour here, e.g. fold m.cell over the
    # time dimension of x while carrying m.state
end

Now this isn’t really scalable, though. I’ve been iterating on a trait-based system to unify some of these architectures; you can see some of it in this repo. I’ve been unable to work on this as much as I’d like due to various constraints with my degree, other projects, and other work, so I’d be interested in a basic ConvRNN/ConvGRU/ConvLSTM implementation that we can iterate on.
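As a starting point for such an implementation, here is a rough sketch of a ConvGRU cell. All of the names here (ConvGRUCell, the field names, the constructor) are my assumptions, not part of Flux; the idea is just the standard GRU equations with the dense layers replaced by 2D convolutions over (width, height, channels, batch) states:

```julia
using Flux

# Hypothetical ConvGRU cell: GRU gates computed with 2D convolutions.
# SamePad() keeps the spatial dimensions of the hidden state fixed.
struct ConvGRUCell{C}
    update::C   # z gate
    reset::C    # r gate
    cand::C     # candidate hidden state
end

Flux.@functor ConvGRUCell

function ConvGRUCell(k::NTuple{2,Int}, ch::Pair{Int,Int})
    in, hidden = ch
    conv() = Conv(k, in + hidden => hidden; pad = SamePad())
    return ConvGRUCell(conv(), conv(), conv())
end

# One time step: x and h are (width, height, channels, batch) arrays.
function (m::ConvGRUCell)(h, x)
    xh = cat(x, h; dims = 3)                      # stack along channels
    z  = σ.(m.update(xh))                         # update gate
    r  = σ.(m.reset(xh))                          # reset gate
    h̃  = tanh.(m.cand(cat(x, r .* h; dims = 3)))  # candidate state
    h′ = (1 .- z) .* h .+ z .* h̃
    return h′, h′                                 # (new state, output)
end
```

To run it over a sequence you would fold the cell over the time slices yourself (or wrap it in Recur with an explicit initial state), since the spatial shape of the hidden state depends on the input.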


I would also recommend making the time dimension the last one, as I think we have unofficially agreed on that as the standard (see Recurrent network interface updates/design · Issue #1678 · FluxML/Flux.jl · GitHub).
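Concretely, with time last a batch of image sequences would be a 5D array, and folding a conv cell over time is just iterating the last dimension (plain Base Julia, no Flux required for this part):

```julia
# Time-last layout for conv sequences:
# (width, height, channels, batch, time).
x = rand(Float32, 8, 8, 1, 2, 10)          # 10 time steps
frames = collect(eachslice(x; dims = 5))   # ten (8, 8, 1, 2) frames
```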