I had been meaning to get back into writing blog posts and figured this would be a good place to start, so I wrote up a few sentences this afternoon. Perhaps it can be of use to you: A Simple Recurrent Model in Flux | Jonathan Chassot
If self-promotion is not tolerated I’ll make sure to remove it, but I think it explains how to work out the case with both `X` and `y`. In general, you just want to reshape your `X` and not necessarily your `y`.
To summarize, I keep my `X` and `y` separate, and instead of calling `Flux.train!()` I compute the gradients myself and apply them with `Flux.update!()`. This was suggested to me by someone more knowledgeable about RNNs.
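
For illustration, here is a minimal sketch of that pattern. It is not the code from the blog post. It assumes a Flux release from around 0.12 to 0.13, where recurrent layers are stateful, `Flux.reset!` clears the hidden state, and implicit parameters via `Flux.params` are the usual style. Newer Flux releases changed both the recurrent layers and the training API, so it may need adapting there. The toy data, layer sizes, learning rate, and epoch count are all made up.

```julia
using Flux

# Hypothetical toy data: 5 time steps, 1 feature, 16 sequences in the batch.
seq_len, n_features, batch = 5, 1, 16

# X is "reshaped" into a vector with one (features × batch) array per time step.
X = [rand(Float32, n_features, batch) for _ in 1:seq_len]
# y is left alone: one target per sequence.
y = rand(Float32, 1, batch)

# Assumed architecture, chosen only for the example.
model = Chain(RNN(n_features, 8), Dense(8, 1))

function loss(X, y)
    Flux.reset!(model)               # clear the hidden state before each pass
    ŷ = [model(x) for x in X][end]   # feed the steps in order, keep the last output
    return Flux.Losses.mse(ŷ, y)
end

ps  = Flux.params(model)   # implicit parameters (older Flux style)
opt = ADAM(0.01)

for epoch in 1:100
    # Compute the gradients by hand instead of calling Flux.train!()
    gs = gradient(() -> loss(X, y), ps)
    Flux.update!(opt, ps, gs)
end
```

Keeping `X` and `y` as separate arguments means the loss sees the whole sequence at once, so there is no need to zip them into the `(x, y)` pairs that `Flux.train!()` iterates over.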