Sequence-to-one modelling and Flux.reset!

Without diagnosing the specific issue, I would strongly recommend calling reset! outside of your loss function. That unfortunately means ditching train! in favour of a hand-written training loop, but in my experience train! doesn’t really work for RNNs in the first place because of the assumptions it makes about input batching.
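
For illustration, here is a rough sketch of what such a loop could look like. It assumes a Flux release that still has the stateful, Recur-based recurrent layers (roughly 0.13 to 0.14) and the implicit-parameter API; the model, the dimensions, the random data, and the names `seq` and `target` are all made up for the example, so adapt them to your setup.

```julia
using Flux

# Hypothetical sequence-to-one model: a recurrent layer plus a dense read-out.
model = Chain(RNN(3 => 8), Dense(8 => 1))

# Hypothetical data: one sample = a vector of time steps, each of size features × batch.
seq_len, batch = 5, 4
xs = [rand(Float32, 3, batch) for _ in 1:seq_len]
y = rand(Float32, 1, batch)
data = [(xs, y)]

# The loss only runs the sequence and scores the final output.
# Note that it does NOT call Flux.reset!.
function loss(seq, target)
    for x in seq[1:end-1]
        model(x)              # output ignored, hidden state is updated
    end
    return Flux.mse(model(seq[end]), target)
end

ps = Flux.params(model)
opt = Adam(1f-3)

for epoch in 1:10
    for (seq, target) in data
        Flux.reset!(model)    # clear the hidden state before each sequence, outside the loss
        gs = gradient(() -> loss(seq, target), ps)
        Flux.update!(opt, ps, gs)
    end
end
```

Because the reset happens in the loop rather than in the loss, it is not part of the code being differentiated, and you decide explicitly where one sequence ends and the next begins.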
