My honest answer is that you make educated guesses based on published work on problems similar to yours, and then try lots of different configurations and see what performs best. This is part of the “dark magic” of deep learning: many of the essential details are not mentioned in papers, because the true story of the journey to the final architecture is usually messy and at odds with the clean mathematical narrative that ML papers like to present.
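In practice, “try lots of different things” usually means some form of random search over hyperparameters. The sketch below is purely illustrative and not from any particular paper: the synthetic dataset, the small MLP, and the search ranges are placeholder assumptions; the point is the loop structure of guess, train, evaluate, keep the best.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data standing in for a real dataset (assumption).
X = torch.randn(512, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(512, 1)
X_train, y_train, X_val, y_val = X[:384], y[:384], X[384:], y[384:]

def train_and_evaluate(lr, hidden):
    """Train a small MLP with the guessed hyperparameters; return validation loss."""
    model = nn.Sequential(nn.Linear(10, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X_train), y_train)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return nn.functional.mse_loss(model(X_val), y_val).item()

# Random search: draw guesses from plausible ranges, keep whatever performs best.
best = (float("inf"), None)
for _ in range(10):
    lr = 10 ** torch.empty(1).uniform_(-4, -1).item()  # log-uniform learning rate
    hidden = int(torch.randint(8, 128, (1,)).item())   # hidden layer width
    val_loss = train_and_evaluate(lr, hidden)
    if val_loss < best[0]:
        best = (val_loss, (lr, hidden))

print(f"best val loss {best[0]:.4f} with lr={best[1][0]:.2e}, hidden={best[1][1]}")
```

Note that the learning rate is sampled log-uniformly: sensible values span several orders of magnitude, so sampling uniformly on a linear scale would waste most of the budget on one decade.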
Some general tips are worth bearing in mind: find an existing model that solves a problem similar to yours and use it as a starting point (search arXiv and GitHub); overtrain on a small training set first (shallow network, no dropout) to confirm the model can fit the data at all; then add capacity and regularisation to improve validation set performance (this is the stage to add layers, dropout, and weight decay); and resist the temptation to iterate on the test set.
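To make the overtrain-then-regularise workflow concrete, here is a minimal PyTorch sketch under assumed placeholder data; the toy tensors, network sizes, and hyperparameters are illustrative choices, not recommendations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder data; in practice this would be your real train/validation split.
X_train, y_train = torch.randn(64, 20), torch.randint(0, 2, (64,))
X_val, y_val = torch.randn(256, 20), torch.randint(0, 2, (256,))

def fit(model, weight_decay=0.0, steps=500):
    """Train on the (small) training set and return the final training loss."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X_train), y_train)
        loss.backward()
        opt.step()
    return loss.item()

# Step 1: overtrain a shallow network (no dropout) on a small training set.
# If it cannot drive the training loss towards zero, fix bugs before tuning.
shallow = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
print("overfit train loss:", fit(shallow))

# Step 2: add capacity and regularisation, then judge on the validation set
# (and only the validation set; the test set stays untouched until the end).
regularised = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 2),
)
fit(regularised, weight_decay=1e-4)
regularised.eval()  # disable dropout for evaluation
with torch.no_grad():
    val_acc = (regularised(X_val).argmax(dim=1) == y_val).float().mean().item()
print("validation accuracy:", val_acc)
```

The overfitting check in step 1 is a cheap sanity test: a model that cannot memorise 64 examples almost certainly has a bug or an inadequate architecture, and no amount of hyperparameter tuning will fix that.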