My honest answer is that you make educated guesses based on published work on problems similar to yours, and then try lots of different configurations and see what performs best. This is part of the “dark magic” of deep learning: many of the essential details are not mentioned in papers, because the true story of the journey to the final architecture is usually messy and at odds with the clean mathematical narrative that ML papers like to present.
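In practice, “try lots of different things” usually means some form of random search over hyperparameters. The sketch below is purely illustrative and not from any particular paper: the synthetic dataset, the small MLP, and the search ranges are placeholder assumptions; the point is the loop structure of guess, train, evaluate, keep the best.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data standing in for a real dataset (assumption).
X = torch.randn(512, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(512, 1)
X_train, y_train, X_val, y_val = X[:384], y[:384], X[384:], y[384:]

def train_and_evaluate(lr, hidden):
    """Train a small MLP with the guessed hyperparameters; return validation loss."""
    model = nn.Sequential(nn.Linear(10, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X_train), y_train)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return nn.functional.mse_loss(model(X_val), y_val).item()

# Random search: draw guesses from plausible ranges, keep whatever performs best.
best = (float("inf"), None)
for _ in range(10):
    lr = 10 ** torch.empty(1).uniform_(-4, -1).item()  # log-uniform learning rate
    hidden = int(torch.randint(8, 128, (1,)).item())   # hidden layer width
    val_loss = train_and_evaluate(lr, hidden)
    if val_loss < best[0]:
        best = (val_loss, (lr, hidden))

print(f"best val loss {best[0]:.4f} with lr={best[1][0]:.2e}, hidden={best[1][1]}")
```

Note that the learning rate is sampled log-uniformly: sensible values span several orders of magnitude, so sampling uniformly on a linear scale would waste most of the budget on one decade.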
Some general tips are worth bearing in mind: find an existing model that solves a problem similar to yours and use it as a starting point (search arXiv and GitHub); overtrain on a small training set first (shallow network, no dropout) to confirm the model can fit the data at all; then add capacity and regularisation to improve validation set performance (this is the stage to add layers, dropout, and weight decay); and resist the temptation to iterate on the test set.
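To make the overtrain-then-regularise workflow concrete, here is a minimal PyTorch sketch under assumed placeholder data; the toy tensors, network sizes, and hyperparameters are illustrative choices, not recommendations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder data; in practice this would be your real train/validation split.
X_train, y_train = torch.randn(64, 20), torch.randint(0, 2, (64,))
X_val, y_val = torch.randn(256, 20), torch.randint(0, 2, (256,))

def fit(model, weight_decay=0.0, steps=500):
    """Train on the (small) training set and return the final training loss."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X_train), y_train)
        loss.backward()
        opt.step()
    return loss.item()

# Step 1: overtrain a shallow network (no dropout) on a small training set.
# If it cannot drive the training loss towards zero, fix bugs before tuning.
shallow = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
print("overfit train loss:", fit(shallow))

# Step 2: add capacity and regularisation, then judge on the validation set
# (and only the validation set; the test set stays untouched until the end).
regularised = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 2),
)
fit(regularised, weight_decay=1e-4)
regularised.eval()  # disable dropout for evaluation
with torch.no_grad():
    val_acc = (regularised(X_val).argmax(dim=1) == y_val).float().mean().item()
print("validation accuracy:", val_acc)
```

The overfitting check in step 1 is a cheap sanity test: a model that cannot memorise 64 examples almost certainly has a bug or an inadequate architecture, and no amount of hyperparameter tuning will fix that.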