As I understand it, this is really model- and even workflow-dependent, which makes it hard to diagnose or even describe in general. There have been lots of questions about this here before (1, 2, 3, 4, 5, …), but most dig into specific models. Perhaps the best general discussion is in this GitHub issue:
The errors can be real and indicate a problem with the model and/or the sampler settings but they can be ignored in the initial phase when the step size is tuned - depending on the model and the initialization, it can happen that a too large step size results in a non-finite gradient of the log density.
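To make the step-size point concrete, here is a toy sketch of how it can happen. It is not the sampler's actual code: the log density `y * theta - exp(theta)`, the function names, and the fixed zero starting momentum are all assumptions I picked for illustration, and the update is just a plain leapfrog step as used in HMC-style samplers.

```python
import numpy as np


def grad_log_density(theta, y=3.0):
    """Gradient of the toy log density log p(theta) = y * theta - exp(theta)."""
    # exp() overflows to inf for large theta; silence the warning so the
    # non-finite value can be inspected instead.
    with np.errstate(over="ignore"):
        return y - np.exp(theta)


def leapfrog_gradients_finite(step_size, theta=-5.0, n_steps=10):
    """Run a few leapfrog steps; return False as soon as a gradient is non-finite."""
    momentum = 0.0  # fixed (not random) so the example is reproducible
    grad = grad_log_density(theta)
    for _ in range(n_steps):
        momentum += 0.5 * step_size * grad  # half step for the momentum
        theta += step_size * momentum       # full step for the position
        grad = grad_log_density(theta)
        if not np.isfinite(grad):
            return False
        momentum += 0.5 * step_size * grad  # second half step for the momentum
    return True


# A small step size keeps theta in a region where exp(theta) is representable.
print(leapfrog_gradients_finite(step_size=0.1))
# A very large step size throws theta far out, exp(theta) overflows,
# and the gradient becomes -inf.
print(leapfrog_gradients_finite(step_size=50.0))
```

With the large step size the model itself is unchanged; only the trajectory leaves the region where the gradient can be computed. That is why messages like this during early step-size tuning are often harmless, while the same message persisting after warmup is worth investigating.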