I have these two settings,
NUTS(500, 0.65) and
NUTS(0, 0.65), and get very different results. Is this expected? If yes, why?
I was expecting the two to be similar, if not identical, since both run for 1500 iterations in total.
chain = sample(model, NUTS(500, 0.65), MCMCThreads(), 1000, 1; save_state = true)
chain = sample(model, NUTS(0, 0.65), MCMCThreads(), 1500, 1; save_state = true)
Yes. During warmup, the NUTS algorithm spends a lot of effort finding an appropriate step size and mass matrix and moving into the high-probability region. With NUTS(0, 0.65) you have told it to do no warmup at all, and that will be basically disastrous. I typically do 100-500 warmup steps as a matter of course.
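To see the effect of warmup for yourself, one option is to run both settings on a small model and compare the sampler diagnostics. This is only a sketch: the `toy` model and the simulated data are made up for illustration and are not the model from the question, and on an easy model like this one the difference may be much smaller than on a hard one.

```julia
using Turing, Random, Statistics

# Hypothetical toy model (not the poster's model): Gaussian data
# with unknown mean μ and unknown scale σ.
@model function toy(y)
    μ ~ Normal(0, 10)
    σ ~ Exponential(1)
    for i in eachindex(y)
        y[i] ~ Normal(μ, σ)
    end
end

Random.seed!(1)
y = 3.0 .+ 2.0 .* randn(200)   # simulated data, assumed for the example
model = toy(y)

# 500 adaptation steps followed by 1000 draws.
chain_warm = sample(model, NUTS(500, 0.65), 1000; progress=false)

# No adaptation at all: step size and mass matrix are never tuned.
chain_cold = sample(model, NUTS(0, 0.65), 1500; progress=false)

# Compare the posterior mean of μ and two sampler internals.
for (name, chain) in (("warmup", chain_warm), ("no warmup", chain_cold))
    println(name,
            ": mean(μ) = ", mean(chain[:μ]),
            ", step size = ", mean(chain[:step_size]),
            ", acceptance rate = ", mean(chain[:acceptance_rate]))
end
```

If the acceptance rate of the run without warmup is far from the 0.65 target, or its step size differs a lot from the adapted one, that points to the missing adaptation as the source of the discrepancy.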
Oh, okay. Should warmup be some percentage of the total number of iterations, or do 100-500 steps work even for long runs?
It’s not so much a percentage; the warmup just needs to run long enough that it has converged into the high-probability region. What determines the number of steps needed is the geometry of the high-probability manifold. If it requires a lot of curving around in high-dimensional space, it will need more time; if it’s a simple low-dimensional model, it may converge faster.
Is there any way to do this warmup but still get a chain plot starting from iteration 0 rather than from the end of warmup?
I’m not sure. During warmup the process is adaptive and doesn’t have a stationary distribution yet. You normally don’t want those steps.
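One thing that may be worth trying: Turing's `sample` accepts a `discard_adapt` keyword for adaptive HMC samplers, and passing `discard_adapt=false` should keep the adaptation draws in the returned chain. The sketch below assumes a reasonably recent Turing version (the exact iteration accounting has changed between releases, so check the chain length you get back), and the `toy` model and data are again hypothetical.

```julia
using Turing, Random, Statistics

# Hypothetical toy model used only for illustration.
@model function toy(y)
    μ ~ Normal(0, 10)
    σ ~ Exponential(1)
    for i in eachindex(y)
        y[i] ~ Normal(μ, σ)
    end
end

Random.seed!(1)
model = toy(3.0 .+ 2.0 .* randn(200))

# Ask for 1500 draws in total and keep the adaptation draws
# instead of discarding them.
chain_full = sample(model, NUTS(500, 0.65), 1500;
                    discard_adapt=false, progress=false)

# Split the trace by hand: the first 500 draws were produced while
# the sampler was still adapting, so only the rest should be used
# for posterior summaries.
warmup_part   = chain_full[1:500, :, :]
sampling_part = chain_full[501:1500, :, :]

println("posterior mean of μ (post-warmup only): ", mean(sampling_part[:μ]))
```

Plotting `chain_full` then shows the trace from the first iteration, but the warmup segment is useful only for diagnosing adaptation, not for inference.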