Finding MAP estimate in Turing?

I’m working on one of my Data Analysis tutorials. This one involves fitting a nonlinear function to an economic time series.

The basic idea is that I have a radial basis function + some annual periodicity + a step function that handles the recent COVID shock.

I’m first running an optimization to try to get a decent fit, and then using that as a starting point for Bayesian sampling…

ONCE I got a really reasonable fit… but every time since then, this is the kind of thing I see (red is the model and blue is the data):

This is often after TENS OF MINUTES of optimizing (100k iterations).

I’ve tried various methods in the Turing optimize, such as LBFGS() and ParticleSwarm and whatnot. In general it does a terrible job.

Turing also doesn’t sample well at all. It usually is stuck out in the weeds just like the optimization.

I assume this is probably because of local optima. Is there a way to use BlackBoxOptim with Turing models? How about supplying an initial condition? If I supplied 0 for the RBF function it’d be way closer to the optimum than even the result of these minutes of computing.

Also, it’d be useful to be able to take the result of one of these optimizations and perturb it, and use that as a new starting point for another optimization. Is there a way to specify an initial value?

EDIT: Note that the shape of the function is linear in everything except the location of the shock… and all the priors are normal except the prior on the standard deviation of the error, so it feels like it should be relatively easy to optimize. I don’t understand why this is struggling so hard.
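
To make the "linear in everything except the location of the shock" point concrete, here is a rough sketch of the kind of mean function described above. It is not the actual tutorial code: the function names, the fixed RBF centers and width, the sinusoidal form of the annual term, and all the numbers are assumptions made up for illustration.

```julia
# Hypothetical sketch: every name and constant here is illustrative only.

# Gaussian radial basis function centered at c with width s (both assumed fixed).
rbf(t, c, s) = exp(-((t - c) / s)^2)

# Mean function: weighted RBFs + annual sinusoid + step of size d at time t0.
# t is measured in years, so sin(2π * t) has a period of one year.
function meanfn(t, w, centers, s, a, b, d, t0)
    smooth = sum(w[i] * rbf(t, centers[i], s) for i in eachindex(w))
    seasonal = a * sin(2π * t) + b * cos(2π * t)
    shock = d * (t >= t0)   # Bool acts as 0 or 1 here
    return smooth + seasonal + shock
end

centers = collect(2000.0:5.0:2020.0)   # assumed fixed RBF centers
w0 = zeros(length(centers))            # all RBF weights set to zero

# With zero weights and no seasonal term, only the step remains.
before = meanfn(2019.0, w0, centers, 5.0, 0.0, 0.0, -1.0, 2020.25)
after = meanfn(2021.0, w0, centers, 5.0, 0.0, 0.0, -1.0, 2020.25)
```

With the centers and width held fixed, the output is linear in `w`, `a`, `b` and `d`: scaling all of them by a constant scales the result by the same constant. Only `t0` enters nonlinearly, through the comparison `t >= t0`.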

Here is a screenshot of the one time the optimization worked:


AHA! I just figured out that the Gamma distribution in Julia is parameterized as (shape, scale) and not (shape, rate) as in R… so fixing that may solve my problem, since the prior for the size of the errors was a Gamma that was supposed to have a small scale…
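
For anyone who hits the same thing, here is a minimal sketch of the difference, assuming Distributions.jl (which is what Turing uses for its distributions). The shape and rate values are made-up examples, not the ones from my model.

```julia
using Distributions

# Hypothetical prior: shape 2 and rate 20, as one would write it in R
# with dgamma(x, shape = 2, rate = 20).
shape, rate = 2.0, 20.0

# Distributions.jl reads the second argument as the SCALE, so passing the
# rate straight through gives a prior with a much larger mean than intended.
wrong = Gamma(shape, rate)

# Converting rate to scale (scale = 1 / rate) gives the intended prior.
right = Gamma(shape, 1 / rate)

# The mean of a Gamma is shape * scale.
println("mean with rate passed as scale: ", mean(wrong))
println("mean with scale = 1 / rate:     ", mean(right))
```

The first prior puts most of its mass on error standard deviations in the tens, the second on values around 0.1, which is the difference between a prior that allows almost any fit and one that demands a tight one.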

Also discovered some plain old typos/bugs. Looking forward to seeing what happened with the sampling run from last night…

Ok, it seems that, having fixed some bugs, at least the fits aren’t ridiculous after optimization, and the result is reliably a reasonable starting spot:


I’ll open a different question related to how to usefully work with chains with tens to hundreds of parameters.