Hi all. I’m working on a Tutorial in which I use Turing to fit a time series model to some Economic data (nonfarm payrolls).
The model is basically a radial basis function expansion, with parameters for center locations, amplitudes, and a sigmoidal shock for the COVID pandemic… With sufficient debugging, it works reasonably well when I do an optimization:
Now I’m trying to sample the model using Turing.jl’s implementation of NUTS:
chain = sample(bmodel,NUTS(100,.85,max_depth=8,init_ϵ=2e-4),MCMCThreads(),150,2; init_theta = op.values.array)
The sampling completes, on two chains. The summary doesn’t look particularly good. As you can see, the effective sample size is about 4 after 50 real samples on 2 chains. And r_hat is quite large as well.
Summary Statistics from Chain
Summary Statistics parameters mean std naive_se mcse ess r_hat ────────── ──────── ────── ──────── ─────── ──────── ────── Ac -0.0251 0.1477 0.0148 missing 4.3478 5.2349 Ac -0.1174 0.0974 0.0097 missing 4.3478 2.6193 Ac 0.6177 0.1791 0.0179 missing 4.3478 5.6394 Arb 13.4318 0.1326 0.0133 missing 4.3478 2.9643 Arb 0.7188 0.1091 0.0109 missing 4.3478 3.3800 Arb 47.0745 0.4903 0.0490 missing 4.3478 7.1403 Arb 1.4287 0.4006 0.0401 missing 4.3478 4.4474 Arb 0.0224 0.0851 0.0085 missing 4.4266 2.1120 Arb -26.9109 0.7010 0.0701 missing 4.3478 8.8497 Arb 6.1686 0.1536 0.0154 missing 4.4141 2.4206 Arb -26.8714 0.0891 0.0089 missing 4.3478 3.1630 Arb 0.6893 0.1505 0.0151 missing 4.3478 4.4602 Arb 3.3334 0.1706 0.0171 missing 4.3478 4.2515 Arb 7.5663 0.0646 0.0065 missing 6.5470 1.4389 Arb -9.2274 0.3382 0.0338 missing 4.3478 4.2497 ...
Obviously, I’d like to do some plots to see what’s going on. For example a simple timeseries plot of several parameters. Of course, MCMCChains has this build in right?
leads to the following window opening:
As well you might expect. The model has ~ 100 parameters, and you can’t possibly plot them all meaningfully on one plot.
In general, most of my Bayesian models have well over 10 parameters, and I’ve worked on models with tens of thousands. For example a model of economic outcomes in each of the Public Use Microdata Areas from the census. Each of those has ~ 100k people in it, and so given there are 330M people in the US, there are about 3300 PUMAs, each one needed 5 or 10 parameters to describe the local economic conditions… with geographic partial pooling, etc.
To me, the delight of Bayesian methods is that you can construct meaningful models like this which inevitably involve many parameters. What mechanisms are there to do meaningful plots by slicing and dicing the parameters into subsets etc? The examples in the MCMCChains website don’t really go beyond the basics.
Also for example: MCMCChains says:
" By convention, MCMCChains assumes that parameters with names of the form
"name[index]"belong to one group of parameters called
:name. You can access the names of all parameters in a
chainthat belong to the group
If the chain contains a parameter of name
:nameit will be returned as well.
group(chain, :name)returns a subset of the chain
chainwith all parameters in the group
But that doesn’t seem to be the case… For example in my model there’s Arb[n] for n from 1 to 40, which is the coefficients of the radial basis functions… but doing:
julia> group(chain,:Arb) ERROR: UndefVarError: group not defined Stacktrace:  top-level scope at REPL:1  eval(::Module, ::Any) at ./boot.jl:331  eval_user_input(::Any, ::REPL.REPLBackend) at /build/julia-pifKTc/julia-1.4.1+dfsg/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86  run_backend(::REPL.REPLBackend) at /home/dlakelan/.julia/packages/Revise/tV8FE/src/Revise.jl:1165  top-level scope at none:0
I did manage to do this:
which is apparently a way to subset the variables to those in the Arb group… but then I get:
Which is still not a usable plot.
It looks like this is useful:
plot(chain[:,[Symbol(“As[$i]”) for i in 1:10],:])
which then lets me get sort of reasonable plots… (still don’t know why it lays out with weird margins…), and shows that the NUTS sampler is not moving at all well
So, perhaps some updated info on how best to use Turing to get better mixing etc. would help. I have read through the site a fair amount, and am still not clear on how the different samplers work… like NUTS vs AdvancedHMC.NUTS etc?