Hi all. I’m working on a Tutorial in which I use Turing to fit a time series model to some Economic data (nonfarm payrolls).
The model is basically a radial basis function expansion, with parameters for center locations, amplitudes, and a sigmoidal shock for the COVID pandemic… With sufficient debugging, it works reasonably well when I do an optimization:
Now I’m trying to sample the model using Turing.jl’s implementation of NUTS:
chain = sample(bmodel,NUTS(100,.85,max_depth=8,init_ϵ=2e-4),MCMCThreads(),150,2; init_theta = op.values.array)
The sampling completes, on two chains. The summary doesn’t look particularly good. As you can see, the effective sample size is about 4 after 50 real samples on 2 chains. And r_hat is quite large as well.
Summary Statistics from Chain
Summary Statistics
parameters mean std naive_se mcse ess r_hat
────────── ──────── ────── ──────── ─────── ──────── ──────
Ac[1] -0.0251 0.1477 0.0148 missing 4.3478 5.2349
Ac[2] -0.1174 0.0974 0.0097 missing 4.3478 2.6193
Ac[3] 0.6177 0.1791 0.0179 missing 4.3478 5.6394
Arb[1] 13.4318 0.1326 0.0133 missing 4.3478 2.9643
Arb[2] 0.7188 0.1091 0.0109 missing 4.3478 3.3800
Arb[3] 47.0745 0.4903 0.0490 missing 4.3478 7.1403
Arb[4] 1.4287 0.4006 0.0401 missing 4.3478 4.4474
Arb[5] 0.0224 0.0851 0.0085 missing 4.4266 2.1120
Arb[6] -26.9109 0.7010 0.0701 missing 4.3478 8.8497
Arb[7] 6.1686 0.1536 0.0154 missing 4.4141 2.4206
Arb[8] -26.8714 0.0891 0.0089 missing 4.3478 3.1630
Arb[9] 0.6893 0.1505 0.0151 missing 4.3478 4.4602
Arb[10] 3.3334 0.1706 0.0171 missing 4.3478 4.2515
Arb[11] 7.5663 0.0646 0.0065 missing 6.5470 1.4389
Arb[12] -9.2274 0.3382 0.0338 missing 4.3478 4.2497
...
Obviously, I’d like to do some plots to see what’s going on. For example a simple timeseries plot of several parameters. Of course, MCMCChains has this build in right?
plot(chain)
leads to the following window opening:
As well you might expect. The model has ~ 100 parameters, and you can’t possibly plot them all meaningfully on one plot.
In general, most of my Bayesian models have well over 10 parameters, and I’ve worked on models with tens of thousands. For example a model of economic outcomes in each of the Public Use Microdata Areas from the census. Each of those has ~ 100k people in it, and so given there are 330M people in the US, there are about 3300 PUMAs, each one needed 5 or 10 parameters to describe the local economic conditions… with geographic partial pooling, etc.
To me, the delight of Bayesian methods is that you can construct meaningful models like this which inevitably involve many parameters. What mechanisms are there to do meaningful plots by slicing and dicing the parameters into subsets etc? The examples in the MCMCChains website don’t really go beyond the basics.
Also for example: MCMCChains says:
" By convention, MCMCChains assumes that parameters with names of the form
"name[index]"
belong to one group of parameters called:name
. You can access the names of all parameters in achain
that belong to the group:name
by runningnamesingroup(chain, :name)
If the chain contains a parameter of name
:name
it will be returned as well.The function
group(chain, :name)
returns a subset of the chainchain
with all parameters in the group:name
."
But that doesn’t seem to be the case… For example in my model there’s Arb[n] for n from 1 to 40, which is the coefficients of the radial basis functions… but doing:
julia> group(chain,:Arb)
ERROR: UndefVarError: group not defined
Stacktrace:
[1] top-level scope at REPL[162]:1
[2] eval(::Module, ::Any) at ./boot.jl:331
[3] eval_user_input(::Any, ::REPL.REPLBackend) at /build/julia-pifKTc/julia-1.4.1+dfsg/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86
[4] run_backend(::REPL.REPLBackend) at /home/dlakelan/.julia/packages/Revise/tV8FE/src/Revise.jl:1165
[5] top-level scope at none:0
I did manage to do this:
plot(chain[:,:Arb,:])
which is apparently a way to subset the variables to those in the Arb group… but then I get:
Which is still not a usable plot.
It looks like this is useful:
plot(chain[:,[Symbol(“As[$i]”) for i in 1:10],:])
which then lets me get sort of reasonable plots… (still don’t know why it lays out with weird margins…), and shows that the NUTS sampler is not moving at all well
So, perhaps some updated info on how best to use Turing to get better mixing etc. would help. I have read through the site a fair amount, and am still not clear on how the different samplers work… like NUTS vs AdvancedHMC.NUTS etc?