This question might reveal my limited knowledge of Bayesian estimation, but I am having difficulty figuring out how to perform basic Bayesian model selection in Turing. The documentation for MCMCChains.jl has a short section on DIC, which I’m not very familiar with. How does one calculate Bayes factors, for example? I can’t even figure out how to get model Evidence from the output of sample(). (I’m using Metropolis-Hastings algorithm). Any tips or directions of where to look would be greatly appreciated.
I don’t know much about the metrics for model selection, but do know that various statisticians argue against Bayes factors (Tendeiro & Kiers, 2019; Gelman et al., 2020).
As far as I understand, McElreath (2020) and Gelman et al. (2020) advise to do model comparison based on predictive performance measures such as cross-validation and AIC. McElreath (2020) argues that these metrics are quite similar.
But, I think that it also depends on where you want to publish. My co-authors aren’t comfortable with cross-validation, so I will stick to a multiverse analysis (Steegen et al., 2016) with a multilevel model to reduce overfitting and clear graphs to show what the model does.
As far as I can tell, there are two options currently, First, there is a Julia interface to ArviZ, which is written in Python. Second, there is a package associated with the book Statistical Rethinking that implements WAIC and LOO. The downside to the second option is that it does not currently have diagnostics for the model selection criteria.
Thank you both for the links, these are very useful. I will brush up on which approaches are most appropriate for my case and see if I can implement something.
I’ll also look more closely at DIC in MCMCChains. I can’t seem to get it to work. It gives me an error that I haven’t used the correct inputs for logpdf(), and I’m not sure exactly what needs to be in there (for both the distribution and x). I’m assuming it’s asking for the likelihood function (mine is BinomialLogit). Here is the function:
Under my proposal, the function dic would create the lpfun function and perform the calculations as normal. This way the user does not need to modify functions.
@cpfiffer, would this be a better way to compute DIC or did I misunderstand the documentation?
Yeah, I think that’s probably a better way to do it. It seems like a pretty trivial thing to add that extension into MCMCChains – maybe a good first issue for somebody.
I tried to change the API, but I encountered several issues with the lpfun function. It almost seems like it was never fully tested and implemented. I opened an issue.
@ANasc, just a quick status update: DIC was added to a new repo called ModelComparisons, along with BIC and AIC. This repo also has WAIC and PSILOO. It will take a few days for registration to be approved. I will also submit a change to MCMCChains so that you can calculate DIC directly from a Chain object. If you need to use DIC immediately, you can switch to my forks for MCMCChains and ModelComparisons until new releases are official. I plan on adding functionality to hook MCMCChains into WAIC and PSILOO in the next week or so.
We encountered a few issues during the process. One is a licensing issue for PSILOO. My plan is to try to implement a new version as a work around. I am also trying to figure out the best place to have this functionality. I will follow up with an update once I know more.
The PSILOO code is under a GNU license. Other developers are reluctant to use it because under some interpretations, the GNU licence would apply to dependencies and supercede the MIT license.