Hi, I am in the early stage of learning to code Neural Networks using Flux.
However, I would like to use a Bayesian Neural Network (BNN), both to mitigate overfitting and to have a way to quantify model uncertainty.
I have looked for Julia packages used for BNN, and found just two resources to start with:
- Bayesian Neural Networks · ADCME
- Bayesian Neural Networks
Are these the only two resources available?
Is there a Flux version of these?
What are the differences between these two?
I don’t think you need a special Flux version. One of the nice things about Julia is that packages generally work well together. The second link you posted shows how to use Flux in conjunction with Turing.
I think I would start with Turing + Flux, since both are very actively maintained, whereas ADCME’s last release is from May 2021. This is not my field, so I don’t know what the main difference between them is.
By the way, your first link points to ADCME v0.55 instead of 0.73, and your second link points to Turing’s dev docs rather than the stable release docs. You can select the documentation version from the dropdown menu on either page.
ADCME is a package focused largely on providing automatic differentiation (AD) routines, whereas Turing is a general Probabilistic Programming Language (PPL) that lets you write models that take priors + data and produce posterior distributions. In the case of BNNs, that means starting from prior distributions over the network parameters and transforming them into posterior distributions by conditioning on the training data. Flux, for its part, has its own AD backend, Zygote.jl, which performs backpropagation on your neural net.
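To make the “priors + data → posterior” workflow concrete, here is a minimal Turing sketch (the coin-flip data are made up, and the model is essentially the standard quickstart example):

```julia
using Turing

# Prior + data -> posterior: the core PPL workflow.
@model function coinflip(y)
    p ~ Beta(1, 1)              # prior on the probability of heads
    for i in eachindex(y)
        y[i] ~ Bernoulli(p)     # likelihood of each observed flip
    end
end

data = [1, 1, 0, 1, 0, 1, 1, 1]               # made-up coin flips
chain = sample(coinflip(data), NUTS(), 1000)  # draw posterior samples
```

The same pattern scales up to a BNN: the scalar `p` just becomes a vector of network weights.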
I agree that the best way to start is with a combination of Flux and Turing (notice that the Turing tutorial explicitly uses Flux anyway!). These are general and actively maintained libraries that also have the ability to extend easily into other projects you may come across.
TL;DR: ADCME.jl is an AD library first, and Flux already has an AD backend that you don’t need to explicitly worry about when writing Flux code. Turing.jl allows you to turn “standard” networks into Bayesian ones. The Turing.jl tutorial you found is the way to start.
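For reference, the Flux + Turing combination looks roughly like this. This is a minimal sketch loosely following the Turing BNN tutorial; the toy data, layer sizes, and prior scale are all made up for illustration:

```julia
using Flux, Turing, LinearAlgebra

# A small Flux network; destructure flattens it into a parameter
# vector plus a function that rebuilds the network from such a vector.
nn = Chain(Dense(2 => 3, tanh), Dense(3 => 1, sigmoid))
θ, rebuild = Flux.destructure(nn)

@model function bnn(xs, ys, nparams)
    # Standard-normal prior over every weight and bias
    w ~ MvNormal(zeros(nparams), I)
    net = rebuild(w)              # network with sampled parameters
    p = vec(net(xs))              # predicted probabilities
    for i in eachindex(ys)
        ys[i] ~ Bernoulli(p[i])   # Bernoulli likelihood per label
    end
end

xs = rand(2, 20)                  # toy inputs: 2 features x 20 points
ys = rand(Bool, 20)               # toy binary labels
chain = sample(bnn(xs, ys, length(θ)), NUTS(), 500)
```

Posterior predictions then come from rebuilding the network with many sampled `w` draws from `chain` and averaging the outputs, which is also where the uncertainty estimates come from.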
Thanks a lot, that clarifies things. One more question: I have a bit of experience using Stan for Bayesian models, which offers NUTS as one of its algorithms. Does Turing.jl also allow for NUTS? Reading through some example code, I could see HMC (although NUTS is a special version of it). I’m just curious about the difference between Stan and Turing.