New Bayesian Statistics tutorials with Turing

I’ve just made some Bayesian Statistics tutorials using Julia and Turing. The content is fully open-sourced on GitHub and has a very permissive Creative Commons license (which means you can copy and alter the crap out of it…).

My hope is to bring more people into Bayesian statistics. The content is aimed at social scientists and PhD candidates in the social sciences. I chose to provide an intuitive approach rather than focusing on rigorous mathematical formulations. I’ve made it the way I would have liked to be introduced to Bayesian statistics.

Check it out at https://storopoli.io/Bayesian-Julia/ or at https://github.com/storopoli/Bayesian-Julia

58 Likes

Brilliant. Thanks for this. Turing is ideal for learning Bayesian stats and probabilistic programming. I love that (much like Julia) it has a high-level, easy syntax, but you can dig deeper to access greater complexity and performance when needed.

1 Like

It looks very interesting, I’ll read it when I have some time.
Thank you.

I think you missed the “beta distribution”, very common in Bayesian Statistics.

And I have a quick question…
Most frequentist statistics only use the mean and the variance of the sample.
What about Bayesian statistical models? Do they use just the mean and the variance, or do they also make use of all the information in the measurements, i.e. all the moments?

We mostly use the full posterior density. And then you can calculate any moment or any other statistic you might fancy.
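For instance, here is a minimal sketch (the toy coin-flip model is just for illustration, it is not from the tutorials): once you have the sampled chain, any summary is just a reduction over the posterior draws.

using Turing, Statistics

# Toy model: infer the probability of heads from a handful of coin flips.
@model function coin(y)
    p ~ Beta(1, 1)       # flat prior on the probability of heads
    y .~ Bernoulli(p)    # each flip is a Bernoulli draw with probability p
end

chain = sample(coin([1, 0, 1, 1, 0, 1]), NUTS(), 1_000)

# The chain holds the full posterior draws, so any statistic is a reduction over them:
draws = vec(chain[:p])
mean(draws), var(draws), quantile(draws, [0.025, 0.975])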

2 Likes

This looks really interesting and I’ll definitely work through the material in the next few days.

@Storopoli, quick question: in https://storopoli.io/Bayesian-Julia/pages/4_Turing/ there’s this model:

@model dice_throw(y) = begin
    # Our prior belief about the probability of each result in a six-sided dice:
    # p is a vector of 6 probabilities that sum up to 1.
    p ~ Dirichlet(6, 1)

    # Each throw of the six-sided dice follows a Categorical distribution with probabilities p.
    for i in eachindex(y)
        y[i] ~ Categorical(p)
    end
end;

And then it says:

Instead, I’ve opted for a Dirichlet with a weakly informative prior towards a “fair” dice which is encoded as a Dirichlet(6,1).

  • What’s a “weakly informative prior towards a ‘fair’ dice”?
  • Why do we choose Dirichlet(6, 1) and not, say, Dirichlet(6, 8)? As far as I can tell, all of them have mean 0.166667:
using Distributions

all(
	all(mean(Dirichlet(6, a)) .≈ 1/6)
	for a ∈ 0.01:.001:30
) == true

This is going to be a philosophical question? I don’t know, I think a length-6 vector of 1/6?

Good question. I honestly don’t know. Looking at the help:

help?> Dirichlet
search: Dirichlet DirichletMultinomial

  Dirichlet

  The Dirichlet distribution (http://en.wikipedia.org/wiki/Dirichlet_distribution) is often used as the conjugate prior for Categorical
  or Multinomial distributions. The probability density function of a Dirichlet distribution with parameter \alpha = (\alpha_1, \ldots,
  \alpha_k) is:

  f(x; \alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^k x_i^{\alpha_i - 1}, \quad \text{ with }
B(\alpha) = \frac{\prod_{i=1}^k \Gamma(\alpha_i)}{\Gamma \left( \sum_{i=1}^k \alpha_i \right)},
\quad x_1 + \cdots + x_k = 1

  # Let alpha be a vector
  Dirichlet(alpha)         # Dirichlet distribution with parameter vector alpha
  
  # Let a be a positive scalar
  Dirichlet(k, a)          # Dirichlet distribution with parameter a * ones(k)

and then:

julia> mean(Dirichlet(6,1))
6-element Fill{Float64}: entries equal to 0.16666666666666666

julia> mean(Dirichlet(6,8))
6-element Fill{Float64}: entries equal to 0.16666666666666666

Dirichlet(6, 8) implies you are quite confident that it’s a fair dice, whereas Dirichlet(6, 1) says “I don’t really know”.

1 Like

The alpha in a Dirichlet is a “concentration”. Maybe this helps:


julia> using Distributions, Statistics

julia> std(rand(Dirichlet(6,1),1000),dims=2)
6×1 Matrix{Float64}:
 0.14305158706755236
 0.14043323912305447
 0.12959278663394877
 0.14360349478154502
 0.14403746553339422
 0.13586448356346292

julia> std(rand(Dirichlet(6,8),1000),dims=2)
6×1 Matrix{Float64}:
 0.05263278171058433
 0.0531789315822342
 0.0524370633051767
 0.05355511158379274
 0.05450811672204576
 0.05293590862720139

A concentration of alpha=1 gives a uniform distribution over the simplex. Also, it can help to remember that the Dirichlet distribution generalizes a Beta distribution, where alpha and beta play the same role.
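To make the Beta connection concrete, here is a quick check (just a sketch using Distributions; the numbers are arbitrary):

using Distributions

# A 2-dimensional Dirichlet is a Beta in disguise: its density at (x, 1 - x)
# matches the Beta(a, b) density at x.
a, b = 2.5, 4.0
x = 0.3
pdf(Dirichlet([a, b]), [x, 1 - x]) ≈ pdf(Beta(a, b), x)  # true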

6 Likes

Oh, I thought that was a term that I didn’t understand.

I messed around with the Dirichlet distribution at https://ben18785.shinyapps.io/distribution-zoo/, and it looks like when the vector of parameters is all ones, like Dirichlet(ones(3)) or Dirichlet(3, 1), we get a sort of “uniform”, flat distribution.

It seems to behave similarly to the Beta distribution: when both of its parameters are high, its PDF starts looking more and more like a normal with mean 0.5. When all n of the Dirichlet’s parameters are equal and increase, the PDF also concentrates more and more around the middle: 1/n.

So it looks to me like Dirichlet(ones(n)) has the flattest PDF and is thus the most “uniform” among Dirichlet distributions.
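A quick numerical check of that flatness (a sketch, assuming Distributions is loaded; the two points on the simplex are arbitrary):

using Distributions

# With all parameters equal to 1, the density is the same everywhere on the simplex:
d = Dirichlet(ones(3))
pdf(d, [0.2, 0.3, 0.5]) ≈ pdf(d, [0.6, 0.1, 0.3])  # both equal 2.0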

Yeah, I was aiming for a uniform, flat distribution. Also, what a great demonstration by @cscherrer, thanks!

1 Like

BTW, @Storopoli, these are really nice tutorials - way easier to understand than those in Turing’s documentation. So thank you!

Hey @ForceBru thanks! I was invited to give a talk to Stuttgart’s Julia Meetup Group.

I will cover some stuff that I haven’t gone over in the tutorials. It will be next Saturday and open to everybody. It will be in English.

Edit: I’ve found that Turing’s examples are more geared towards computer science/machine learning folks, not so much towards someone who had classical stats training and then learned it all over again with Bayesian statistics.

1 Like

Cool, just signed up for the talk!

1 Like

updated link here: Bayesian Statistics using Julia and Turing

3 Likes

Hi @Storopoli, I am a new user of Turing and Julia for Bayesian modeling. I was trying to replicate a basic linear regression model from your GitHub link (the Children’s IQ Score example). I could replicate the model but could not plot the posterior results / trace using StatsPlots.

When I type using StatsPlots and try to run it in a Jupyter notebook, it gives errors stating that one of StatsPlots’ dependencies, Arpack, has not been installed correctly. See this link (I posted this under New To Julia): Arpack error with StatsPlots

Any pointers in this regard? I’m coming from R- and Python-based Bayesian models using brms and rstan, as well as pymc. Are there any alternatives to StatsPlots that would allow me to plot the Bayesian model results?

That is definitely an install error. Are you using Windows?
You can plot them with Makie and AlgebraOfGraphics. See here: Rewriting MCMCChains with Makie.jl+AlgebraOfGraphics · Issue #306 · TuringLang/MCMCChains.jl · GitHub.
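Roughly, the workflow could look like this (just a sketch, not tested here: the column names :iteration and :chain, and the parameter :p, are what I would expect to get from DataFrame(chain)):

using DataFrames, CairoMakie, AlgebraOfGraphics

# `chain` is assumed to be an MCMCChains.Chains object with a parameter called :p.
df = DataFrame(chain)   # Chains supports the Tables.jl interface

# Trace plot: iterations on x, draws of :p on y, one color per chain.
plt = data(df) * mapping(:iteration, :p; color = :chain => nonnumeric) * visual(Lines)
draw(plt)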

In fact I am going to use my “winter break” (summer break in the northern hemisphere) to try to move all plots to Makie + AlgebraOfGraphics. Issue: Move images from `Plots.jl` to `Makie.jl` + `AlgebraOfGraphics.jl` · Issue #46 · storopoli/Bayesian-Julia · GitHub.

2 Likes

Thank you so much for a quick response @Storopoli

I’m on Ubuntu 21.04 using Julia 1.7.3. After posting my question on alternative graphing libraries, I found an online reference to using Turing and ArviZ together. For the time being, I’m using ArviZ.

I have tried to install StatsPlots and its dependencies multiple times, following different instructions on GitHub and Discourse based on other Linux users’ experiences, but so far without success. I have also opened an issue on the StatsPlots GitHub.

I will check the reference on using Makie.jl - thanks again for your help.

1 Like

@Storopoli I have another question for you: does Turing have a front-end equivalent such as brms for rstan in R, or bambi for pymc in Python? Both brms and bambi permit a simple formula-based approach to specifying, for example, advanced Bayesian multi-level models.

The brms model specification is very much like that of lme4 in R, and bambi, inspired by brms, follows a specification very close to that of statsmodels in Python. For myself and a lot of others, both of these interfaces (brms and bambi) made Bayesian modeling more accessible and simpler.

Would love to see something like that for Turing as well.

I would check out TuringGLM if I were you.
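With TuringGLM the workflow is roughly like this (a sketch with made-up data; check the package docs for the exact API):

using Turing, TuringGLM, DataFrames

# Made-up data just to show the formula interface.
df = DataFrame(x1 = randn(100), x2 = randn(100), y = randn(100))

fm = @formula(y ~ x1 + x2)       # brms/lme4-style formula
model = turing_model(fm, df)     # builds a Turing model from the formula and data
chain = sample(model, NUTS(), 1_000)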

2 Likes