[ANN] TuringGLM.jl: A simple @formula macro for specifying Bayesian models in Turing.jl

I am excited to announce TuringGLM.jl!

It uses the well-known @formula macro from the JuliaStats ecosystem (StatsModels.jl, GLM.jl, and MixedModels.jl) to specify a Bayesian model and returns a Turing.jl model that is ready to be sampled.

It is still a proof-of-concept package, and many functionalities are still on the roadmap.
It is heavily inspired by Python's bambi (PyMC) and R's brms (Stan).


@formula

Models are specified with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by plus signs +.

Example:

@formula(y ~ x1 + x2 + x3)

Moderations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. This is expanded to x1 + x2 + x1:x2, following the principle of hierarchy: the main effects must be included along with the interaction effect. Here x1:x2 means that the values of x1 are multiplied (interacted) with the values of x2.
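For instance, the two formulas below should describe the same model (a sketch assuming the StatsModels.jl expansion rules that TuringGLM inherits, where & is the interaction operator inside @formula):

```julia
using TuringGLM

# `*` expands to main effects plus the interaction,
# so these two formulas specify the same model:
f1 = @formula(y ~ x1 * x2)
f2 = @formula(y ~ x1 + x2 + x1 & x2)  # `&` is the interaction operator in Julia formulas
```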

Random-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical representation (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random-intercept with (1 | group).

Example:

@formula(y ~ (1 | group) + x1)
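Putting it together, a minimal end-to-end sketch (the column names and toy data are made up; @formula and turing_model are the package's entry points):

```julia
using TuringGLM
using Turing  # for the NUTS sampler

# Toy data: any Tables.jl-compatible source works; here a NamedTuple of vectors.
data = (
    y = [1.2, 2.3, 0.8, 1.9, 2.5, 1.1],
    x1 = [0.5, 1.0, 0.2, 0.9, 1.3, 0.4],
    group = ["a", "a", "b", "b", "c", "c"],
)

fm = @formula(y ~ (1 | group) + x1)

# Build the Turing.jl model (Gaussian likelihood by default) and sample it.
model = turing_model(fm, data)
chn = sample(model, NUTS(), 1_000)
```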

Data

TuringGLM supports any Tables.jl-compatible data interface. The most popular ones are DataFrames and NamedTuples.

Supported Models

TuringGLM supports non-hierarchical and hierarchical models. For hierarchical models, only single random-intercept hierarchical models are supported.

For likelihoods, TuringGLM.jl supports:

  • Gaussian() (the default if not specified): linear regression
  • Student() : robust linear regression
  • Logistic() : logistic regression
  • Pois() : Poisson count data regression
  • NegBin() : negative binomial robust count data regression
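The likelihood is selected when building the model; a sketch assuming turing_model takes it via a model keyword argument (the count-data columns here are made up):

```julia
using TuringGLM

# Hypothetical count-data example.
data = (
    n_events = [0, 2, 1, 3, 0, 4],
    exposure = [1.0, 2.0, 1.5, 3.0, 0.5, 2.5],
)

fm = @formula(n_events ~ exposure)

# Poisson count regression instead of the default Gaussian likelihood.
model = turing_model(fm, data; model=Pois())
```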

Tutorials

We have several tutorials in the documentation at Home · TuringGLM.jl.

Acknowledgements

I would like to thank the whole TuringGLM.jl dev team, especially @yebai for the support and trust. Also, @rikh was fundamental throughout the development, and I probably could not have done it without him. Thanks!


Congratulations! The package is amazing!

Is there a way to extract the underlying Turing model produced by this?


Not yet, but it is on the roadmap.


It looks like R. Can I set prior distributions on parameters?
Also, I enjoyed watching your video on YouTube!

Yes, it is a gateway drug for R users to get into Julia and Turing.jl.

Yes, check Custom Priors · TuringGLM.jl and the docstring at API reference · TuringGLM.jl.
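For reference, a sketch of setting custom priors, assuming the CustomPrior(predictors, intercept, auxiliary) constructor described in the Custom Priors docs (pass nothing for auxiliary when the likelihood has no auxiliary parameter; the data here is made up):

```julia
using TuringGLM
using Distributions

data = (y = [1.0, 2.0, 1.5, 2.2], x1 = [0.1, 0.9, 0.4, 1.1])

fm = @formula(y ~ x1)

# Custom priors instead of the defaults: Normal(0, 2.5) on the coefficients,
# Normal(0, 10) on the intercept, and no auxiliary-parameter prior.
priors = CustomPrior(Normal(0, 2.5), Normal(0, 10), nothing)

model = turing_model(fm, data; priors=priors)
```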


How can I tell which parameters correspond to the @formula? E.g. if we consider the cheese example, how could I tell which of the z_i[n] corresponds to the type of cheese?

After figuring it out, I’d be happy to PR an addition to the docs to help users like myself.


Is there a plan for a less macro-based construction?


@Alec_Loudenback,

Under the hood, TuringGLM converts the strings to integers using this specific function, which is called here and used in the turing_model function here.

Unfortunately, the ordering is the order of first appearance in the grouping variable.

A PR idea would be to apply unique and then sort beforehand, so that z[n] follows an alphabetical (for strings) or ascending (for integers) order.
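That fix could look roughly like this (a hypothetical helper, not the package's actual code):

```julia
# Map group labels to integer indices in sorted (alphabetical/ascending) order,
# instead of order of first appearance.
function sorted_level_codes(v::AbstractVector)
    levels = sort(unique(v))                          # stable, sorted level set
    index = Dict(l => i for (i, l) in enumerate(levels))
    return [index[x] for x in v]
end

sorted_level_codes(["b", "a", "c", "a"])  # should give [2, 1, 3, 1]
```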

We used the @formula from the JuliaStats ecosystem so that we don't have to reinvent the wheel.
But we would be glad to take ideas (and PRs) for different ways to build models apart from @formula.