[ANN] Announcing MarginalLogDensities.jl

Hi all, I would like to announce a the initial release of a package that’s been in (sporadic) development for a while now, MarginalLogDensities.jl!

This package was originally motivated by these discussions about a Julia equivalent for TMB. The goal is to make it simple to fit statistical models with latent variables or random effects by integrating them out and focusing only on the parameters of real interest.

There are other Julia packages for mixed models, but the goal of this one is generality: all you need is a log-likelihood or log-density function with the signature f(u, data) (where u are the parameters and data is the data), and a set of indices for the variables in u that you want to integrate out. (I use the convention that \mathbf{u} = \mathbf{v} \cup \mathbf{w}, where \mathbf{v} are the variables of interest/fixed effects and \mathbf{w} are the nuisance parameters/random effects).

You then wrap your function in a MarginalLogDensity object, which can be called like the original–with the exception that it only depends on \mathbf{v}, integrating over \mathbf{w} under the hood. By default this is done via the Laplace approximation, though you can also use numerical cubature if the dimensionality of \mathbf{v} isn’t too high. See the very basic example below:

using MarginalLogDensities, Distributions, LinearAlgebra
N = 3
σ = 1.5
dist = MvNormal(σ^2 * I(3))
data = (N=N, dist=dist)

function logdensity(u, data)
    return logpdf(data.dist, u)

u = rand(3)       # arbitrary initial values
iw = [1, 3]       # indices of parameters to marginalize
initial_v = [1.0] # arbitrary starting value
length(initial_v) == N - length(iw) # true
marginal_logdensity = MarginalLogDensity(logdensity, u, iw, data)

# Check against the analytic marginal density:
marginal_logdensity(initial_v, data) ≈ logpdf(Normal(0, σ), only(initial_v)) # true

There is also a method defined for Optim.optimize, making it easy to find MLE/MAP estimates for v:

using Optim
fit = optimize(marginal_logdensity, initial_v, data)

This optimization is like a continuous version of the expectation-maximization algorithm. (NB, from here, it’s only a hop, skip and jump to doing full INLA…) For an application in a more realistic hierarchical regression problem, see the self-contained example here.

MarginalLogDensities.jl makes heavy use of the tools in Optimization.jl and SparseDiffTools.jl, and can exploit the sparsity structure of your function to make the Hessian calculations and Laplace approximations highly efficient. Automatic sparsity detection is still a bit hacky, and I’m sure there are ways various parts can be better optimized. But I figured the package was solid enough at this point to get some more eyes on it, both to point out any obvious mistakes or performance problems (I’m not actually an expert in this stuff!) and to judge the level of interest in developing it further. Happy to answer any questions!