Ridge Regression using Turing.jl

Hi everyone,

I’m trying to build a Ridge Regression model in Turing.jl so that I can use HMC. However, I need to sample two parameters, α and θ, which are dependent on each other via

θ ~ MvNormal(zeros(3), α^(-1) * I)
α ~ Gamma(γ + d/2, δ + norm(θ)^2)

(γ, δ, d are hyperparameters of the model)

I can’t find a way to write a Turing.jl @model that supports this kind of sampling. I thought I could use some initial value for α or θ, but I don’t know how to do that for the first step.

I want to use Hamiltonian Monte Carlo and compare it to a Gibbs sampler, which I implement as

using Distributions, LinearAlgebra

iter = 1000
α = zeros(iter+1)
θ = zeros(iter+1, 3)
α[1] = α0
θ[1,:] = θ0

for i in 1:iter
    # θ | α, y is Gaussian with covariance D = (X'X + αI)⁻¹ and mean D*X'*y
    # (assuming unit noise variance)
    D = Symmetric(inv(X'*X + α[i]*I))
    E = D * (X' * y)
    θ[i+1,:] = rand(MvNormal(E, D))

    # α | θ is Gamma with shape γ + d/2 and rate δ + norm(θ)^2, matching the model above;
    # Distributions.jl's Gamma takes shape and *scale*, hence the 1/b
    a = γ + d/2
    b = δ + norm(θ[i+1,:])^2
    α[i+1] = rand(Gamma(a, 1/b))
end

where X is the design matrix whose columns are a vector of ones and the predictors x1 and x2, and y is the vector of target values. For concreteness, the data and hyperparameter values could look like the made-up setup below. Any help would be appreciated. Thanks a lot.
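(All the values here are placeholders, just to show the shapes everything needs to have:)

using Distributions, LinearAlgebra

n = 100
x1, x2 = randn(n), randn(n)
X = hcat(ones(n), x1, x2)              # design matrix: intercept column plus the two predictors
y = X * [1.0, 2.0, -0.5] + randn(n)    # synthetic targets
d = size(X, 2)                         # number of coefficients (3 here)
γ, δ = 2.0, 1.0                        # made-up hyperparameter values
α0, θ0 = 1.0, zeros(3)                 # initial values for the Gibbs chain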

2 Likes

This is a really interesting question. PPLs almost always express joint distributions in such a way that ~ statements correspond to a factorization of the joint density. So while a PPL might output a Gibbs sampler like the one you describe, PPLs are really not set up to take one as input.
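For example, in this toy Turing model (made up just to illustrate the correspondence), each ~ contributes exactly one factor to the log joint:

using Turing

@model function demo()
    a ~ Normal(0, 1)    # contributes the factor f(a)
    b ~ Normal(a, 1)    # contributes the factor f(b | a)
end

The log joint density is log f(a) + log f(b | a), so there’s no way for two ~ statements to each condition on the other, as in your model.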

So the more common way to do this would be to either sample θ and then α|θ, or first α and then θ|α. Unfortunately, to get there I think you’d need to crank through some things by hand, like factoring

f(\alpha) f(\theta | \alpha) = f(\theta) f(\alpha | \theta)

Both sides of this are equal to f(\alpha, \theta), and the result of this factorization would give you some options for feeding it to Turing.
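For instance, if you take the f(\alpha) f(\theta | \alpha) side, a Turing model could look something like the sketch below. I’m assuming unit noise variance and that γ is a shape and δ a rate (Distributions.jl’s Gamma takes shape and scale, hence the 1/δ); the hyperparameter values in the sample call are placeholders:

using Turing, LinearAlgebra

@model function ridge(X, y, γ, δ)
    d = size(X, 2)
    α ~ Gamma(γ, 1/δ)                  # precision prior, f(α)
    θ ~ MvNormal(zeros(d), (1/α) * I)  # f(θ | α): covariance α⁻¹ I
    y ~ MvNormal(X * θ, 1.0 * I)       # likelihood, unit noise variance
end

chain = sample(ridge(X, y, 2.0, 1.0), NUTS(), 1000)

NUTS is Turing’s adaptive HMC; a plain HMC(0.05, 10) would work here too.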

BTW there’s also BayesianLinearRegression.jl that might be helpful. It’s very fast, but it’s focused on a marginal likelihood approach, so no gamma priors.

2 Likes

Do you have a reference for this phrasing of ridge regression? I’ve never seen it posed this way. Google yields some results, but I’d like to follow along.

It’s based on the approach described in Chris Bishop’s book, Pattern Recognition and Machine Learning.

EDIT: Oh wait, I thought you were asking about the BayesianLinearRegression implementation. But maybe you meant @Michaela_Maskova’s Gibbs sampler representation?

1 Like

Yeah, I was more curious about the Gibbs sampler. I’ve seen simpler PPL models that people refer to as Ridge Regression, and I want to understand what this formulation offers beyond those.
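For reference, the simpler models I mean look something like this (a sketch; the precision λ is fixed by hand rather than given a Gamma prior as above):

using Turing, LinearAlgebra

@model function simple_ridge(X, y, λ)
    d = size(X, 2)
    θ ~ MvNormal(zeros(d), (1/λ) * I)  # fixed-precision Gaussian prior, i.e. an L2 penalty of strength λ
    y ~ MvNormal(X * θ, 1.0 * I)       # unit noise variance
end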

1 Like