What kind of prior distribution should I use for binary and other non-continuous variables?

Hi, guys.

I have a question about building a linear model.

I’ve read some examples of linear models with continuous predictors.
But I don’t know how to set prior distributions for binary (boolean), categorical, or ordinal variables.

Let’s say I have a dataframe x_train and a response y_train (IQ score).
x_train consists of 4 variables: age, sex, self-esteem, and favorite-fruit.

age is a continuous variable, sex is binary, self-esteem is ordinal (1 to 5), and favorite-fruit is categorical (1 to 6).

How should I build a linear model with non-informative or weakly informative priors?

@model function lin_reg(x, y)
    # priors
    α ~ Normal(mean(y), 10) # intercept
    σ ~ Exponential(1) # sigma
    beta1 ~ Normal(0, 10) # age
    beta2 ~ Bernoulli(0.5) # sex
    beta3 ~ DiscreteUniform(1, 5) # self-esteem
    beta4 ~ DiscreteUniform(1, 6) # fruit

    μ = α .+ beta1 * x[:, :age] .+ beta2 * x[:, :sex] .+ beta3 * x[:, :esteem] .+ beta4 * x[:, :fruit]

    y ~ MvNormal(μ, σ)
end

Thank you!

If x is binary or ordinal, its effect on y (i.e. the beta) is often not binary or ordinal itself. The prior describes the coefficient, not the predictor, and the betas are often continuous, so their priors can be set accordingly. Try beta ~ Uniform(-a, a) with a large a, or beta ~ Normal(0, sigma) with a large sigma, or something similar, and see how it goes.
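
As a rough sketch of what that could look like in Turing: every coefficient below gets a continuous Normal prior. The argument names, the Normal(0, 10) scales, treating self-esteem as a plain numeric score, and dummy-coding fruit with level 1 as the reference are all my assumptions, not something taken from your post, so adjust them to your data.

```julia
using Turing
using LinearAlgebra: I
using Statistics: mean

# Sketch only: argument names, prior scales, and the reference-level coding are assumptions.
@model function lin_reg(age, sex, esteem, fruit, y)
    α ~ Normal(mean(y), 10)                # intercept
    σ ~ Exponential(1)                     # residual standard deviation
    β_age ~ Normal(0, 10)                  # slope for continuous age
    β_sex ~ Normal(0, 10)                  # shift for sex == 1 relative to sex == 0
    β_esteem ~ Normal(0, 10)               # slope, treating the 1-5 scale as numeric
    β_fruit ~ filldist(Normal(0, 10), 5)   # shifts for fruit levels 2-6; level 1 is the reference

    # Prepend a zero for the reference level so the integer codes 1:6 can index directly.
    fruit_shift = vcat(zero(eltype(β_fruit)), β_fruit)

    μ = α .+ β_age .* age .+ β_sex .* sex .+ β_esteem .* esteem .+ fruit_shift[fruit]
    y ~ MvNormal(μ, σ^2 * I)
end
```

Assuming x_train has columns age, sex, esteem, and fruit (with sex coded 0/1 and fruit stored as integers 1 to 6), you could then sample with something like `sample(lin_reg(x_train.age, x_train.sex, x_train.esteem, x_train.fruit, y_train), NUTS(), 1000)`. Fruit gets one coefficient per non-reference level because its codes 1 to 6 are only labels, so a single slope multiplied by the code would impose an ordering that is not there.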
