Hi there,
I have recently become interested in variational inference. I like the idea of linking Bayesian modeling with optimization, and I like its unimodal approximation to the parameters of a model (by assuming the variational posterior is a normal distribution). I have read the variational inference document on the Turing website, Variational Inference. I have tried to understand the code in AdvancedVI.jl/src at master · TuringLang/AdvancedVI.jl · GitHub, but I am having a hard time with it.
Here is a simple scenario. Suppose I have a data set with 100 observations and two variables, x and y, where x is a continuous predictor and y is a binary outcome. I want to use a logistic regression with the single predictor x to predict y, and use variational inference to estimate the distribution of the coefficient z on x. The prior on z is assumed to be a standard normal distribution.
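For concreteness, here is how I would simulate such a data set (just a sketch with made-up true parameter values; the exact numbers do not matter for my question):

using Distributions, StatsFuns

# 100 observations: continuous predictor x, binary outcome y
# (the true intercept and slope here are arbitrary, only for illustration)
x = randn(100)
true_intercept, true_z = 0.5, 1.5
y = [rand(Bernoulli(logistic(true_intercept + true_z * xi))) for xi in x]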
using Turing
using StatsFuns: logistic

@model logistic_regression(x, y, n) = begin
    intercept ~ Normal(0, 1)
    z ~ Normal(0, 1)
    for i = 1:n
        v = logistic(intercept + z * x[i])
        y[i] ~ Bernoulli(v)
    end
end;
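For reference, this is how I would then run ADVI on the model (following the tutorial; the ADVI settings are arbitrary, I include this only to make the question concrete):

m = logistic_regression(x, y, 100)
advi = ADVI(10, 1000)   # 10 Monte Carlo samples per ELBO estimate, 1000 optimization steps
q = vi(m, advi)         # fitted variational approximation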
According to the document above, we need to maximize

$$\mathrm{ELBO}(q) \approx \frac{1}{m} \sum_{k=1}^{m} \sum_{i=1}^{n} \log p(x_i, z_k) \;+\; \mathbb{H}\big(q(z)\big), \qquad z_k \sim q(z)$$
in order to estimate the parameters of the model. I want to understand how the ELBO is calculated for the above model; I have not tried to understand the optimization part yet. Please let me know if the following is right.
$\log p(x_i, z_k) = \log\big(p(x_i \mid z_k)\, p(z_k)\big) = \log\!\left(\operatorname{logistic}(\mathrm{intercept} + z_k x_i) \cdot \frac{1}{\sqrt{2\pi}} e^{-z_k^2/2}\right)$, where $x_i$ is sampled from the data set and $z_k$ is sampled from $q_{\mu,\sigma} = \mathcal{N}(\mu, \sigma^2)$.
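In code, my understanding of the estimator would be something like this (my own naive sketch, not anything from Turing; qμ and qσ are the current variational parameters, and I treat the intercept as a fixed number just to keep the sketch short):

using Distributions, StatsFuns

# naive Monte Carlo ELBO estimate for the single coefficient z,
# following my reading of the formula above
function naive_elbo(qμ, qσ, intercept, x, y, m)
    q = Normal(qμ, qσ)
    acc = 0.0
    for k in 1:m
        zk = rand(q)                                       # z_k ~ q_{μ,σ}
        for i in eachindex(x)
            loglik = logpdf(Bernoulli(logistic(intercept + zk * x[i])), y[i])
            acc += loglik + logpdf(Normal(0, 1), zk)       # log p(x_i, z_k)
        end
    end
    return acc / m + entropy(q)                            # Monte Carlo average + H(q)
end

Is this what the formula above is doing?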
In Turing, is $\log p(x_i, z_k)$ calculated using the two functions in Turing.jl/VariationalInference.jl at master · TuringLang/Turing.jl · GitHub?
function make_logjoint(model::Model; weight = 1.0)
    # setup
    ctx = DynamicPPL.MiniBatchContext(
        DynamicPPL.DefaultContext(),
        weight
    )
    varinfo_init = Turing.VarInfo(model, ctx)

    function logπ(z)
        varinfo = VarInfo(varinfo_init, SampleFromUniform(), z)
        model(varinfo)
        return getlogp(varinfo)
    end

    return logπ
end
function logjoint(model::Model, varinfo, z)
    varinfo = VarInfo(varinfo, SampleFromUniform(), z)
    model(varinfo)
    return getlogp(varinfo)
end
In https://github.com/TuringLang/Turing.jl/blob/master/src/variational/objectives.jl, the objective seems to be calculated by calling an ELBO object as a function:
function (elbo::ELBO)(
    rng::AbstractRNG,
    alg::VariationalInference,
    q,
    model::Model,
    num_samples;
    weight = 1.0,
    kwargs...
)
    return elbo(rng, alg, q, make_logjoint(model; weight = weight), num_samples; kwargs...)
end
I do not understand how the ELBO is actually calculated from these few lines of code; they only seem to wrap make_logjoint(model) and forward everything to another method.
The entropy part seems to be addressed in Turing.jl/advi.jl at master · TuringLang/Turing.jl · GitHub, right?
if q isa TransformedDistribution
    res += entropy(q.dist)
else
    res += entropy(q)
end
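Putting the pieces together, is the overall computation roughly equivalent to the following (again just my guess at the structure, not actual Turing code, and ignoring the TransformedDistribution branch)?

using Distributions, Random

# my guess at the overall ELBO estimator: draw num_samples values from q,
# average logπ over them, then add the entropy of q
function my_elbo(rng::AbstractRNG, q, logπ, num_samples)
    res = sum(logπ(rand(rng, q)) for _ in 1:num_samples) / num_samples
    res += entropy(q)
    return res
end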
In sum, could someone give me some guidance on how to read the Turing code for variational inference?
Thanks,
Chuan