I have recently become interested in variational inference. I like the idea of linking Bayesian modeling with optimization, and I like its unimodal approximation to the parameters of a model (by assuming the variational posterior is a normal distribution). I have read the document on variational inference on the Turing website, https://turing.ml/dev/docs/for-developers/variational_inference. I have tried to understand the code in https://github.com/TuringLang/AdvancedVI.jl/tree/master/src, but I am having a difficult time with it.
Here is a simple scenario. Suppose I have a data set with 100 observations and two variables, x and y, where x is a continuous predictor and y is a binary outcome. I want to fit a logistic regression with the single predictor x to predict y, and use variational inference to estimate the distribution of the coefficient z on x. The prior on z is assumed to be a standard normal distribution.
```julia
using Turing
using StatsFuns: logistic

@model logistic_regression(x, y, n) = begin
    intercept ~ Normal(0, 1)
    z ~ Normal(0, 1)
    for i = 1:n
        # probability that y[i] = 1 given x[i]
        v = logistic(intercept + z * x[i])
        y[i] ~ Bernoulli(v)
    end
end;
```
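For reference, my understanding from the Turing docs is that one would then fit the variational approximation with something like the following (10 Monte Carlo samples per step, 1000 optimization steps; the data here are made up just so the snippet runs):

```julia
using Turing

x = randn(100)            # made-up continuous predictor
y = rand(0:1, 100)        # made-up binary outcome

model = logistic_regression(x, y, 100)
advi = ADVI(10, 1000)     # samples per step, max iterations
q = vi(model, advi)       # mean-field normal approximation
```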
According to the document above, in order to estimate the parameters of a model we need to maximize the (Monte Carlo estimate of the) ELBO,

$$\widehat{\mathrm{ELBO}}(q) = \frac{1}{m} \sum_{k=1}^{m} \sum_{i=1}^{n} \log p(x_i, z_k) + \mathbb{H}(q(z)),$$

where the $z_k$ are samples from $q$. I want to understand how the ELBO is calculated with the above model. I have not tried to understand the optimization part yet.
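To check my understanding, here is a naive sketch of that estimator in plain Julia. This is my own paraphrase, not Turing's implementation: it assumes a single scalar parameter $z$ (ignoring the intercept for simplicity), a mean-field normal $q_{\mu,\sigma}$, and a function `logπ(z)` returning $\log p(x_{1:n}, z)$ for the full data set.

```julia
using Distributions, Statistics

# Naive Monte Carlo ELBO estimate: average the log joint over m draws
# z_k ~ q, then add the closed-form entropy H(q) of the normal q.
function naive_elbo(logπ, μ, σ, m)
    q = Normal(μ, σ)
    return mean(logπ(rand(q)) for k in 1:m) + entropy(q)
end
```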
Please let me know if the following is right.
$$\log p(x_i, z_k) = \log\big(p(x_i \mid z_k)\, p(z_k)\big) = \log \mathrm{InvLogit}(\mathrm{intercept} + z_k x_i) + \log\!\left(\frac{e^{-z_k^2/2}}{\sqrt{2\pi}}\right),$$

where $x_i$ is sampled from the data set, and $z_k$ is sampled from $q_{\mu,\sigma} = \mathcal{N}(\mu, \sigma^2)$.
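In code, my understanding of that per-observation term is the following. This is a hypothetical helper of my own, not anything in Turing; note that the Bernoulli term also depends on $y_i$, which I left implicit in the formula above.

```julia
using Distributions
using StatsFuns: logistic

# One term log p(xᵢ, zₖ) of the sum: Bernoulli log-likelihood of yᵢ
# plus the standard normal log-prior of zₖ.
function logjoint_one(intercept, z, xi, yi)
    v = logistic(intercept + z * xi)   # InvLogit(intercept + zₖ·xᵢ)
    return logpdf(Bernoulli(v), yi) + logpdf(Normal(0, 1), z)
end
```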
In Turing, is $\log p(x_i, z_k)$ calculated using the two functions in https://github.com/TuringLang/Turing.jl/blob/master/src/variational/VariationalInference.jl?
```julia
function make_logjoint(model::Model; weight = 1.0)
    # setup
    ctx = DynamicPPL.MiniBatchContext(
        DynamicPPL.DefaultContext(),
        weight
    )
    varinfo_init = Turing.VarInfo(model, ctx)

    function logπ(z)
        varinfo = VarInfo(varinfo_init, SampleFromUniform(), z)
        model(varinfo)
        return getlogp(varinfo)
    end

    return logπ
end

function logjoint(model::Model, varinfo, z)
    varinfo = VarInfo(varinfo, SampleFromUniform(), z)
    model(varinfo)
    return getlogp(varinfo)
end
```
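If I read `make_logjoint` right, it closes over the model and returns a function `logπ` mapping a vector of parameter values to the log joint, so one could (hypothetically, possibly with a `Turing.Variational` qualifier) call it like this:

```julia
# Hypothetical usage of make_logjoint (my reading of the snippet above):
# logπ maps a parameter vector z to log p(x, z).
logπ = make_logjoint(logistic_regression(x, y, 100))
logπ([0.1, -0.3])   # log joint at intercept = 0.1, z = -0.3
```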
In https://github.com/TuringLang/Turing.jl/blob/master/src/variational/objectives.jl, the objective seems to be calculated by calling an `ELBO` object as a function:
```julia
function (elbo::ELBO)(
    rng::AbstractRNG,
    alg::VariationalInference,
    q,
    model::Model,
    num_samples;
    weight = 1.0,
    kwargs...
)
    return elbo(rng, alg, q, make_logjoint(model; weight = weight), num_samples; kwargs...)
end
```
I do not understand how the ELBO is calculated from these few lines of code; this method seems only to forward to another `elbo` method that takes `make_logjoint(model)` in place of `model`.
The entropy part, $\mathbb{H}(q(z))$, seems to be addressed in https://github.com/TuringLang/Turing.jl/blob/master/src/variational/advi.jl, right?
```julia
if q isa TransformedDistribution
    res += entropy(q.dist)
else
    res += entropy(q)
end
```
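For a mean-field normal $q$, my understanding is that this entropy has the familiar closed form $\mathbb{H}(q) = \tfrac{1}{2}\log(2\pi e \sigma^2)$ per factor, e.g. (using Distributions.jl just to illustrate):

```julia
using Distributions

σ = 0.5
# Closed-form entropy of a normal variational factor: H(q) = ½ log(2πeσ²).
entropy(Normal(0.0, σ)) ≈ 0.5 * log(2π * ℯ * σ^2)   # true
```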
In sum, could someone give me some guidance on how to read the Turing code for variational inference?