Hey,
I was wondering if someone could guide me (or point me to the relevant code) on how to sample from the power posterior of a Turing model. By power posterior, I mean the posterior of the modified joint: p(x|y; b) \propto p(y|x)^b p(x), where b \in [0,1].
I don’t know about Turing, but it should be relatively straightforward to build a logdensity function like this for Soss.
I like the way the power posterior connects the prior and posterior, but I don’t know any applications for sampling from it. Is this something that comes up a lot in some area?
It appears in particular in Thermodynamic Integration, for which I created a small package: GitHub - theogf/ThermodynamicIntegration.jl: Thermodynamic Integration for Turing models and more.
I wanted to create some nice wrapper where the user gives a model and it automatically spits out the evidence. I would be happy to create a wrapper for Soss models as well!
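For readers who have not seen thermodynamic integration: it rests on the identity log Z = \int_0^1 E_{p(x|y;b)}[log p(y|x)] db. Below is a minimal sketch of that identity on a toy conjugate Gaussian model where every quantity is available in closed form. The model and all function names are assumptions made for illustration; this is not how ThermodynamicIntegration.jl is implemented, and a real model would need samples from each power posterior instead of the analytic moments.

```julia
# Toy conjugate model (an assumption for illustration, not from the package):
#   x ~ Normal(0, 1),   y_i | x ~ Normal(x, 1),  i = 1..n
# The power posterior p(x|y; b) ∝ p(y|x)^b p(x) is Gaussian here, so the
# expected log-likelihood has a closed form and no sampler is needed.

# Mean and variance of the power posterior at exponent b.
function power_posterior_moments(y, b)
    n = length(y)
    v = 1 / (1 + b * n)      # power-posterior variance
    m = b * sum(y) * v       # power-posterior mean
    return m, v
end

# E[log p(y|x)] under the power posterior at exponent b.
function expected_loglik(y, b)
    n = length(y)
    m, v = power_posterior_moments(y, b)
    return -n / 2 * log(2π) - (sum(abs2, y .- m) + n * v) / 2
end

# Thermodynamic integration: integrate E_b[log p(y|x)] over b in [0, 1]
# with the trapezoid rule on an evenly spaced grid.
function ti_logevidence(y; nsteps = 1_000)
    bs = range(0, 1; length = nsteps + 1)
    vals = [expected_loglik(y, b) for b in bs]
    h = step(bs)
    return h * (sum(vals) - (vals[1] + vals[end]) / 2)
end

# Exact log evidence for comparison: marginally y ~ Normal(0, I + 11ᵀ).
function exact_logevidence(y)
    n = length(y)
    quad = sum(abs2, y) - sum(y)^2 / (1 + n)
    return -n / 2 * log(2π) - log(1 + n) / 2 - quad / 2
end

y = [0.3, -1.2, 0.8, 1.5, 0.1]
println((ti = ti_logevidence(y), exact = exact_logevidence(y)))
```

The two printed numbers should agree up to the discretization error of the grid over b.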
Nice! In the log-density this just comes in as a multiplicative constant, so I don’t think there will be any measurable overhead (the rest of the computation is so much more). So I may just add a switch for this to logdensity.
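To make the "multiplicative constant" point concrete, here is a small sketch in plain Julia, without Soss or Turing. The toy model, the helper names, and the keyword name `b` are all hypothetical choices for illustration, not an existing API.

```julia
# Hypothetical toy model, written without Turing or Soss:
#   x ~ Normal(0, 1),   y_i | x ~ Normal(x, 1)
normal_logpdf(z, μ, σ) = -log(σ) - log(2π) / 2 - ((z - μ) / σ)^2 / 2

logprior(x) = normal_logpdf(x, 0.0, 1.0)
loglik(x, y) = sum(normal_logpdf(yi, x, 1.0) for yi in y)

# Unnormalized log power posterior: the exponent b becomes a scalar factor
# on the log-likelihood, and the log-prior is left untouched.
power_logdensity(x, y; b = 1.0) = b * loglik(x, y) + logprior(x)

y = [0.3, -1.2, 0.8]
x = 0.5
# b = 0 should recover the log-prior, b = 1 the usual log-joint.
println(power_logdensity(x, y; b = 0.0) == logprior(x))
println(power_logdensity(x, y; b = 1.0) ≈ loglik(x, y) + logprior(x))
```

The extra cost over the ordinary log-joint is a single scalar multiplication per evaluation.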
What’s a good descriptive name for the exponent b as a keyword argument?
In physics they call it the coupling parameter, cf. Thermodynamic integration - Wikipedia
I’m seeing it called the temperature in some places:
Is that consistent with other uses of “temperature”?
Also
I think the issue with temperature is that it could be confused with tempered posteriors, which are quite different: for p(x|y) \propto \exp(-V(x,y)),
the tempered posterior is given by p(x|y;T) \propto \exp(-\frac{V(x,y)}{T}), where T is the temperature.
The MiniBatchContext in DynamicPPL and Turing can help you with that. The loglike_scalar in DynamicPPL.jl/contexts.jl at master · TuringLang/DynamicPPL.jl · GitHub is b.
Oh that does sound perfect, it obviously has a completely different target but it should work!
The “Probabilistic Integration” paper calls it the inverse temperature. Both b and 1/T appear as an exponent, so maybe the same?
Yeah, there are various names for the same thing, e.g. power posterior or cold posterior. I have seen the term power posterior used mostly by statisticians.
Fair enough, I would personally disagree with the use of the term (a temperature is not restricted to [0,1]), and there is still my argument about how easily it can be confused with the tempered posterior. But I think as long as everything is properly described, all terms are fine.
But these two are different: for the power posterior you only act on the likelihood, while for cold posteriors you act on the whole joint.
Cold posteriors are used in simulated annealing for instance.
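Written out in logs, the distinction looks as follows. This is only a restatement of the two definitions given earlier in the thread, under the assumption that V(x,y) = -\log p(y|x) - \log p(x), so that \exp(-V) is the usual joint.

```latex
\begin{align*}
\text{power posterior:} \quad
  \log p(x \mid y; b) &= b \,\log p(y \mid x) + \log p(x) + \text{const}, \\
\text{tempered posterior:} \quad
  \log p(x \mid y; T) &= \frac{1}{T}\,\bigl[\log p(y \mid x) + \log p(x)\bigr] + \text{const}.
\end{align*}
```

In general the two coincide only at b = T = 1. If the prior is flat, \log p(x) is constant and they agree with b = 1/T, which may be why the names get mixed up.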
Hm, I think there is a paper on BNNs which defines a cold posterior just as a power posterior.
But maybe I remember it wrong. It’s been a while since I looked at this paper.
If you mean this one: How Good is the Bayes Posterior in Deep Neural Networks Really? | Florian Wenzel, they use the same definition that I gave you.
I got pretty confused at first as well; that’s why I am so confident now.
Cold posteriors are tempered posteriors where T<1
Ah, yes. See I remembered it wrong.
But at least I remembered correctly that it was the same as some existing concept. Haha.
So I tried the following:
function power_logjoint(model, β)
    ctx = DynamicPPL.MiniBatchContext(DynamicPPL.DefaultContext(), β)
    spl = DynamicPPL.SampleFromPrior()
    return function f(z)
        vi = DynamicPPL.VarInfo(model)
        varinfo = DynamicPPL.VarInfo(vi, ctx, z)
        model(varinfo, spl, ctx)
        return DynamicPPL.getlogp(varinfo)
    end
end
But I get the error
ERROR: LoadError: MethodError: no method matching getspace(::DynamicPPL.MiniBatchContext{DynamicPPL.DefaultContext,Float64})
Closest candidates are:
getspace(::Union{DynamicPPL.SampleFromPrior, DynamicPPL.SampleFromUniform}) at /home/theo/.julia/packages/DynamicPPL/wf0dU/src/sampler.jl:9
getspace(::GibbsConditional{S,C} where C) where S at /home/theo/.julia/packages/Turing/a9ANC/src/inference/gibbs_conditional.jl:60
getspace(::SMC{space,R} where R) where space at /home/theo/.julia/packages/Turing/a9ANC/src/inference/Inference.jl:425
...
Stacktrace:
[1] DynamicPPL.VarInfo(::DynamicPPL.VarInfo{NamedTuple{(:x,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:x,Tuple{}},Int64},Array{MvNormal{Float64,PDMats.PDiagMat{Float64,Array{Float64,1}},FillArrays.Zeros{Float64,1,Tuple{Base.OneTo{Int64}}}},1},Array{DynamicPPL.VarName{:x,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.MiniBatchContext{DynamicPPL.DefaultContext,Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{ThermodynamicIntegration.var"#f#33"{DynamicPPL.Model{var"#7#8",(:y,),(),(),Tuple{Array{Float64,1}},Tuple{}},DynamicPPL.MiniBatchContext{DynamicPPL.DefaultContext,Float64},DynamicPPL.SampleFromPrior},Float64},Float64,5},1}) at /home/theo/.julia/packages/DynamicPPL/wf0dU/src/varinfo.jl:115
Full stacktrace please.