Hi guys! I’m working with some models using a Bayesian framework (Soss and Turing). I was wondering: once I have already contrasted my prior with the data and obtained the posterior distribution, how can I add more data and use the already-obtained posterior as the prior? In other words, I would like to update my model when new data arrives.
Hi @Camilo1704, welcome to the Discourse!
Abstractly this is a very standard thing to do. But in practice it often gets tricky, because the posterior is no longer an abstract function, but (typically) a sample from the distribution represented by this function.
Still, there are things we can do. Let’s look at a really simple example in Soss:
using Soss
m = @model prior, n begin
    p ~ prior
    x ~ Bernoulli(p) |> iid(n)
end
with some fake data:
x = rand(Bernoulli(0.2), 10);
n = length(x)
Here I’ve made prior an argument to the model. In principle we can mess with the source code however we want, but this is simpler for starting out.
Say we fit some data from a uniform prior,
prior = Uniform()
post1 = dynamicHMC(m(prior=prior, n=length(x)), (x=x,));
To summarize,
julia> typeof(post1)
Array{NamedTuple{(:p,),Tuple{Float64}},1}
julia> particles(post1)
(p = 0.0882 ± 0.081,)
So now the idea is that we want to feed this information back into the model as the new prior. In principle we could treat the prior as particles and sample from the empirical distribution. But we haven’t yet connected many samplers in Soss, and the discreteness would throw us off here.
A simple thing we can still do is assumed density filtering. The idea is: we fit a distribution to the posterior samples and use that as the new prior. For a parameter on the unit interval, a Beta distribution is (often) a reasonable choice, so
julia> ps = particles(post1).p.particles;
julia> newprior = fit(Beta, ps)
Beta{Float64}(α=0.9935509466234749, β=10.272982073369201)
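If you want a rough picture of what a fit like this does, moment matching gives a closed-form sketch (my own illustration, not necessarily what fit(Beta, ps) uses internally): choose α and β so the Beta distribution has the same mean and variance as the samples.

```julia
# Moment matching for a Beta distribution (illustrative sketch):
# given sample mean m and variance v, solve for (α, β) so that
# the Beta has the same first two moments.
using Statistics

function beta_from_moments(xs)
    m, v = mean(xs), var(xs)
    s = m * (1 - m) / v - 1      # shared scale factor α + β
    return (α = m * s, β = (1 - m) * s)
end

# Samples concentrated near 0.1 give α much smaller than β,
# just like the fitted prior above:
αβ = beta_from_moments([0.05, 0.08, 0.10, 0.12, 0.15])
```

By construction the implied Beta mean α / (α + β) matches the sample mean exactly.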
Now with some new data,
julia> newx = rand(Bernoulli(0.2), 10);
julia> post2 = dynamicHMC(m(prior=newprior, n=length(newx)), (x=newx,));
and comparing
julia> particles(post1)
(p = 0.0882 ± 0.081,)
julia> particles(post2)
(p = 0.14 ± 0.075,)
We can also compare the assumed densities:
julia> newprior
Beta{Float64}(α=0.9935509466234749, β=10.272982073369201)
julia> fit(Beta, particles(post2).p.particles)
Beta{Float64}(α=2.8409226999574213, β=17.4435665618913)
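As a sanity check on the ADF approach (my addition, not from the thread): for this particular model the sequential update has a closed form, because the Beta prior is conjugate to the Bernoulli likelihood. A Beta(α, β) prior updated with k successes out of n observations gives a Beta(α + k, β + n − k) posterior, so "posterior becomes prior" is exact here and just accumulates counts.

```julia
# Exact conjugate updating for the Beta–Bernoulli model.
# Beta(α, β) + (k successes out of n)  →  Beta(α + k, β + n - k)
using Distributions

prior = Beta(1.0, 1.0)                 # Uniform() is Beta(1, 1)

x1 = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]    # hypothetical first batch
post_exact = Beta(prior.α + sum(x1),
                  prior.β + length(x1) - sum(x1))

# Updating with a second batch just keeps adding counts:
x2 = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]    # hypothetical second batch
post_exact2 = Beta(post_exact.α + sum(x2),
                   post_exact.β + length(x2) - sum(x2))
```

Comparing mean(post_exact2) with the fitted Beta above gives a feel for how much the ADF approximation loses.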
There’s a lot more we could do with all of this. One route I think is interesting is fitting the transformed (to ℝⁿ) posterior to an MvNormal, then transforming back. This would allow representation of correlations in the posterior. Or we could even fit a normalizing flow from Bijectors.jl! Lots of possibilities, but we’d need to extend things a bit to be able to do this.
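For the scalar case, the transform-then-fit idea can be sketched by hand (my own illustration; for a vector of parameters you would fit an MvNormal instead, and Bijectors.jl provides the transforms):

```julia
# Sketch: fit a Normal in unconstrained space, then transform back.
using Distributions

logit(p)    = log(p / (1 - p))      # map (0, 1) → ℝ
invlogit(y) = 1 / (1 + exp(-y))     # map ℝ → (0, 1)

# Hypothetical posterior samples of p on the unit interval:
ps = rand(Beta(2.0, 10.0), 1000)

# Fit a Normal to the logit-transformed samples...
g = fit(Normal, logit.(ps))

# ...and push draws back through the inverse transform to get an
# approximate prior on (0, 1) for the next round of inference.
new_draws = invlogit.(rand(g, 1000))
```

Unlike the Beta fit, this generalizes directly to multivariate parameters, where the MvNormal can also capture correlations.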
Thank you very much for the response, it was very clear!
Thank you! Happy to help