I am trying to fit a kernel mixture model in Turing. It seems to work for estimation, but gives an error when I try to use predict. I'd be curious if anyone has a suggestion for a fix.
using Distributions, Turing
# Unidimensional Kernel Mixture model with K pre-specified components
# that cover the space from min_x to max_x
@model function KMM(x, min_x, max_x, k, σ)
    N = size(x, 1)
    # k evenly spaced kernel centers between min_x and max_x
    linspan = range(min_x, stop=max_x, length=k)
    kernels = map(u -> Normal(u, σ), linspan)
    # Mixture weights with a flat Dirichlet prior
    ω ~ Dirichlet(k, 1.0)
    mixdist = MixtureModel(kernels, ω)
    x ~ filldist(mixdist, N)
end
# Simulate data from a bimodal distribution
data = vcat(rand(Normal(-1, 0.5), 50), rand(Normal(1, 0.5), 50))
# Define a kernel mixture with 10 gaussian components, with means covering -2:2
model = KMM(data, -2.0, 2.0, 10, 0.5)
# Estimate weights
m1 = sample(model, NUTS(0.65), 1000)
That seems to work (although it is quite slow for such a small model). But when I try to get the posterior predictive distribution of the original data, it throws an error saying that the method loglikelihood does not exist for the filldist of mixtures.
pp_data = predict(KMM(Vector{Union{Missing, Float64}}(missing, length(data)), -2.0, 2.0, 10, 0.5), m1)
MethodError: no method matching loglikelihood(::Product{Continuous, MixtureModel{Univariate, Continuous, Normal{Float64}, Categorical{Float64, Vector{Float64}}}, FillArrays.Fill{MixtureModel{Univariate, Continuous, Normal{Float64}, Categorical{Float64, Vector{Float64}}}, 1, Tuple{Base.OneTo{Int64}}}}, ::Vector{Union{Missing, Float64}})
MixtureModel from Distributions.jl apparently doesn't have a loglikelihood function, only a logpdf function, which does the job of both. I thought that adding a loglikelihood function for the mixture might fix it:
loglikelihood(d::Union{UnivariateMixture, MultivariateMixture}, x) = logpdf(d, x)
But it doesn’t change the error.
Any ideas? Alternatives? I seem to remember that the MixtureModel distribution is a bit of an odd duck that often doesn’t play well with packages like Turing, but if I can avoid writing a custom logpdf function for mixtures, it’d be nice…
Also, the fact that the lack of a loglikelihood function for a MixtureModel is causing an issue for posterior prediction makes me suspicious of whether the original model estimates are actually correct. (Although I haven't seen anything that looks obviously wrong.)
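For reference, here is a minimal sketch of one alternative I could try, under an assumption I have not confirmed: that the error comes from the second argument in the MethodError being a Vector{Union{Missing, Float64}}, i.e. the vector of missings is being scored as an observation rather than sampled. The sketch observes each element separately with x[i] ~ mixdist, which is the pattern used in the predict docstring example. The name KMM_loop and the shorter chain length are mine, purely for illustration.

```julia
using Distributions, Turing

# Hypothetical variant of KMM: same kernels and prior, but each x[i] is
# observed separately, so missing entries can be treated as variables to sample.
@model function KMM_loop(x, min_x, max_x, k, σ)
    centers = range(min_x, stop=max_x, length=k)
    kernels = [Normal(u, σ) for u in centers]
    ω ~ Dirichlet(k, 1.0)
    mixdist = MixtureModel(kernels, ω)
    for i in eachindex(x)
        x[i] ~ mixdist
    end
end

# Same simulated bimodal data as above
data = vcat(rand(Normal(-1, 0.5), 50), rand(Normal(1, 0.5), 50))

# Estimate weights (fewer draws than the original, just to keep the sketch quick)
chain = sample(KMM_loop(data, -2.0, 2.0, 10, 0.5), NUTS(0.65), 200)

# Posterior predictive: replace the data with a vector of missings
x_missing = Vector{Union{Missing, Float64}}(missing, length(data))
pp_data = predict(KMM_loop(x_missing, -2.0, 2.0, 10, 0.5), chain)
```

The loop is likely slower than filldist for estimation, so if this works it may only be worth using for the prediction step. It also says nothing about whether the original estimates are correct; it only sidesteps the call that fails.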