Hi all - I’ve been trying to use MixtureModel in Turing to model a multi-modal dataset, generating the probability of each mode from a Dirichlet distribution. Yet somehow I keep getting a BoundsError and can’t see why.
I would appreciate it if someone could spot where I’ve gone wrong in the example below.
using Pkg
Pkg.activate(".")
using Distributions, Turing, DataFrames, Optim
# generate data
# group 1
g1 = rand(Normal(7.5, 0.2), 20)
# group 2
g2 = zeros(40)
# group 3
g3 = rand(Normal(-1, 0.2), 10)
v1 = Vector{Float64}(vcat(g1, g2, g3))
w1 = [length(g1), length(g2), length(g3)] ./ length(v1)
m1 = [7, 0, -1]
# model
@model function mx1(data, mu, wgt, n_data)
    # generate probability per group based on input prior
    w ~ arraydist([LogNormal(wgt[w_i], 0.001) for w_i in 1:3]) # log to force positive..for now
    dirichlet_prob ~ Dirichlet(w)
    μ ~ arraydist([Normal(mu[g_i], 0.01) for g_i in 1:3])
    for i in 1:n_data
        data[i] ~ MixtureModel(
            [
                truncated(Normal(μ[1], 0.2), 0.001, 100),
                truncated(Normal(μ[2], 0.001), -0.001, 0.001),
                truncated(Normal(μ[3], 0.2), -100, -0.001),
            ],
            [dirichlet_prob[1], dirichlet_prob[2], dirichlet_prob[3]],
        )
    end
end
fmla1 = mx1(v1, m1, w1, length(v1))
map_estimate = optimize(fmla1, MAP())
The above gives me the following error:
BoundsError: attempt to access 2-element Vector{Float64} at index [1:3]
However, if I replace the line
dirichlet_prob ~ Dirichlet(w)
with
dirichlet_prob = rand(Dirichlet([wgt[1], wgt[2], wgt[3]]), 1)
then it runs without issues.
If I instead change that rand(Dirichlet(...)) line to use the w drawn from the LogNormal as the weights, rather than wgt[1] etc.:
dirichlet_prob = rand(Dirichlet([w[1], w[2], w[3]]), 1)
then I get a different error:
DomainError with Dual{Tag{DynamicPPLTag, Float64}, Float64, ...
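For reference, here is a minimal sketch of what the rand replacement returns outside the model. It assumes only Distributions.jl; the wgt values are the group proportions from the data above (20, 40 and 10 out of 70), and the names p_vec and p_mat are used here purely for illustration. As far as I can tell, passing 1 as the second argument gives a 3×1 Matrix rather than a Vector, and the draw is a fixed number rather than something Turing estimates, so this replacement is probably not equivalent to the ~ statement.

```julia
using Distributions

# Group proportions from the data above: 20, 40 and 10 points out of 70
wgt = [20 / 70, 40 / 70, 10 / 70]

# A single draw without a count argument is a plain 3-element Vector
p_vec = rand(Dirichlet(wgt))

# With a count argument the draws are stored as columns of a Matrix
p_mat = rand(Dirichlet(wgt), 1)

println(size(p_vec))  # dimensions of the Vector draw
println(size(p_mat))  # dimensions of the Matrix draw

# Linear indexing p_mat[1], p_mat[2], p_mat[3] still reaches the three
# probabilities, which is why the MixtureModel call accepts them
println(sum(p_vec), " ", p_mat[1] + p_mat[2] + p_mat[3])
```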
Any pointers would be very welcome…