Sample `arraydist` of `filldist`s in Turing

filchristou · November 14, 2022, 8:35am

Is it possible to sample many filldistss using arraydist in Turing.jl?

For example, say I have data of the following structure:

and every key-value comes from an Exponential distribution.
I know of a prior for each key-value vector.

For now I sample this model as following:

@model function arrfilldist(data, ks, nlendat=missing)
	if data === missing # for prior analysis
		data = [Vector{Int}(undef, x) for x in nlendat]
	end
	μs = calculate_prior.(ks)
	σ = 500

	priors ~ arraydist([truncated(Normal(μ,σ),0,Inf) for μ in μs])
	for d in 1:length(data)
		data[d] ~ filldist(Exponential(priors[d]), 20)
	end
end

calculate_prior(k) = k*10

and it works fine.

However I was wondering if I can avoid the for loop by using again an arraydist and if that would generally accelerate the MCMC.
So, I tried substituting the for loop with

data ~ arraydist([filldist(Exponential(priors[d]), 20) for d in 1:length(data)])

The prior analysis keeps working fine but the sampling given data throws the following error:

MethodError: no method matching loglikelihood(::DistributionsAD.VectorOfMultivariate{Distributions.Continuous, Distributions.Product{Distributions.Continuous, Distributions.Exponential{Float64}, FillArrays.Fill{Distributions.Exponential{Float64}, 1, Tuple{Base.OneTo{Int64}}}}, Vector{Distributions.Product{Distributions.Continuous, Distributions.Exponential{Float64}, FillArrays.Fill{Distributions.Exponential{Float64}, 1, Tuple{Base.OneTo{Int64}}}}}}, ::Vector{Vector{Float64}})
Closest candidates are:
loglikelihood(!Matched::EllipticalSliceSampling.ESSModel, ::Any) at ~/.julia/packages/EllipticalSliceSampling/1R0Nc/src/model.jl:23
loglikelihood(!Matched::DynamicPPL.Model, ::Any) at ~/.julia/packages/DynamicPPL/zPOYL/src/simple_varinfo.jl:603
loglikelihood(::Distributions.Distribution{Distributions.ArrayLikeVariate{N}}, !Matched::AbstractArray{<:AbstractArray{<:Real, N}}) where N at ~/.julia/packages/Distributions/0Nl1l/src/common.jl:466

Is there another way to do this or does it even make sense performance-wise ?

Reproducability

Use following code to generate data:

using DataStructures
reallinear(pars) = 10 .+ 50 .* pars
function gen_data(pars, count; rng = MersenneTwister(1))
	mypars = reallinear(pars) #secret system internal operation
	(OrderedDict(par => rand(rng, Exponential(mp), count) for (par,mp) in zip(pars,mypars)),
	mypars)
end
data, internalpars = gen_data(5:3:20, 20)

Prior analysis by doing:

arrfilldist_pr = arrfilldist(missing, 10:10:60, fill(20,6))
prch = sample(arrfilldist_pr, Prior(), 10);

Giving someting like:

Sample posterior:

arrfilldist_pt = arrfilldist(collect(values(data)), collect(keys(data)))
postch = sample(arrfilldist_pt, NUTS(), 100);

Topic		Replies	Views
Using a posterior from a previous sample as a prior Probabilistic programming turing	2	812	April 25, 2021
Some question about the structure of Turing models Probabilistic programming question , turing	25	2804	November 6, 2020
Advice speeding up a turing model Probabilistic programming question , performance , turing	7	214	January 14, 2025
Posterior predictive distributions in Turing General Usage	6	2835	September 29, 2022
Turing.jl - Create priors dynamically Probabilistic programming package , turing	2	351	March 1, 2024

Sample `arraydist` of `filldist`s in Turing

Related topics