Custom posterior distribution in Turing

Hello,
I have the following simple Turing model:

@model function opt(y, λ, sa, sb)
    α ~ Exponential(λ)
    β ~ Exponential(λ)
end

I would like to sample from a custom posterior distribution similar to:

  (beta(α, β) ^ - length(y)) * exp(- α * (λ - sa)) * exp(- β * (λ - sb))

(where beta is not the Beta distribution but the beta function from SpecialFunctions.jl)
I have tried looping over the observations and using that expression on the right-hand side of ~:

for i in 1:length(y)
    y[i] ~ (beta(α, β) ^ - length(y)) * exp(- α * (λ - sa)) * exp(- β * (λ - sb))
end

But this doesn’t work, because the expression is not of type Distribution. I think I should define a custom posterior distribution.
I have already read the documentation at Advanced Usage, but the custom-distribution example explained there is very different from the distribution I would like to obtain. Could you suggest a {solution/something to read/rtfm} for defining a custom distribution similar to that function? In general, I would like to understand how to define a custom distribution starting from a statistical model definition.

Thank you.


I think you may have your terminology a bit confused: the posterior is what you estimate by running MCMC on a specific instance of your model. In that sense, the Turing model is the “custom posterior.” I think you’re actually looking for a way to define a custom distribution to use for your data likelihood. Am I interpreting it correctly that the formula you included above is a probability density function for y, conditional on α, β, λ, sa, and sb?

The Turing docs you linked to use a very simple distribution as an example, but show pretty much what you need to do. For your case, if I’m understanding it correctly, the implementation would look something like:

using Distributions, Random

struct MyDistribution{T} <: ContinuousUnivariateDistribution
    α::T
    β::T
    λ::T
    sa::T
    sb::T
end

function Distributions.logpdf(d::MyDistribution, y::Real)
    α, β, λ, sa, sb = d.α, d.β, d.λ, d.sa, d.sb
    # code to calculate log-PDF of `y`, given `α`, `β`, `λ`, `sa`, and `sb`
end

function Distributions.rand(rng::Random.AbstractRNG, d::MyDistribution)
    # code to draw a random sample from `d`, given `α`, `β`, `λ`, `sa`, and `sb`
end
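If it helps, here is a rough, self-contained sketch of what the log-PDF body could look like for your formula, computed entirely on the log scale. The names `loggamma_approx`, `logbeta_approx`, and `logkernel` are mine, not from any package; I’ve inlined a Lanczos approximation of log Γ only so the snippet runs without dependencies — in real code you’d just call `SpecialFunctions.logbeta`:

```julia
# Lanczos approximation of log Γ(x) for x > 0 (g = 7, 9 coefficients),
# inlined here purely so the example is dependency-free.
# In practice, use SpecialFunctions.loggamma / SpecialFunctions.logbeta instead.
function loggamma_approx(x)
    g = 7.0
    c = [0.99999999999980993, 676.5203681218851, -1259.1392167224028,
         771.32342877765313, -176.61502916214059, 12.507343278686905,
         -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7]
    x -= 1
    a = c[1]
    t = x + g + 0.5
    for i in 2:9
        a += c[i] / (x + i - 1)
    end
    return 0.5 * log(2π) + (x + 0.5) * log(t) - t + log(a)
end

# log B(a, b) = log Γ(a) + log Γ(b) - log Γ(a + b)
logbeta_approx(a, b) = loggamma_approx(a) + loggamma_approx(b) - loggamma_approx(a + b)

# Log of the kernel from your first post, with n = length(y):
#   log[ beta(α, β)^(-n) * exp(-α(λ - sa)) * exp(-β(λ - sb)) ]
logkernel(α, β, λ, sa, sb, n) = -n * logbeta_approx(α, β) - α * (λ - sa) - β * (λ - sb)
```

The key point is that every factor is added on the log scale rather than multiplied on the raw probability scale, which is what `logpdf` is expected to return.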

Hi,

“I think you’re actually looking for a way to define a custom distribution to use for your data likelihood.”

Yes, it’s correct.

“Am I interpreting it correctly that the formula you included above is a probability density function for y, conditional on α, β, λ, sa, and sb?”

In this case the formula is p(α, β | y).
λ, sa, and sb are numbers, not distributions. sa and sb can be built from y.

I tried to define logpdf() using the formula above, but I think it’s not correct.

function Distributions.logpdf(d::MyDist, y::Real)
    α, β, λ = d.α, d.β, d.λ
    # code to calculate log-PDF of `y`, given `α`, `β`, `λ`, `sa`, and `sb`
    sa = sum([log(i) for i in y])
    sb = sum([log(1 - i) for i in y])
    return (beta(α, β) ^ - length(y)) * exp(- α * (λ - sa)) * exp(- β * (λ - sb))
end

@model function mutation(y, λ)
    α ~ Exponential(λ)
    β ~ Exponential(λ)
    for i in 1:length(y)
        y[i] ~ MyDist(α, β, λ)
    end
end

When I try to plot the samples, I always obtain this plot:

(plot image not included)

I also tried inserting trivial instructions in logpdf and rand, like:

return 10

but the plot is always the same.
I have also tried the signature function Distributions.logpdf(d::MyDist, y::AbstractVector{<:Real}).

The samples are generated with Gibbs(MH(:α), MH(:β))
Thank you.

Okay, a couple of follow-up questions. It looks like your custom distribution is actually a multivariate distribution, is that the case? It seems so, since the formula depends on all the elements of y. What’s the name of the distribution? What kind of process is it trying to capture?

Also, is the formula you gave for the PDF, or the log(PDF)? If it’s actually for the PDF, your custom logpdf function will probably be returning a tiny number on the wrong measurement scale, which would explain why your MCMC chains just look like samples from the Exponential priors.
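To see why returning the PDF instead of the log-PDF goes wrong, here is a minimal, package-free sketch of the underflow (the per-observation density values are made up for illustration):

```julia
pdf_vals = fill(1e-4, 200)      # 200 small per-observation density values

prob    = prod(pdf_vals)        # raw scale: 1e-800 underflows Float64 to 0.0
logprob = sum(log, pdf_vals)    # log scale: 200 * log(1e-4) ≈ -1842.1, perfectly finite

@assert prob == 0.0             # MH would see every such proposal as impossible
@assert isfinite(logprob)       # on the log scale MH can still compare proposals
```

This is why the `logpdf` method must return the logarithm of the density: the sampler sums log-densities rather than multiplying densities.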

Hi,

The function I wrote is p(α, β | y) (what I called the “posterior”), but I think I should use p(y | α, β) (the likelihood, as you wrote).

for i in 1:length(y)
    y[i] ~ p(y|α,β)
end

Where my p(y|α,β) is a Beta distribution. Is it the correct way to sample?

The posterior distribution, p(α, β | y), is what you estimate using Bayesian inference (typically MCMC). If you’ve already got a closed-form expression for p(α, β | y), you don’t need to use Turing at all: you can just plug the values of y, α, and β in and calculate the posterior directly.

If I’m understanding correctly, though, you have some data y which are i.i.d. and restricted to the interval between 0 and 1, which you’re modeling using a beta distribution. Is that right? If so, the following does what you want “out of the box:”

@model function mutation(y, λ)
    α ~ Exponential(λ)
    β ~ Exponential(λ)
    for i in 1:length(y)
        y[i] ~ Beta(α, β)
    end
end
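For completeness, this model and your original kernel agree. With y[i] ~ Beta(α, β) and n = length(y), the likelihood of the i.i.d. observations is

  p(y | α, β) = beta(α, β) ^ -n * exp((α - 1) * sa + (β - 1) * sb)

where sa = Σ log(y[i]) and sb = Σ log(1 - y[i]). Multiplying by Exponential priors of rate λ (density ∝ exp(-λα) exp(-λβ)) and dropping every factor that does not depend on α or β leaves

  p(α, β | y) ∝ beta(α, β) ^ -n * exp(-α * (λ - sa)) * exp(-β * (λ - sb))

which is exactly the kernel from your first post. One caveat: Distributions.jl parameterizes Exponential by its scale, so Exponential(λ) has rate 1/λ; if the λ in your formula is a rate, write Exponential(1/λ) in the model.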