Sampling from Delta in Turing

trahflow · May 1, 2020, 8:43pm

Hi PPL people

I was trying to build a very simple model of censored sales, and was naively trying:

@model mymodel(sales) = begin
    lam ~ Gamma(2, 3)
    stocks ~ Categorical(ones(20) ./ 20)
    demand ~ Poisson(lam)
    sales ~ min(demand, stocks)
end

Main point of the model is, that I only observe sales, but want to do inference on lam and stocks.

Turing tells me that the last line is wrong, because I can only sample from Distributions (which of course makes sense).
I would have implemented the last line as sales ~ Delta(min(demand, stocks)), but there’s no such thing as a Delta distribution in Distributions.jl
Is there a way to express this in Turing?

Thanks for any hints!

cpfiffer · May 1, 2020, 11:50pm

Not really a Turing issue here, more a Distributions one. There appears to be an implementation of a Dirac distribution here but it seems to have been caught up.

cpfiffer · May 1, 2020, 11:51pm

Update: I found this, which suggests using a Normal with variance zero, so try

@model mymodel(sales) = begin
    lam ~ Gamma(2, 3)
    stocks ~ Categorical(ones(20) ./ 20)
    demand ~ Poisson(lam)
    sales ~ Normal(min(demand, stocks), 0)
end

trappmartin · May 2, 2020, 7:48am

I’m not sure I understand. What do you assume to be your observation model? If I’m not mistaken your observations are a deterministic transformation of your parameters. Is that correct?

I don’t think you are going to get reasonable inference results with a dirac delta.

trappmartin · May 2, 2020, 7:51am

This will likely not work as a Normal distribution with zero variance has infinite density. And so the joint log density will always be infinite, and HMC will not work.

trahflow · May 2, 2020, 9:03am

Hi and thanks for all the answers so far!

If I’m not mistaken your observations are a deterministic transformation of your parameters. Is that correct?

Yes that’s correct, and I should have phrased my question accordingly:
How can I observe / condition on deterministic transformations of my random variables?

And I should have also mentioned:
I am not necessarily tied to HMC. VI would also work for me.
Though with this particular model already the forward pass fails:

prior = mymodel(missing)
prior()

raises ArgumentError: Right-hand side of a ~ must be subtype of Distribution or a vector of Distributions

mohamed82008 · May 2, 2020, 12:26pm

It seems that your random variables stocks and demand are discrete and so are your sales. So maybe use a Categorical distribution for sales with a probability of 1 for one of the categories and 0 for all the others. Since you have discrete random variables, you should consider MH, PG or Gibbs, not pure HMC.

torfjelde · May 2, 2020, 12:45pm

A couple of things:

This particular case can be reduced to stocks ~ truncated(Poisson(lam), 0, 20), right? A uniform categorical doesn’t add anything.
To be able to use Turing (or rather, any MCMC algorithm) for sampling a requirement is that you can evaluate the logpdf pointwise, and so there won’t be a general way to handle what you’re suggesting. For that you have to look at something like approximate Bayesian inference (ABC). Hence, something like a Delta for continuous won’t work. For transformations for continuous variables, you have to look at Bijectors.jl which allows you to transform variables but using a restricted family of functions for which everything is well-defined.
For even more general (but discrete) cases than what I suggest in (1), you can probably get quite far by introducing a empirical distribution, i.e. something like

julia> using Distributions

julia> struct EmpiricalPDF{T} <: DiscreteUnivariateDistribution
           xs::Vector{T}
       end

julia> function Distributions.logpdf(d::EmpiricalPDF, x::Int)
           return mean(x .== d.xs) # (inefficiently) count now many times the value occured
       end

julia> stocks = rand(Categorical(fill(1, 20) ./ 20), 100);

julia> ys = rand(Poisson(1.), 100);

julia> dist = EmpiricalPDF(min.(xs, ys))
EmpiricalPDF{Int64}(
xs: [1, 3, 1, 2, 1, 0, 1, 0, 2, 1  …  0, 1, 1, 0, 2, 2, 2, 1, 1, 1]
)

julia> logpdf(dist, 1)
0.39

julia> logpdf(dist, 2)
0.18

julia> logpdf(dist, 3)
0.06

You’d have to reconstruct the distribution for every lam though, i.e. you’re Turing model would be something like

@model mymodel(sales) = begin
    lam ~ Gamma(2, 3)
    stocks ~ Categorical(ones(20) ./ 20)
    demand ~ Poisson(lam)

    demand_ = rand(Poisson(lam), 100);
    stocks_ ~ Categorical(ones(20) ./ 20);

    sales ~ EmpiricalPDF(min(demand_, stocks_)) # <= will end up calling `logpdf(EmpiricalPDF, sales)` which we can define
end

And this you can sample from using something like MH or PG:) This is quite hacky though, so a better way would be to derive the closed form distribution that results from taking min of the variables.

trahflow · May 2, 2020, 1:55pm

Wow, lots of useful answers. Thank you all for your time! I’ll try to answer in one post:

That does the trick for the forward pass (prior predictive), and even SMC does give a result even though I don’t know yet whether the resulting samples I obtain make sense (or could ever, based on what @trappmartin said).

Cool! I think that does exactly what I want! Thanks for the suggestion!

I think that would not be equivalent. In my model, stocks should be an unobserved discrete latent variable over which I want to marginalize. The maximum value of 20 is kind of an arbitrary choice that ideally should not have an impact on the model as long as 20 >> demand.
Truncating demand at 20 would be equivalent to observing stocks = 20.

I can’t really point my finger on it, but I think this requirement should still be satisfied by my particular model.

I now realize that a Delta distribution is not really the right thing for what I want here, since that is usually understood to live on a continuous domain, and my problem really boils down to observing on a deterministic transformation of discrete random variables.
I think what @mohamed82008 suggested is what I was looking for.

Even though this is OT: I’m kind of blown away by how engaged this community is! Thank you all!

–EDIT–

just posting the model with the suggestion from @mohamed82008 and for more than 1 observation:

@model mymodel(sales_p1, N) = begin
    if sales_p1 === missing
        sales_p1 = Vector{Float64}(undef, N)
    end
    
    lam ~ Gamma(2, 3)
    for i in 1:length(sales_p1)
        stocks_p1[i] ~ Categorical(ones(21) ./ 21)  # stocks_p1 = stocks + 1
        demand[i] ~ Poisson(lam)
        p = zeros(21)
        p[min(demand[i] + 1, stocks_p1[i])] = 1.
        sales_p1[i] ~ Categorical(p)  # sales_p1 = sales + 1
    end
    return lam, stocks_p1, demand, sales_p1
end

histograms of the forward pass of the model:

histogram(demand, alpha=0.5, bins=0:20, label="demand")
histogram!(sales_p1 .- 1, alpha=0.5, bins=0:20, label="sales")
histogram!(stocks_p1 .- 1, alpha=0.5, bins=0:20, label="stocks")
vline!([lam], label="lam")

hist

Scatter plot sales vs stocks (with some random jitter):

scatter(stocks_p1 .- 1 .+ 0.1 * randn(length(stocks_p1)), sales_p1 .- 1 .+ 0.1 * randn(length(stocks_p1)), markersize=1.5)
xlabel!("stocks")
ylabel!("sales")

scatter

Topic		Replies	Views
A simple model which I can't get Turing to fit to simulated data Probabilistic programming	1	440	May 24, 2022
Puzzling on a product likelihood using Distributions and Turing Probabilistic programming question , turing , distributions	2	1548	January 21, 2021
Using Turing for MCMC estimation Performance turing , probability	4	459	July 13, 2022
Parameter estimation problem in Turing Probabilistic programming turing	11	505	July 14, 2022
Turing sampling from priors New to Julia turing	1	382	May 23, 2022

Sampling from Delta in Turing

Related topics