DynamicHMC for Bayesian inference where a product distribution is observed - Possible?

Here is a generative model I am working on. Think of a game show where your winnings are determined by the product of two RVs: a winnings multiplier (X) and max winnings (Y). We observe winnings w_i = x_i * y. Our prior is y/10^5 ~ Beta(2,8), corresponding to max winnings up to $100k but more likely around $20k-$30k. X_i ~ Beta(2,2). Y does not change from observation to observation, but x_i and w_i do.
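In symbols, the model described above is:

$$y_{\text{raw}} \sim \text{Beta}(2,8), \qquad Y = 10^5 \cdot y_{\text{raw}}, \qquad X_i \sim \text{Beta}(2,2), \qquad W_i = X_i \cdot Y$$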

Here is a pic of the generative DAG:

Anyway, my attempt to code it is below. I get an error when running the last line (ERROR: type UnionAll has no field w):

I would appreciate help with two things: 1) verifying that the logic of the implemented math is correct, and 2) eliminating the error so I can get a posterior for y. Thanks! :slight_smile:

using Random, DataFrames, Distributions, Parameters

## simulate some data to see if we can recover params
Random.seed!(1234)
## make y_winnings equal to 35000 for simulation
nObs = 20
dataDF = DataFrame(y_raw = rand(Beta(2,8),nObs),
            x = rand(Beta(2,2),nObs))
dataDF.y = 35000 * dataDF.y_raw
dataDF.w = dataDF.x .* dataDF.y
dataDF

## find posterior for y, the max winnings (we know it is 35k)
struct spinProblem{T <: AbstractVector}
    w::T # winnings of previous players
end
# https://en.wikipedia.org/wiki/Product_distribution

function (problem::spinProblem)(θ)
    @unpack x, y_raw = θ   # extract the parameters
    @unpack w, = spinProblem   # extract the data
    # log likelihood accumulated in llSpinProb
    # prior
    llSpinProb = logpdf(LocationScale(0, 100000, Beta(2,8)),y) 
    yDraw = 10^5 * y_raw
    for i in 1:length(w)
        x = w / y
        llSpinProb += logpdf(Beta(2,2),x) #data
    end
    return(llSpinProb)
end


p2 = spinProblem(dataDF.w)
p2((x = 0.5, y_raw = 0.25,)) # gives ERROR

I think this line should be

@unpack w, = problem

Looks like it’s referencing the type, not the instance of the type.
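A minimal reproduction of that symptom (hypothetical struct name, for illustration):

using Parameters

struct Foo{T}
    w::T
end

# Foo by itself is a (UnionAll) type, so it has no field w:
# @unpack w = Foo          # ERROR: type UnionAll has no field w
@unpack w = Foo([1, 2, 3]) # works: w is now [1, 2, 3]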

This was indeed the error I was stuck on. After fixing it, I found many more errors. Those are all fixed, and the fixed code is below for anyone viewing this thread. The posterior looks close to correct, but something is still off, as the true value used to simulate the data is not recovered in a posterior predictive check. If anyone sees the math error that leads to this, please comment. I will mark this solved though, as the initial Julia issue I was having was graciously solved by @cpfiffer. Thank you for that!!!

using Revise, TransformVariables, LogDensityProblems, DynamicHMC, DynamicHMC.Diagnostics, Parameters, Statistics, Random, DataFrames, Distributions, Gadfly

## simulate some data to see if we can recover params
Random.seed!(234)
## make y_winnings equal to 35000 for simulation
nObs = 2000
dataDF = DataFrame(x = rand(Beta(2,2),nObs))
dataDF.w = dataDF.x .* 35000
dataDF

## find posterior for y, the max winnings (we know it is 35k)
struct spinProblem{T <: AbstractVector}
    w::T # winnings of previous players
end
# https://en.wikipedia.org/wiki/Product_distribution

function (problem::spinProblem)(θ)
    @unpack y_raw = θ   # extract the parameters
    @unpack w = problem   # extract the data

    # log likelihood accumulated in llSpinProb
    llSpinProb = logpdf(Beta(2,8),y_raw)
    yDraw = y_raw * 10^5

    # likelihood
    for i in 1:length(w)
        # clamp so the implied x_i stays inside Beta's support;
        # otherwise logpdf throws a DomainError when w[i]/yDraw >= 1
        x_i = min(w[i] / yDraw, 0.999)
        llSpinProb += logpdf(Beta(2,2), x_i) # data
    end

    return(llSpinProb)
end


p2 = spinProblem(dataDF.w)
p2((y_raw = 0.25,)) # infeasible: implies max winnings of $25k, below some observed w
p2((y_raw = 0.40,)) # feasible

trans2 = as((y_raw = as𝕀,))  ## extra comma makes it a named tuple
P2 = TransformedLogDensity(trans2, p2)
∇P2 = ADgradient(:ForwardDiff, P2)

results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P2, 8000; reporter = NoProgressReport())
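
## optional sampler health check (my addition; DynamicHMC.Diagnostics is
## already loaded above): summarizes acceptance rates, tree depths, and
## divergences across the chain
summarize_tree_statistics(results.tree_statistics)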

## bare transform is not defined here, so qualify it with the module name
posterior = TransformVariables.transform.(trans2, results.chain)
posterior_y_raw = first.(posterior)
posterior_y = posterior_y_raw * 10^5 ##estimates seem high - math error?

plot(x = posterior_y, Geom.density)  ## density plot of the posterior
plot(x = 1:length(posterior_y),
        y = posterior_y, Geom.line)

# posterior predictive check
postPredDF = DataFrame(draw = dataDF.w,
                type = "Observed",
                group = "0")

postSimDF = DataFrame()
for i in 1:20
    tempDF = DataFrame(
                draw = rand(posterior_y) .* rand(Beta(2,2),nObs),
                type = "Simulated",
                group = string(i))
    postSimDF = vcat(postSimDF,tempDF)
end

obsLayer = layer(postPredDF, x = :draw, color = [colorant"cadetblue"], style(line_width=6pt), Geom.density, order = 10)

plot(postSimDF, x = :draw, color = :group, style(line_width=1pt), Geom.density, 
        obsLayer)
## estimates look biased, not sure why? With this much data the prior should be negligible relative to the likelihood. See any errors?

Here is the posterior distribution (unfortunately, it is not centered at 35,000, the true value used to simulate the data):


The posterior predictive check looks off as well. Something seems amiss.

Just a quick observation: the mean of Beta(2,2) is 0.5. Not sure if that is what you intended when multiplying by 35k.
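A quick REPL check of that (my addition):

using Distributions
mean(Beta(2, 2))  # 0.5, so simulated winnings x * 35000 average about 17500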

First, the prior won’t be negligible with a sample size of 20; a bigger sample might help. Second, can we see some diagnostic plots? (Traceplot, divergences, energy)

Multiplying by 35,000 is to create some sample data where the true "max winnings" (Y) is 35,000. I then want to use the model to recover this $35,000 number. Unfortunately, the model confidently places the true value of Y closer to $40,000. Being unable to recover the $35,000 value makes me very skeptical of my model, but I do not see the error leading to this over-estimate.

The second block of code, for which the posterior predictive check plot is shown above, uses an observed sample of 2,000 observations, so the prior should be negligible at this point. Here is a traceplot showing the space is well-explored (code available when running the second code block I posted), just not at the "true" value of $35k. My guess is that I am not handling the log likelihood properly for draws of y_raw that are completely infeasible given the data. For example, a draw of y_raw of 0.25 corresponds to max winnings (Y) of $25,000. Yet if we observe that someone has won $30k, then we know $25k is completely infeasible, and hence the log density is negative infinity. Investigations to continue :thinking:
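One way to implement that reasoning (a sketch of my suggestion, not the code posted above): return -Inf from the log density whenever a draw of y_raw is infeasible, rather than clamping x_i:

function (problem::spinProblem)(θ)
    @unpack y_raw = θ   # extract the parameters
    @unpack w = problem # extract the data

    llSpinProb = logpdf(Beta(2,8), y_raw) # prior
    yDraw = y_raw * 10^5

    # if any observed winnings exceed the proposed max, this draw is
    # impossible; -Inf lets DynamicHMC reject it (oftype keeps the
    # return type stable under ForwardDiff)
    maximum(w) >= yDraw && return oftype(llSpinProb, -Inf)

    for i in 1:length(w)
        llSpinProb += logpdf(Beta(2,2), w[i] / yDraw) # data
    end
    return llSpinProb
end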

That traceplot looks possibly OK, but not great either; what about other diagnostic plots? Here's the list of diagnostics I usually check (ArviZ's plotting functions):

    # Diagnostics
    plot_trace(trace; var_names=params, compact=true)
    plot_mcse(trace; var_names=params)
    plot_ess(trace; var_names=params, kind="local")
    plot_ess(trace; var_names=params, kind="quantile")
    plot_ess(trace; var_names=params, kind="evolution")
    plot_rank(trace; var_names=params)
    plot_energy(trace)

Here’s an example to show that sometimes, your trace can look great even if your inferences are pretty awful – the energy transition plot is clearly much more sharply peaked than the marginal energy plot:



Meanwhile, an ESS plot shows exactly where the problem is: the centered parametrization for my data is making sampling of the standard deviations extremely difficult. At some points, my effective sample size is even trending downwards as the chain gets stuck at a low value of the standard deviation.
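For reference, a minimal sketch of the reparametrization being described (a generic hierarchical model with hypothetical names, not this thread's spin problem): instead of sampling group effects θ_j ~ Normal(μ, σ) directly (centered), sample z_j ~ Normal(0, 1) and set θ_j = μ + σ * z_j, which decorrelates the group effects from σ:

using Distributions, Parameters

# toy non-centered log density; assumes one observation y[j] per group j
function noncentered_logdensity(θ, y)
    @unpack μ, logσ, z = θ
    σ = exp(logσ)                       # sample σ on the unconstrained log scale
    ll = logpdf(Normal(0, 5), μ)        # illustrative hyperprior on μ
    ll += logpdf(Normal(0, 1), logσ)    # illustrative hyperprior on log σ
    ll += sum(logpdf.(Normal(0, 1), z)) # raw (standardized) group effects
    groupmeans = μ .+ σ .* z            # recover the centered effects
    ll += sum(logpdf.(Normal.(groupmeans, 1.0), y)) # likelihood
    return ll
end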


Found the error: the likelihood was missing a Jacobian. I found this post helpful for knowing when to use a Jacobian and when not to: Demystifying Jacobian adjustment for transformed parameters in Stan
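For reference, the change of variables that was missing: with y fixed and W = X * Y, we have X = W / y, so

$$f_W(w \mid y) = f_X\!\left(\frac{w}{y}\right) \cdot \frac{1}{y}, \qquad \log f_W(w_i \mid y) = \log f_X\!\left(\frac{w_i}{y}\right) + \log\frac{1}{y},$$

which is exactly the log(1/yDraw) term added inside the loop below.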

Here is the block of code that adds the Jacobian:

function (problem::spinProblem)(θ)
    @unpack y_raw = θ   # extract the parameters
    @unpack w = problem   # extract the data

    # log likelihood accumulated in llSpinProb
    # prior on y_raw
    llSpinProb = logpdf(Beta(2,8),y_raw)
    yDraw = y_raw * 10^5

    # likelihood
    for i in 1:length(w)
        # clamp so the implied x_i stays inside Beta's support;
        # otherwise logpdf throws a DomainError when w[i]/yDraw >= 1
        x_i = min(w[i] / yDraw, 0.999)
        llSpinProb += logpdf(Beta(2,2), x_i) # data
        llSpinProb += log(1 / yDraw)         # Jacobian of x = w / yDraw
    end

    return(llSpinProb)
end

Now the posterior predictive check looks much better:

and the trace plot is centered in the right spot (i.e. $35K):
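A quick numeric check to go with the plots (my addition; assumes posterior_y from the earlier pipeline, re-run with the corrected log density):

mean(posterior_y)                     # should now be near 35_000
quantile(posterior_y, [0.05, 0.95])   # 90% posterior interval around the truth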
