Obtaining the posterior distribution of parameters in Turing using the MCMC method

Hello everyone. I am a physics student with no solid background in statistics. I recently tried to use Turing, but I ran into some problems that I find hard to understand and that make it difficult to use Turing in my work. To express my problem more clearly, I will show some code below.

My ultimate goal is to obtain the posterior distributions of the parameters of some physical models with MCMC. Let’s say there is a machine which, at each observation, records the time (t), the observed value (obsVal), and the observational uncertainty (uncertainty). The model is quite simple (just as an example):

begin
	# Ground-truth parameter values
	a = 1.5
	b = 3.2
	c = 0.08
	# The model: observed value as a function of time x
	f(x) = (a * x + sinpi(x / c)) * cospi(b)
end

where a, b, and c are the three parameters and f(x) is the model. The ground-truth values of the three parameters are given above. Then we take 30 measurements with the machine, so we have 30 values each for time, trueVals, obsVals, and uncertainties:

using Distributions

function observations(time::Array)
	uncertainties = 2 * rand(Float64, length(time)) # Uncertainties ∈ [0, 2]
	errors = rand.(Normal.(0, uncertainties)) # Assume error ~ Normal(0, uncertainty)
	trueVals = f.(time) # Noise-free model values
	obsVals = trueVals .+ errors # Observed values = true values + noise
	return trueVals, obsVals, uncertainties
end

A plot of the model looks something like this:

using Plots

begin
	p = plot(xlabel="Time", ylabel="Value")

	time = 20 * rand(Float64, 30) # Observe 30 times, time ∈ [0, 20]
	trueVals, obsVals, uncertainties = observations(time)
	plot!(p, time, obsVals, yerror=uncertainties, seriestype=:scatter, label="Obs")
	plot!(p, time, trueVals, seriestype=:scatter, label="True")
end

Model

Now I want to define a Turing model and build MCMC chains to obtain the posterior distributions of a, b, and c:

using Turing

@model function MyModel(obsVals, uncertainties)
	# Prior
	a ~ Uniform(1, 2)
	b ~ Uniform(2.5, 3.5)
	c ~ Normal(0, 1)

	# The code below may be wrong, I'm not sure...
	obsVals .~ Normal.(f.(obsVals), uncertainties)
end

My questions are summarized below:

  1. How should I define a Turing model that takes obsVals, uncertainties, and the custom-defined model f(x) into account simultaneously, and then build and run MCMC chains with a suitable algorithm (like HMC) to get the posterior distributions of the parameters?
  2. Previously I used emcee for similar tasks in Python, where it is the programmer’s obligation to handle some burdensome tasks like calculating the likelihood and checking the prior bounds (e.g., see the functions log_prior and log_probability in Fitting a model to data — emcee). I cannot find similar functions in Turing. Does this mean that Turing takes care of these tasks automatically?

Thanks for any helpful suggestions! Really rudimentary questions… :slight_smile:

I’m confused why f is predicting obsVals from obsVals rather than from some covariate values. I’d expect your observations to be predicted from the time and some parameters, or something similar.

Beyond that, you have the right basic idea. Once you have your model defined, you instantiate it with the particular data you’ve collected, then call sample on it. The Turing docs do a good job of giving those examples.
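Roughly, the pattern looks like the sketch below. This is just an illustration of the calling convention (your model definition still has the issue I mentioned), and NUTS with 1_000 draws is an arbitrary choice, not a recommendation:

using Turing

# Instantiate the model with the collected data, then draw posterior samples.
model = MyModel(obsVals, uncertainties)
chain = sample(model, NUTS(), 1_000)

# Print posterior summaries for a, b, and c.
describe(chain)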

Thanks for your reply. Yes, this is exactly what I find confusing. The thing is, at any time t I can obtain a measurement obsVal ± uncertainty. How can I express this function with the sampled parameters a, b, and c in the Turing model, and then evaluate the corresponding likelihood?

I rewrote the function as

f(time, a, b, c) = (a * time + sinpi(time / c)) * cospi(b)

and I define the Turing model as

@model function MyModel(obsVals, time, uncertainties)
	# Prior
	a ~ Uniform(1, 2)
	b ~ Uniform(2.5, 3.5)
	c ~ Normal(0, 1)

	# Likelihood: each observation is Normal around the model prediction
	obsVals .~ Normal.(f.(time, a, b, c), uncertainties)
end

It seems like the model works now. I hope this code is correct, but I still can’t feel certain about its correctness… Maybe I should try some more sophisticated models to check whether I’ve got the right idea :neutral_face: Anyway, thank you again!

This looks more likely to be right. Give it a try: call sample and see if you get something reasonable.
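Something like the following sketch (the sampler, the number of draws, and the ground-truth comparison are just illustrative choices, using the obsVals, time, and uncertainties arrays generated earlier):

using Turing, StatsPlots, Statistics

model = MyModel(obsVals, time, uncertainties)
chain = sample(model, NUTS(), 2_000) # HMC-family sampler; 2_000 draws is arbitrary

describe(chain) # Posterior summaries for a, b, and c
plot(chain)     # Trace plots and marginal densities (needs StatsPlots)

# Compare posterior means against the ground truth (a = 1.5, b = 3.2, c = 0.08)
mean(chain[:a]), mean(chain[:b]), mean(chain[:c])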