 # A simple MCMC code for beginners for learning

If you try to understand how Metropolis-Hastings MCMC works in producing a posterior for a parameter. You can read various website like this

But unless you understand R very well, it would make your head spin. But you can understand it easily with this simple julia code. You would need of course to understand some simple Bayesian theory beforehand.

``````using Distributions
using StatsPlots

# We perform 9 coin tosses and obtain 6 heads and 3 tails.
# What is the probability density function for obtaining a Head?
tails = 3

# define the number of points used (Not strictly required for MCMC)
numofpoints = 1001

# define probability grid from 0.0 to 1.0 with 1001 points
# (Not strictly required for MCMC)
p_grid = [ k  for k in range(0,stop=1,length=numofpoints) ]
p_grid_lb = p_grid      # lower bound ( required for MCMC )
p_grid_ub = p_grid[end]    # upper bound ( required for MCMC )

# define our prior which is a uniform prior ( required for MCMC )
prior = [ 1.0  for _ in 1:numofpoints ]

# define the binomial distribution
function dbinom(n::Integer,size::Integer,p::Real)
binomial(size, n) * p^n * (1.0-p)^(size-n)
end

# define the number of samples used in MCMC
# And create array p with that many elements (filled with 0.0)
n_samples = 1_000_000
p = zeros(n_samples)
# Set the starting value to a random number between 0.0 and 1.0
# strictly speaking it should be between p_grid_lb and p_grid_ub
p = rand()
JumpingWidth = 0.1
for i in 2:n_samples
# p_new has a value around the vicinity of p[i-1]
p_new = rand(  Normal( p[i-1] , JumpingWidth )  )
# make sure p_new is between 0.0 and 1.0
if p_new < p_grid_lb      # if p_new < 0.0
p_new = p_grid_lb + abs( p_new - p_grid_lb )   # abs( p_new )
elseif p_new > p_grid_ub  # if p_new > 1.0
p_new = p_grid_ub - abs( p_new - p_grid_ub )   # 2.0 - p_new
end
# Calc the probability of obtaining Heads and Tails
# assuming that probability = p[i-1] and probability = p_new
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
" In our cases, the prior is always 1.0          "
" or calc prior as prior[1+ceil(p*(numofpoints-1))] "
" or calc prior as Distributions.pdf(Distributions.Beta(1, 1),p) "
" where p is the probability "
" and the likelihood is dbinom( heads , heads+tails , probability ) "
# The value of p[i] depends on whether the
# random number is less than q1/q0
p[i] = rand() < q1/q0 ? p_new : p[i-1]
end

# Next we need to display the histogram of the array p
# this is the shape of the posterior produced by MCMC
histogram(p,legend=false) |> display
``````

10 Likes

Exactly, I have used R for many years but still find the Julia code in StatisticalRethinking.jl more straightforward to understand

1 Like

Here is the equivalent code for Turing

``````# Import libraries.
using Turing
using Distributions, StatsPlots

# 9 coin tosses with 6 heads and 3 tails
data = [1, 0, 1, 0, 1, 1, 0, 1, 1]

# Now we use Turing to find the unknown parameter p
# Declare our Turing model.
@model coinflip(data) = begin
# Our prior belief about the unknown parameter p which is
# the probability of heads in a coin.
p ~ Beta(1, 1)

# The number of observations.
N = length(data)
for n in 1:N
# Heads or tails of a coin are drawn using
# a process which have a Bernoulli distribution
data[n] ~ Bernoulli(p)
end
end;

# Settings of the Hamiltonian Monte Carlo (HMC) sampler.
# If you never used HMC before then do not change the value
# of ϵ and τ from 0.05 and 10
samples = 1_000
ϵ = 0.05 # leapfrog step size
τ = 10   # leapfrog step numbers

# Start sampling.
chain = sample(coinflip(data), HMC(ϵ, τ), samples);

# Summarise results
println(  describe(chain)  )

# Plot a summary of the sampling process for the parameter p, i.e. the probability of heads in a coin.
display(  histogram(chain[:p])  )

``````
3 Likes

And now may I present the source code for MCMC for estimating TWO parameters!!!

``````using Distributions
using StatsPlots

# Here are the weights of the students (in kilograms) in a classrom
# Assuming the weight of the students are generated by a biological
# process with a normal distribution with mean μ and standard dev σ
# find the posterior distribution of those two parameters
data = [ 55.8, 64.1, 61.2, 55.2, 58.0, 69.8, 53.2, 70.0, 63.3, 63.5 ]

# define our prior for μ as a normal centered on 60 ( required for MCMC )
prior_μ_distr = Distributions.Normal(60, 10)
prior_μ(p) = Distributions.pdf(prior_μ_distr,p)
μ_lb = 30
μ_ub = 90

# define our prior for σ as a exponential with mean of 20 ( required for MCMC )
prior_σ_distr = Distributions.Exponential(20)
prior_σ(p) = Distributions.pdf(prior_σ_distr,p)
σ_lb = 0
σ_ub = 50

# define the number of samples used in MCMC
# And create array p with that many elements (filled with 0.0)
n_samples = 1_000_000
p_μ = zeros(n_samples)
p_σ = zeros(n_samples)
# Set the starting value to a random number between lb and ub
p_μ = rand(prior_μ_distr)
while p_μ < μ_lb || p_μ > μ_ub
p_μ = rand(prior_μ_distr)
end

p_σ = rand(prior_σ_distr)
while p_σ < σ_lb || p_σ > σ_ub
p_σ = rand(prior_σ_distr)
end

# The likelihood function which returns the probability,
# given a fixed mu and a fixed sigma, of the sequence of datapoints
function likelihood(data::Vector,mu::Real,sigma::Real)
numofdatapoints = length(data)
result=1.0
for k = 1:numofdatapoints
result *= pdf(Normal(mu, sigma),data[k])
end
return result
end

JumpingWidth = 0.1
for i in 2:n_samples
# p_new has a value around the vicinity of p[i-1]
p_μ_new = rand(  Normal( p_μ[i-1] , JumpingWidth*(μ_ub-μ_lb) )  )
# make sure p_μ_new is between μ_lb and μ_ub
if p_μ_new < μ_lb
p_μ_new = μ_lb + abs( p_μ_new - μ_lb )
elseif p_μ_new > μ_ub
p_μ_new = μ_ub - abs( p_μ_new - μ_ub )
end

p_σ_new = rand(  Normal( p_σ[i-1] , JumpingWidth*(σ_ub-σ_lb) )  )
# make sure p_σ_new is between σ_lb  and σ_ub
if p_σ_new < σ_lb
p_σ_new = σ_lb + abs( p_σ_new - σ_lb )
elseif p_σ_new > σ_ub
p_σ_new = σ_ub - abs( p_σ_new - σ_ub )
end
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
priorμ0 = prior_μ(p_μ[i-1])
priorσ0 = prior_σ(p_σ[i-1])
priorμ1 = prior_μ(p_μ_new)
priorσ1 = prior_σ(p_σ_new)

q0 = likelihood(data, p_μ[i-1], p_σ[i-1]) * priorμ0 * priorσ0
q1 = likelihood(data, p_μ_new, p_σ_new) * priorμ1 * priorσ1
# The value of p[i] depends on whether the
# random number is less than q1/q0
if rand() < q1/q0
p_μ[i] = p_μ_new
p_σ[i] = p_σ_new
else
p_μ[i] = p_μ[i-1]
p_σ[i] = p_σ[i-1]
end

end

# Next we need to display the histogram of the array p
# this is the shape of the posterior produced by MCMC
histogram(p_μ,legend=false,title="Posterior of mu") |> display
histogram(p_σ,legend=false,title="Posterior of sigma") |> display

``````
1 Like

Because we are doing lots and lots and lots of multiplication, even Float64 can run out of room. Therefore most MCMC programs uses Logarithm and this ensures that Float64 will not hit its limitation.

Source code below uses Logarithm

``````using Distributions
using StatsPlots

# Here are the weights of the students (in kilograms) in a classrom
# Assuming the weight of the students are generated by a biological
# process with a normal distribution with mean μ and standard dev σ
# find the posterior distribution of those two parameters
data = [ 55.8, 64.1, 61.2, 55.2, 58.0, 69.8, 53.2, 70.0, 63.3, 63.5 ]

# define our prior for μ as a normal centered on 60 ( required for MCMC )
prior_μ_distr = Distributions.Normal(60, 10)
prior_μ(p) = Distributions.pdf(prior_μ_distr,p)
μ_lb = 30
μ_ub = 90

# define our prior for σ as a exponential with mean of 20 ( required for MCMC )
prior_σ_distr = Distributions.Exponential(20)
prior_σ(p) = Distributions.pdf(prior_σ_distr,p)
σ_lb = 0
σ_ub = 50

# define the number of samples used in MCMC
# And create array p with that many elements (filled with 0.0)
n_samples = 1_000_000
p_μ = zeros(n_samples)
p_σ = zeros(n_samples)
# Set the starting value to a random number between lb and ub
p_μ = rand(prior_μ_distr)
while p_μ < μ_lb || p_μ > μ_ub
p_μ = rand(prior_μ_distr)
end

p_σ = rand(prior_σ_distr)
while p_σ < σ_lb || p_σ > σ_ub
p_σ = rand(prior_σ_distr)
end

log10(x) = log(10,x)

# The likelihood function which returns the probability,
# given a fixed mu and a fixed sigma, of the sequence of datapoints
function loglikelihood(data::Vector,mu::Real,sigma::Real)
numofdatapoints = length(data)
result=0.0
for k = 1:numofdatapoints
result += log10(  pdf(Normal(mu, sigma),data[k])  )
end
return result
end

JumpingWidth = 0.1
for i in 2:n_samples
# p_new has a value around the vicinity of p[i-1]
p_μ_new = rand(  Normal( p_μ[i-1] , JumpingWidth*(μ_ub-μ_lb) )  )
# make sure p_μ_new is between μ_lb and μ_ub
if p_μ_new < μ_lb
p_μ_new = μ_lb + abs( p_μ_new - μ_lb )
elseif p_μ_new > μ_ub
p_μ_new = μ_ub - abs( p_μ_new - μ_ub )
end

p_σ_new = rand(  Normal( p_σ[i-1] , JumpingWidth*(σ_ub-σ_lb) )  )
# make sure p_σ_new is between σ_lb  and σ_ub
if p_σ_new < σ_lb
p_σ_new = σ_lb + abs( p_σ_new - σ_lb )
elseif p_σ_new > σ_ub
p_σ_new = σ_ub - abs( p_σ_new - σ_ub )
end
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
logpriorμ0 = log10(  prior_μ(p_μ[i-1])  )
logpriorσ0 = log10(  prior_σ(p_σ[i-1])  )
logpriorμ1 = log10(  prior_μ(p_μ_new)   )
logpriorσ1 = log10(  prior_σ(p_σ_new)   )

q0 = loglikelihood(data, p_μ[i-1], p_σ[i-1]) + logpriorμ0 + logpriorσ0
q1 = loglikelihood(data, p_μ_new, p_σ_new) + logpriorμ1 + logpriorσ1
# The value of p[i] depends on whether the
# random number is less than q1/q0
if log10(rand()) < q1-q0
p_μ[i] = p_μ_new
p_σ[i] = p_σ_new
else
p_μ[i] = p_μ[i-1]
p_σ[i] = p_σ[i-1]
end

end

# Next we need to display the histogram of the array p
# this is the shape of the posterior produced by MCMC
histogram(p_μ,legend=false,title="Posterior of mu") |> display
histogram(p_σ,legend=false,title="Posterior of sigma") |> display

``````

Here is the source code for estimating 3 parameters. I am not writing any more source code because writing it is tiring and tedious. I am only doing it as a proof of concept and to help others figure out how the magic of MCMC is performed behind the scenes.

In real life, please use Turing or STAN

``````using Distributions
using StatsPlots

# Here are the heights and weights of babies (in metres and kilograms)
# Assuming the weight of the babies are generated by a biological
# process with a normal distribution with mean μ and standard dev σ
# Weight ~ Normal(μ,σ)
#          μ = α + β Height
# find the posterior distribution of those three parameters α , β , σ
# data[rownum,1] is Height (in metres)
# data[rownum,2] is Weight/Mass (in kilograms)
data = [
0.46 2.5;
0.51 3.4;
0.54 4.4;
0.57 5.1;
0.60 5.6;
0.62 6.1;
0.64 6.4;
0.65 6.7;
0.66 7.0;
0.68 7.2]

# rough estimate of α and β
# julia>  [ 1 0.6; 1 0.62 ] \ [5.6; 6.1]
# 2-element Array{Float64,1}:
#  -9.399999999999986
#  24.99999999999998

# define our prior for α as a uniform centered on -9.4 ( required for MCMC )
prior_α_distr = Distributions.Uniform(-20, 0)
prior_α(p) = Distributions.pdf(prior_α_distr,p)
α_lb = -20
α_ub = 0

# define our prior for β as a normal centered on 25 ( required for MCMC )
prior_β_distr = Distributions.Normal(25, 20)
prior_β(p) = Distributions.pdf(prior_β_distr,p)
β_lb = -35
β_ub = 65

# define our prior for σ as a exponential with mean of 2 ( required for MCMC )
prior_σ_distr = Distributions.Exponential(2)
prior_σ(p) = Distributions.pdf(prior_σ_distr,p)
σ_lb = 0
σ_ub = 10

# define the number of samples used in MCMC
# And create array p with that many elements (filled with 0.0)
n_samples = 1_000_000
p_α = zeros(n_samples)
p_β = zeros(n_samples)
p_σ = zeros(n_samples)
# Set the starting value to a random number between lb and ub
p_α = rand(prior_α_distr)
while p_α < α_lb || p_α > α_ub
p_α = rand(prior_α_distr)
end

p_β = rand(prior_β_distr)
while p_β < β_lb || p_β > β_ub
p_β = rand(prior_β_distr)
end

p_σ = rand(prior_σ_distr)
while p_σ < σ_lb || p_σ > σ_ub
p_σ = rand(prior_σ_distr)
end

log10(x) = log(10,x)

# The likelihood function which returns the probability,
# given a fixed mu and a fixed sigma, of the sequence of datapoints
function loglikelihood(data::Matrix,alpha::Real,beta::Real,sigma::Real)
numofdatapoints = size(data,1)
result=0.0
for k = 1:numofdatapoints
# μ = α + β * Height
mu = alpha + beta * data[k,1]
result += log10(  pdf(Normal(mu, sigma),data[k,2])  )
end
return result
end

JumpingWidth = 0.1
for i in 2:n_samples
# p_new has a value around the vicinity of p[i-1]
p_α_new = rand(  Normal( p_α[i-1] , JumpingWidth*(α_ub-α_lb) )  )
# make sure p_α_new is between α_lb and α_ub
if p_α_new < α_lb
p_α_new = α_lb + abs( p_α_new - α_lb )
elseif p_α_new > α_ub
p_α_new = α_ub - abs( p_α_new - α_ub )
end

# p_new has a value around the vicinity of p[i-1]
p_β_new = rand(  Normal( p_β[i-1] , JumpingWidth*(β_ub-β_lb) )  )
# make sure p_β_new is between β_lb and β_ub
if p_β_new < β_lb
p_β_new = β_lb + abs( p_β_new - β_lb )
elseif p_β_new > β_ub
p_β_new = β_ub - abs( p_β_new - β_ub )
end

p_σ_new = rand(  Normal( p_σ[i-1] , JumpingWidth*(σ_ub-σ_lb) )  )
# make sure p_σ_new is between σ_lb  and σ_ub
if p_σ_new < σ_lb
p_σ_new = σ_lb + abs( p_σ_new - σ_lb )
elseif p_σ_new > σ_ub
p_σ_new = σ_ub - abs( p_σ_new - σ_ub )
end
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
logpriorα0 = log10(  prior_α(p_α[i-1])  )
logpriorβ0 = log10(  prior_β(p_β[i-1])  )
logpriorσ0 = log10(  prior_σ(p_σ[i-1])  )
logpriorα1 = log10(  prior_α(p_α_new)   )
logpriorβ1 = log10(  prior_β(p_β_new)   )
logpriorσ1 = log10(  prior_σ(p_σ_new)   )

q0 = loglikelihood(data, p_α[i-1], p_β[i-1], p_σ[i-1]) + logpriorα0 + logpriorβ0 + logpriorσ0
q1 = loglikelihood(data, p_α_new, p_β_new, p_σ_new) + logpriorα1 + logpriorβ1 + logpriorσ1
# The value of p[i] depends on whether the
# random number is less than q1/q0
if log10(rand()) < q1-q0
p_α[i] = p_α_new
p_β[i] = p_β_new
p_σ[i] = p_σ_new
else
p_α[i] = p_α[i-1]
p_β[i] = p_β[i-1]
p_σ[i] = p_σ[i-1]
end

end

# Next we need to display the histogram of the array p
# this is the shape of the posterior produced by MCMC
histogram(p_α,legend=false,title="Posterior of alpha") |> display
histogram(p_β,legend=false,title="Posterior of beta") |> display
histogram(p_σ,legend=false,title="Posterior of sigma") |> display

``````

Thanks for putting these examples together. In order to achieve better numerical stability, I recommend using `logpdf(...)` instead of ` log10(pdf(...))`. The formulas for `logpdf(...)` are usually designed to be more numerically stable than `log(pdf(...))`.

1 Like

This is last post. There are two more innovations

1. standardise the rawdata by mean and standard deviation of the rawdata
2. use a burned in sampling of 1000 samples
``````using Distributions, Statistics
using StatsPlots

# Here are the heights and weights of babies (in metres and kilograms)
# Assuming the weight of the babies are generated by a biological
# process with a normal distribution with mean μ and standard dev σ
# (Weight - weight_mean)/rel_weight_stddev ~ Normal(μ,σ)
#          μ = α + β (Height - Height_mean)/rel_height_stddev
#
# Weight = weight_mean + rel_weight_stddev * Normal(μ,σ)
#      μ = α + β (Height - Height_mean)/rel_height_stddev
# find the posterior distribution of those three parameters α , β , σ
rawdata = [
0.46 2.5;
0.51 3.4;
0.54 4.4;
0.57 5.1;
0.60 5.6;
0.62 6.1;
0.64 6.4;
0.65 6.7;
0.66 7.0;
0.68 7.2]

height_mean = mean(rawdata[:,1])
weight_mean = mean(rawdata[:,2])
relativedata = rawdata .- [height_mean weight_mean]
rel_height_stddev = std(relativedata[:,1])
rel_weight_stddev = std(relativedata[:,2])
data = relativedata ./ [rel_height_stddev rel_weight_stddev]

# data[rownum,1] is Height (in Z score)
# data[rownum,2] is Weight/Mass (in Z score)
# julia> data
# 10×2 Array{Float64,2}:
#  -1.85612    -1.85925
#  -1.15833    -1.29009
#  -0.739655   -0.657695
#  -0.320982   -0.215016
#   0.0976903   0.101184
#   0.376805    0.417384
#   0.65592     0.607103
#   0.795478    0.796823
#   0.935035    0.986543
#   1.21415     1.11302

# rough estimate of α and β
#julia> [ 1 0.0976; 1 0.376] \ [ 0.10; 0.417]
#2-element Array{Float64,1}:
# -0.011132183908045976
#  1.1386494252873562

# define our prior for α as a uniform centered on 0.0 ( required for MCMC )
prior_α_distr = Distributions.Uniform(-10, 10)
prior_α(p) = Distributions.pdf(prior_α_distr,p)
α_lb = -10
α_ub = 10

# define our prior for β as a normal centered on 1 ( required for MCMC )
prior_β_distr = Distributions.Normal(1, 1)
prior_β(p) = Distributions.pdf(prior_β_distr,p)
β_lb = -2
β_ub = 4

# define our prior for σ as a exponential with mean of 2 ( required for MCMC )
prior_σ_distr = Distributions.Exponential(1)
prior_σ(p) = Distributions.pdf(prior_σ_distr,p)
σ_lb = 0
σ_ub = 5

# define the number of samples used in MCMC
# And create array p with that many elements (filled with 0.0)
burnedinsamples = 1000
n_samples = 1_000_000 + burnedinsamples
p_α = zeros(n_samples)
p_β = zeros(n_samples)
p_σ = zeros(n_samples)
# Set the starting value to a random number between lb and ub
p_α = rand(prior_α_distr)
while p_α < α_lb || p_α > α_ub
p_α = rand(prior_α_distr)
end

p_β = rand(prior_β_distr)
while p_β < β_lb || p_β > β_ub
p_β = rand(prior_β_distr)
end

p_σ = rand(prior_σ_distr)
while p_σ < σ_lb || p_σ > σ_ub
p_σ = rand(prior_σ_distr)
end

log10(x) = log(10,x)

# The likelihood function which returns the probability,
# given a fixed mu and a fixed sigma, of the sequence of datapoints
function loglikelihood(data::Matrix,alpha::Real,beta::Real,sigma::Real)
numofdatapoints = size(data,1)
result=0.0
for k = 1:numofdatapoints
# μ = α + β * Height
mu = alpha + beta * data[k,1]
result += log10(  pdf(Normal(mu, sigma),data[k,2])  )
end
return result
end

JumpingWidth = 0.1
for i in 2:n_samples
# p_new has a value around the vicinity of p[i-1]
p_α_new = rand(  Normal( p_α[i-1] , JumpingWidth*(α_ub-α_lb) )  )
# make sure p_α_new is between α_lb and α_ub
if p_α_new < α_lb
p_α_new = α_lb + abs( p_α_new - α_lb )
elseif p_α_new > α_ub
p_α_new = α_ub - abs( p_α_new - α_ub )
end

# p_new has a value around the vicinity of p[i-1]
p_β_new = rand(  Normal( p_β[i-1] , JumpingWidth*(β_ub-β_lb) )  )
# make sure p_β_new is between β_lb and β_ub
if p_β_new < β_lb
p_β_new = β_lb + abs( p_β_new - β_lb )
elseif p_β_new > β_ub
p_β_new = β_ub - abs( p_β_new - β_ub )
end

p_σ_new = rand(  Normal( p_σ[i-1] , JumpingWidth*(σ_ub-σ_lb) )  )
# make sure p_σ_new is between σ_lb  and σ_ub
if p_σ_new < σ_lb
p_σ_new = σ_lb + abs( p_σ_new - σ_lb )
elseif p_σ_new > σ_ub
p_σ_new = σ_ub - abs( p_σ_new - σ_ub )
end
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
logpriorα0 = log10(  prior_α(p_α[i-1])  )
logpriorβ0 = log10(  prior_β(p_β[i-1])  )
logpriorσ0 = log10(  prior_σ(p_σ[i-1])  )
logpriorα1 = log10(  prior_α(p_α_new)   )
logpriorβ1 = log10(  prior_β(p_β_new)   )
logpriorσ1 = log10(  prior_σ(p_σ_new)   )

q0 = loglikelihood(data, p_α[i-1], p_β[i-1], p_σ[i-1]) + logpriorα0 + logpriorβ0 + logpriorσ0
q1 = loglikelihood(data, p_α_new, p_β_new, p_σ_new) + logpriorα1 + logpriorβ1 + logpriorσ1
# The value of p[i] depends on whether the
# random number is less than q1/q0
if log10(rand()) < q1-q0
p_α[i] = p_α_new
p_β[i] = p_β_new
p_σ[i] = p_σ_new
else
p_α[i] = p_α[i-1]
p_β[i] = p_β[i-1]
p_σ[i] = p_σ[i-1]
end

end

# Finally we must not forget to remove the burned in samples
deleteat!(p_α,1:burnedinsamples)
deleteat!(p_β,1:burnedinsamples)
deleteat!(p_σ,1:burnedinsamples)

# Next we need to display the histogram of the array p
# this is the shape of the posterior produced by MCMC
histogram(p_α,legend=false,title="Posterior of alpha") |> display
histogram(p_β,legend=false,title="Posterior of beta") |> display
histogram(p_σ,legend=false,title="Posterior of sigma") |> display

``````

I would strongly suggest making functions for each section of the code. This should be both easier to digest and run much faster!

4 Likes

dpsanders,

Thank you for your input. I may rework the code and put them in functions for functional style programming. As these code are for beginners to understand. Faster performance is not an objective, rather ability for beginners to read and understand the code is the number one objective

In my opinion, having the code in digestible, small chunks inside functions with well chosen names makes any code much more readable and understandable.

2 Likes

In Soss, you would do

``````julia> using Soss

julia> data = [1, 0, 1, 0, 1, 1, 0, 1, 1];

julia> coinflip = @model N begin
p ~ Beta(1,1)
data ~ Bernoulli(p) |> iid(N)
end;
``````

Then (if you didn’t have data) you could generate some with

``````julia> rand(coinflip(N=10))
(N = 10, p = 0.5585410994609438, data = Bool[0, 0, 0, 1, 1, 1, 0, 1, 1, 0])
``````

Now for sampling

``````julia> post = dynamicHMC(coinflip(N=10), (data=data,)) |> particles
(p = 0.649 ± 0.13,)
``````
4 Likes

I have simplify the source code and place it inside a module called SimpleMCMC.jl
I also also implemented Gibbs_MCMC in addition to MetropolisHastings_MCMC
Hopefully this source code would allow beginners and newbies to figure out how basic MCMC works behind the scenes.

FILE: SimpleMCMC.jl

``````module SimpleMCMC
export MetropolisHastings_MCMC
export Gibbs_MCMC
export describe_paramvec

using Distributions, StatsPlots, Statistics, Printf

# Version 6
#  v6: Add Gibbs_MCMC and change module name from MHMCMC to SimpleMCMC
#  v5: turn it into a module
#  v4: remove prior from Parameter struct and the struct itself
#  v3: create JumpingWidthVec, lbVec, ubVec  to improve efficiency
#  v2: Remove q0 from the main loop to improve efficiency
#  v1: Turn it into a version that uses pure function(s)

function MetropolisHastings_MCMC(data,lop::Vector{Distribution},loglikelihood;samples,burnedinsamples,JumpingWidth=0.01)
# Find the number of parameters
numofparam = length(lop)
# calc the n_samples
n_samples = samples + burnedinsamples
# Create a parameter Matrix p with n_samples rows and numofparam cols
p = zeros(n_samples,numofparam)
# JumpingWidthVector, lowerboundVector and upperboundbVector
JumpingWidthVec = zeros(numofparam)
lbVec = zeros(numofparam)
ubVec = zeros(numofparam)
# Create a Vector of our logprior function
logpriorVec = Array{Function}(undef,numofparam)

# Set the starting value to a random number between lb and ub
for k = 1:numofparam  # for each parameter k
# Set the starting value to a random number between lb and ub
p[1,k] = rand(lop[k])

# get 100 random values and find the practical max and practical min
HundredRandValues = rand(lop[k],100)
prac_min = minimum(HundredRandValues)
prac_max = maximum(HundredRandValues)
JumpingWidthVec[k] = JumpingWidth * (prac_max - prac_min)

lbVec[k] = minimum(lop[k])
ubVec[k] = maximum(lop[k])

# Array of function prior(x) = pdf(distribution,x)
logpriorVec[k] = x->logpdf(lop[k],x)
end

p_old = vec(p[1,:])
q0 = loglikelihood(data,p_old...) + sum([logpriorVec[k](p_old[k]) for k = 1:numofparam ])

# prepare the p_new array
p_new = zeros(numofparam)
# Main loop for MetropolisHastings MCMC
for i = 2:n_samples
# p_new has a value around the vicinity of p[i-1]
for k = 1:numofparam  # for each parameter k
# p_new has a value around the vicinity of p[i-1]
p_new[k] = rand(  Normal( p[i-1,k] , JumpingWidthVec[k] )  )
# make sure p_new is between lb and ub
if p_new[k] < lbVec[k]
p_new[k] = lbVec[k] + abs( p_new[k] - lbVec[k] )
elseif p_new[k] > ubVec[k]
p_new[k] = ubVec[k] - abs( p_new[k] - ubVec[k] )
end
end

# Calc the two posterior
" q0 is posterior0 = likelihood0 * prior0        "
" q1 is posterior1 = likelihood1 * prior1        "
q1 = loglikelihood(data,p_new...) + sum([logpriorVec[k](p_new[k]) for k = 1:numofparam ])
# The value of p[i] depends on whether the
# random number is less than q1/q0
p[i,:] .= log(rand()) < q1-q0 ? (q0=q1; p_old=p_new[:]) : p_old
end
# Finally we must not forget to remove the burned in samples
return p[(burnedinsamples+1):end,:]
end

function Gibbs_MCMC(data,lop::Vector{Distribution},loglikelihood;samples,burnedinsamples)
# Find the number of parameters
numofparam = length(lop)
# calc the n_samples
n_samples = samples + burnedinsamples
# Create a parameter Matrix p with n_samples rows and numofparam cols
p = zeros(n_samples,numofparam)
# Create a Vector of our logprior function
logpriorVec = Array{Function}(undef,numofparam)
# Number of holes to drill
numofdrillholes = 1001
drillhole = Array{Float64}(undef,numofdrillholes)
invcdf = Array{Float64}(undef,numofdrillholes+1)
# prepare the vectors: startposvec stopposvec chunksizevec
startposvec = zeros(numofparam)
stopposvec = zeros(numofparam)
chunksizevec = zeros(numofparam)

# Set the starting value to a random number between lb and ub
for k = 1:numofparam  # for each parameter k
# Set the starting value to a random number between lb and ub
p[1,k] = rand(lop[k])

# get 400 random values and find the practical max and practical min
FourHundredRandValues = rand(lop[k],400)
startposvec[k] = minimum(FourHundredRandValues)
stopposvec[k]  = maximum(FourHundredRandValues)
chunksizevec[k] = (stopposvec[k]-startposvec[k])/(numofdrillholes-1)

# Array of function prior(x) = pdf(distribution,x)
logpriorVec[k] = x->logpdf(lop[k],x)
end

# Main loop for Gibbs MCMC
for i = 2:n_samples
# prepare the p_new vector
p_new = vec(p[i-1,:])
# Select a new value for each parameter of p_new
for k = 1:numofparam  # for each parameter k
# Calculate chunk size
startpos,stoppos,chunksize = startposvec[k],stopposvec[k],chunksizevec[k]
invcdf = 0.0
# Start drilling the drill holes for parameter k
for n = 1:numofdrillholes
p_new[k] = startpos + (n-1) * chunksize
drillhole[n] = exp(  loglikelihood(data,p_new...)  )
invcdf[n+1] = invcdf[n] + drillhole[n]
end
invcdf /= invcdf[end]
# Now we perform an inverse CDF sampling
x = 0.0
while x == 0.0
x = rand()  # Make sure x is not zero
end
counter = 2
while !(invcdf[counter-1] < x <= invcdf[counter])
counter += 1
end
counter -= 1
# now find the kth parameter value
pos = startpos + (counter-1) * chunksize
p_new[k] = pos
p[i,k] = pos
end
end
# Finally we must not forget to remove the burned in samples
return p[(burnedinsamples+1):end,:]
end

function describe_paramvec(name,namestr,v::Vector)
m,s,v_sorted = mean(v),std(v),sort(v)
pc5 = v_sorted[Int64(round(0.055*length(v_sorted)))]
pc94 = v_sorted[Int64(round(0.945*length(v_sorted)))]
println(
"Parameter \$(name)   mean ",@sprintf("%6.3f",round(m,sigdigits=3)),
"   sd ",@sprintf("%6.3f",round(s,sigdigits=3)),
"   5.5% ",@sprintf("%6.3f",round(pc5,sigdigits=3)),
"   94.5% ",@sprintf("%6.3f",round(pc94,sigdigits=3))
)
histogram(v,legend=false,title="Posterior of \$(namestr)") |> display
return (m,s,pc5,pc94)
end

end
``````

Here is an example code on how to use it

``````using Distributions, StatsPlots, Statistics
using SimpleMCMC

function MCMC_example()

# Here are the weights of the students (in kilograms) in a classrom
# Assuming the weight of the students are generated by a biological
# process with a normal distribution with mean μ and standard dev σ
# find the posterior distribution of those two parameters
data = [ 55.8, 64.1, 61.2, 55.2, 58.0, 69.8, 53.2, 70.0, 63.3, 63.5 ]

# define our prior for μ as a normal centered on 60 ( required for MCMC )
μ = Normal(60, 10)
# define our prior for σ as a exponential with mean of 20 ( required for MCMC )
σ = Exponential(20)

# Put all our parameters into a list
ListOfParams = Distribution[μ,σ]

# The loglikelihood function which returns the log(probability),
# given a fixed mu and a fixed sigma, of the sequence of datapoints
function loglikelihood(data::Vector,mu,sigma)
numofdatapoints = length(data)
result=0.0
for k = 1:numofdatapoints
result += logpdf(Normal(mu,sigma),data[k])
end
return result
end

p = MetropolisHastings_MCMC(
data,
ListOfParams,
loglikelihood,
samples=1_000_000,
burnedinsamples=5000
)

# Next we need to display the histogram of the column p[:,1]
# this is the shape of the posterior produced by MCMC
vec_μ=vec(p[:,1])  # Convert a Matrix column into a 1-D Vector
vec_σ=vec(p[:,2])  # Convert a Matrix column into a 1-D Vector
describe_paramvec("μ","mu",vec_μ)
describe_paramvec("σ","sigma",vec_σ)
histogram2d(vec_μ,vec_σ,nbins=100,xlabel="mu",ylabel="sigma")
end

# We call the main function to kick off the example
MCMC_example()
``````

Here is the text output

``````Starting Julia...
_
_       _ _(_)_     |  Documentation: https://docs.julialang.org
(_)     | (_) (_)    |
_ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _  |  |
| | |_| | | | (_| |  |  Version 1.3.0 (2019-11-26)
_/ |\__ _|_|_|\__ _|  |  Official https://julialang.org/ release
|__/                   |

[ Info: Precompiling SimpleMCMC [top-level]
Parameter μ   mean 61.300   sd  2.130   5.5% 57.900   94.5% 64.700
Parameter σ   mean  6.680   sd  1.850   5.5%  4.440   94.5%  9.960
``````

Here are the three graphs produced by the program

1 Like