I slightly modified the gdemo example in Turing.jl to accept more data:
# Import packages.
using Turing
# Define a simple Normal model with unknown mean and variance.
@model function gdemo(x)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    x .~ Normal(m, sqrt(s²))
end
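(For readers less familiar with the broadcasting syntax: to my understanding, the x .~ line treats each element of x as an independent observation, so the model above should be equivalent to the explicit loop below. This is just a sketch for clarity, not part of the original example.)
# Equivalent formulation with an explicit loop over the observations
# (sketch only; the broadcasted version above is what I actually ran).
@model function gdemo_loop(x)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    for i in eachindex(x)
        x[i] ~ Normal(m, sqrt(s²))
    end
end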
I then generated some data (a lot of it, which is key to the observed behaviour):
data = rand(Normal(2,1),10000);
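Just to give a sense of scale (a rough sketch assuming the data vector generated above, not part of the original example), with 10,000 observations the log-likelihood term is orders of magnitude larger than in the tiny gdemo example:
using Distributions
# Log-likelihood of all 10_000 points under, say, a standard normal:
# roughly -3e4, so the target log density (and its gradients) are much
# larger in magnitude than with only a couple of observations.
loglikelihood(Normal(0, 1), data)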
Then I wanted to sample from the posterior:
c3 = sample(gdemo(data), HMC(0.1, 5), 100)
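(For reference, here is the same call with the positional arguments spelled out; as far as I understand, the two arguments of HMC are the fixed leapfrog step size and the number of leapfrog steps per iteration.)
ε = 0.1          # fixed leapfrog step size
n_leapfrog = 5   # number of leapfrog steps per iteration
c3 = sample(gdemo(data), HMC(ε, n_leapfrog), 100)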
This resulted in lots of warning messages:
┌ Warning: The current proposal will be rejected due to numerical error(s).
│ isfinite.((θ, r, ℓπ, ℓκ)) = (true, false, false, false)
└ @ AdvancedHMC ~/.julia/packages/AdvancedHMC/4fByY/src/hamiltonian.jl:49
┌ Warning: The current proposal will be rejected due to numerical error(s).
│ isfinite.((θ, r, ℓπ, ℓκ)) = (true, false, false, false)
└ @ AdvancedHMC ~/.julia/packages/AdvancedHMC/4fByY/src/hamiltonian.jl:49
...
In the end, the sampler gets stuck in one place (every parameter sits at a single value for the entire chain):
Chains MCMC chain (100×11×1 Array{Float64, 3}):
Iterations = 1:1:100
Number of chains = 1
Samples per chain = 100
Wall duration = 0.13 seconds
Compute duration = 0.13 seconds
parameters = s², m
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, step_size, nom_step_size
Summary Statistics
parameters mean std naive_se mcse ess rhat es ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
s² 2.2691 0.0000 0.0000 0.0000 2.0911 0.9899 ⋯
m 0.1653 0.0000 0.0000 0.0000 NaN NaN ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
s² 2.2691 2.2691 2.2691 2.2691 2.2691
m 0.1653 0.1653 0.1653 0.1653 0.1653
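As a quick check (a minimal sketch, assuming the c3 chain from above and MCMCChains' symbol indexing), every draw really is the same value, i.e. the chain never moves from its starting point:
using Statistics
unique(vec(c3[:m]))    # a single value across all 100 iterations
std(vec(c3[:s²]))      # 0.0 — no movement at all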
What is wrong? Why is Turing performing badly when there is a lot of data?