Importance Sampling in Turing

The summary statistics from importance sampling look very different from those of NUTS, and seem to just summarize the prior distribution:

using Turing
@model function gdemo(x, y)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    x ~ Normal(m, sqrt(s²))
    y ~ Normal(m, sqrt(s²))
end

julia> chn1 = sample(gdemo(1.5, 2), NUTS(), 10_000, progress=false)
┌ Info: Found initial step size
└   ϵ = 3.2
Chains MCMC chain (10000×14×1 Array{Float64, 3}):

Iterations        = 1001:1:11000
Number of chains  = 1
Samples per chain = 10000
Wall duration     = 0.55 seconds
Compute duration  = 0.55 seconds
parameters        = s², m
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64       Float64 

          s²    2.0417    2.0415    0.0323   5093.6770   5398.2521    1.0000     9227.6757
           m    1.1658    0.8129    0.0116   5234.5603   5015.1256    1.0001     9482.8991

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

          s²    0.5564    1.0201    1.4965    2.3454    6.5310
           m   -0.4584    0.6915    1.1598    1.6477    2.8359


julia> chn2 = sample(gdemo(1.5, 2), IS(), 10_000, progress=false)
Chains MCMC chain (10000×3×1 Array{Float64, 3}):

Log evidence      = -3.716418865604326
Iterations        = 1:1:10000
Number of chains  = 1
Samples per chain = 10000
Wall duration     = 0.64 seconds
Compute duration  = 0.64 seconds
parameters        = s², m
internals         = lp

Summary Statistics
  parameters      mean       std      mcse     ess_bulk     ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64      Float64      Float64   Float64       Float64 

          s²    2.9386    4.5679    0.0445   10261.6859    9964.4439    0.9999    15934.2949
           m    0.0107    1.7156    0.0170   10143.6700   10127.8010    1.0001    15751.0404

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

          s²    0.5431    1.1107    1.7913    3.1765   12.6045
           m   -3.3421   -0.9112    0.0133    0.9038    3.4850

I understand that sampling from the prior is part of the importance sampling algorithm, but I would expect the importance weights to be used in the summary to recover the posterior distribution. That is, with draws θᵢ from the prior and weights wᵢ ∝ p(x, y | θᵢ), I'd expect posterior summaries of the form E[f(θ) | x, y] ≈ Σᵢ wᵢ f(θᵢ) / Σᵢ wᵢ.

The weights are not stored, so you'll have to compute them yourself from the samples. IS really is just sampling from the prior.
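
Since the proposal here is the prior, each draw's unnormalized weight is just the likelihood of the observed data under that draw's parameters. Here is a minimal sketch of recomputing the weights and forming self-normalized posterior estimates, assuming chn2 is the IS chain from above (the variable names vs, ms, logw, logZ are mine; Normal and logpdf come from Distributions, which Turing re-exports):

using Turing  # re-exports Distributions (Normal, logpdf)

x, y = 1.5, 2.0

vs = vec(chn2[:s²])   # prior draws of s²
ms = vec(chn2[:m])    # prior draws of m

# Proposal = prior, so the unnormalized log weight of each draw is the
# log-likelihood of the data given that draw's parameters.
logw = [logpdf(Normal(mᵢ, sqrt(vᵢ)), x) + logpdf(Normal(mᵢ, sqrt(vᵢ)), y)
        for (vᵢ, mᵢ) in zip(vs, ms)]

# Self-normalize in log space for numerical stability.
w = exp.(logw .- maximum(logw))
w ./= sum(w)

# Self-normalized importance-sampling estimates of the posterior means.
mean_m  = sum(w .* ms)
mean_s² = sum(w .* vs)

# Sanity check: log evidence estimate, logsumexp(logw) - log(N).
logZ = maximum(logw) + log(sum(exp.(logw .- maximum(logw)))) - log(length(logw))

With the weights applied, the estimates should land near the NUTS posterior summaries above (m ≈ 1.17, s² ≈ 2.04) rather than the prior-like unweighted means in the IS summary table, and logZ should roughly match the Log evidence (≈ -3.72) reported by the chain.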