Importance Sampling in Turing

The summary statistics from importance sampling look very different from those of NUTS, and seem to just summarize the prior distribution:

using Turing
@model function gdemo(x, y)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    x ~ Normal(m, sqrt(s²))
    y ~ Normal(m, sqrt(s²))
end

julia> chn1 = sample(gdemo(1.5, 2), NUTS(), 10_000, progress=false)
┌ Info: Found initial step size
└   ϵ = 3.2
Chains MCMC chain (10000×14×1 Array{Float64, 3}):

Iterations        = 1001:1:11000
Number of chains  = 1
Samples per chain = 10000
Wall duration     = 0.55 seconds
Compute duration  = 0.55 seconds
parameters        = s², m
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64       Float64 

          s²    2.0417    2.0415    0.0323   5093.6770   5398.2521    1.0000     9227.6757
           m    1.1658    0.8129    0.0116   5234.5603   5015.1256    1.0001     9482.8991

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

          s²    0.5564    1.0201    1.4965    2.3454    6.5310
           m   -0.4584    0.6915    1.1598    1.6477    2.8359


julia> chn2 = sample(gdemo(1.5, 2), IS(), 10_000, progress=false)
Chains MCMC chain (10000×3×1 Array{Float64, 3}):

Log evidence      = -3.716418865604326
Iterations        = 1:1:10000
Number of chains  = 1
Samples per chain = 10000
Wall duration     = 0.64 seconds
Compute duration  = 0.64 seconds
parameters        = s², m
internals         = lp

Summary Statistics
  parameters      mean       std      mcse     ess_bulk     ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64      Float64      Float64   Float64       Float64 

          s²    2.9386    4.5679    0.0445   10261.6859    9964.4439    0.9999    15934.2949
           m    0.0107    1.7156    0.0170   10143.6700   10127.8010    1.0001    15751.0404

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

          s²    0.5431    1.1107    1.7913    3.1765   12.6045
           m   -3.3421   -0.9112    0.0133    0.9038    3.4850

I understand that sampling from the prior is part of the importance sampling algorithm, but I would expect the importance weights to be used in the summary to recover the posterior distribution. That is, with draws θᵢ from the prior and weights wᵢ ∝ p(x, y | θᵢ), I'd expect posterior summaries of the form E[f(θ) | x, y] ≈ Σᵢ wᵢ f(θᵢ) / Σᵢ wᵢ.

The weights are not stored, so you'll have to compute them yourself from the samples. IS really is just sampling from the prior.
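
Since the proposal here is the prior, each draw's unnormalized weight is just the likelihood of the observed data under that draw's parameters. Here is a minimal sketch of recomputing the weights and forming self-normalized posterior estimates, assuming chn2 is the IS chain from above (the variable names vs, ms, logw, logZ are mine; Normal and logpdf come from Distributions, which Turing re-exports):

using Turing  # re-exports Distributions (Normal, logpdf)

x, y = 1.5, 2.0

vs = vec(chn2[:s²])   # prior draws of s²
ms = vec(chn2[:m])    # prior draws of m

# Proposal = prior, so the unnormalized log weight of each draw is the
# log-likelihood of the data given that draw's parameters.
logw = [logpdf(Normal(mᵢ, sqrt(vᵢ)), x) + logpdf(Normal(mᵢ, sqrt(vᵢ)), y)
        for (vᵢ, mᵢ) in zip(vs, ms)]

# Self-normalize in log space for numerical stability.
w = exp.(logw .- maximum(logw))
w ./= sum(w)

# Self-normalized importance-sampling estimates of the posterior means.
mean_m  = sum(w .* ms)
mean_s² = sum(w .* vs)

# Sanity check: log evidence estimate, logsumexp(logw) - log(N).
logZ = maximum(logw) + log(sum(exp.(logw .- maximum(logw)))) - log(length(logw))

With the weights applied, the estimates should land near the NUTS posterior summaries above (m ≈ 1.17, s² ≈ 2.04) rather than the prior-like unweighted means in the IS summary table, and logZ should roughly match the Log evidence (≈ -3.72) reported by the chain.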