How to draw simulated data from a given turing model given specific parameters

neuro_enthusiast · June 27, 2022, 4:19pm

I was wondering how I can simulate data given parameter values using a turing model? Any help would be great!

ex.

using Turing, Distributions
@model function logistic_regression(x, y, n, σ)
    intercept ~ Normal(0, σ)

    student ~ Normal(0, σ)
    balance ~ Normal(0, σ)
    income ~ Normal(0, σ)

    for i in 1:n
        v = logistic(intercept + student * x[i, 1] + balance * x[i, 2] + income * x[i, 3])
        y[i] ~ Bernoulli(v)
    end
end;

I would like to generate y after setting student =1, balance = 1, intercept = 1, income = 1 and x = [1,1,1] without having to re-code the model.

jae0 · June 28, 2022, 7:26pm

Don’t think you need Turing for that.

All your params are fixed, so just compute v.

Then,

using Distributions
rand( Bernoulli(v), 10) # or however many samples you want

neuro_enthusiast · June 29, 2022, 12:47am

I’m just using Bayesian logistic regression as an example. My actual model is hierarchical. I’m trying to do parameter recovery from simulated data to check the fitting process. Life would be much easier if I can utilize the turing model for data simulation. Functions for prior predictive checks would help if you can direct me to them but ideally I’d like functions where I can simulate data with parameters that I chose. Thanks for the suggestion though.

jae0 · July 2, 2022, 3:04pm

ok, sounds like you want to use Turing’s predict function:

copied from their help:

julia> using Turing; Turing.setprogress!(false);
  [ Info: [Turing]: progress logging is disabled globally
  
  julia> @model function linear_reg(x, y, σ = 0.1)
             β ~ Normal(0, 1)
  
             for i ∈ eachindex(y)
                 y[i] ~ Normal(β * x[i], σ)
             end
         end;
  
  julia> σ = 0.1; f(x) = 2 * x + 0.1 * randn();
  
  julia> Δ = 0.1; xs_train = 0:Δ:10; ys_train = f.(xs_train);
  
  julia> xs_test = [10 + Δ, 10 + 2 * Δ]; ys_test = f.(xs_test);
  
  julia> m_train = linear_reg(xs_train, ys_train, σ);
  
  julia> chain_lin_reg = sample(m_train, NUTS(100, 0.65), 200);
  ┌ Info: Found initial step size
  └   ϵ = 0.003125
  
  julia> m_test = linear_reg(xs_test, Vector{Union{Missing, Float64}}(undef, length(ys_test)), σ);
  
  julia> predictions = predict(m_test, chain_lin_reg)
  Object of type Chains, with data of type 100×2×1 Array{Float64,3}
  
  Iterations        = 1:100
  Thinning interval = 1
  Chains            = 1
  Samples per chain = 100
  parameters        = y[1], y[2]
  
  2-element Array{ChainDataFrame,1}
  
  Summary Statistics
    parameters     mean     std  naive_se     mcse       ess   r_hat
    ──────────  ───────  ──────  ────────  ───────  ────────  ──────
          y[1]  20.1974  0.1007    0.0101  missing  101.0711  0.9922
          y[2]  20.3867  0.1062    0.0106  missing  101.4889  0.9903
  
  Quantiles
    parameters     2.5%    25.0%    50.0%    75.0%    97.5%
    ──────────  ───────  ───────  ───────  ───────  ───────
          y[1]  20.0342  20.1188  20.2135  20.2588  20.4188
          y[2]  20.1870  20.3178  20.3839  20.4466  20.5895
  
  
  julia> ys_pred = vec(mean(Array(group(predictions, :y)); dims = 1));
  
  julia> sum(abs2, ys_test - ys_pred) ≤ 0.1
  true