Hello everyone,

I was wondering if any of the popular model selection criteria have been implemented in Turing? for example WAIC or LOO, etc? I am surprised that I can’t find mention of these in this forum or in the docs!

Thank you!

Hello everyone,

I was wondering if any of the popular model selection criteria have been implemented in Turing? for example WAIC or LOO, etc? I am surprised that I can’t find mention of these in this forum or in the docs!

Thank you!

1 Like

Turing’s support for model comparison could use some extending. If you’re interested in opening a PR over in the MCMCChains repo (which we would think of as being the place to store model comparison stuff) I would be happy to help out.

I know that Seth Axen recently extended Python’s excellent arviz package to work with Turing (and all MCMCChains compatible chain objects). arviz supports WAIC and some other stuff. The package is still a WIP so I don’t know how much functionality there is on the statistics side.

1 Like

I have LOO in my package feel free to copy pasta it, but really you could write a cleaner version in probably 10 minutes(I know I could)

edit - Ew python! Someone just write it in Julia!

I’ve been taking a look at @sethaxen 's work and it looks amazing (on behalf of Bayesians everywhere: thank you Seth! )

I see that ArviZ does indeed wrap some of the functions I was thinking of (`waic()`

and `loo()`

for example) It seems that I am missing a step.

when I follow the example from the ArviZ readme, and then try to run `waic(data)`

I get an error saying there is no log likelihood:

```
using ArviZ, PyPlot, Turing
ArviZ.use_style(["default", "arviz-darkgrid"])
# Turing model
@model school8(J, y, sigma) = begin
mu ~ Normal(0, 5)
tau ~ Truncated(Cauchy(0, 5), 0, Inf)
theta ~ Normal(mu, tau)
for j = 1:J
y[j] ~ Normal(theta, sigma[j])
end
end
J = 8
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]
model_fun = school8(J, y, sigma)
sampler = NUTS(0.8)
chn = mapreduce(c -> sample(model_fun, sampler, 1000), chainscat, 1:4) # 4 chains
data = convert_to_inference_data(chn) # convert MCMCChains.Chains to InferenceData
summary(data) # show summary statistics
waic(data)
loo(data)
```

specifically the error says `data must include log_likelihood in sample_stats`

.

I’d really appreciate any advice!

For LOO and WAIC to work, you need to save the log likelihood values of your individual samples. I don’t know Turing, but in Stan you would put something like this in the generated quantities block:

```
vector[J] log_likelihood;
for (j in 1:J) {
log_likelihood[j] = normal_lpdf(y[j] | theta, sigma[j]);
}
```

I would also suggest using LOO instead of WAIC as it has better diagnostics.

2 Likes

Thanks for the kind words and for using ArviZ.jl! It now supports all of ArviZ’s API and as of yesterday has some usage docs.

@daniel is right about `log_likelihood`

being a generated quantity in Stan. Turing doesn’t have this yet, though from what I understand, the Turing devs are working on it.

We had a discussion on the Julia slack workspace last week, and Jonas Mackerodt (@Jonas ?) shared this trick for getting the log likelihood and predictive values that I haven’t been able to try in detail yet. Perhaps it will be useful

```
using Turing
@model school8(J, y, sigma, gen=false) = begin
mu ~ Normal(0, 5)
tau ~ Truncated(Cauchy(0, 5), 0, Inf)
eta = tzeros(J)
eta ~ [Normal(0.0, 1.0)]
theta = mu .+ tau .* eta
if gen == true
y_hat = Vector{Real}(undef, J)
log_lik = Vector{Real}(undef, J)
for j = 1:J
dist = Normal(theta[j], sigma[j])
y[j] ~ dist
y_hat[j] = rand(dist)
log_lik[j] = logpdf(dist, y[j])
end
return (mu=mu, tau=tau, eta=eta, theta=theta, y_hat=y_hat, log_lik=log_lik)
end
end
@model school8_mv(J, y, sigma, gen=false) = begin
mu ~ Normal(0, 5)
tau ~ Truncated(Cauchy(0, 5), 0, Inf)
#eta = tzeros(J)
eta ~ MvNormal(fill(0.0,J), fill(1.0,J))
theta = mu .+ tau .* eta
if gen == true
y_hat = Vector{Real}(undef, J)
log_lik = Vector{Real}(undef, J)
for j = 1:J
dist = Normal(theta[j], sigma[j])
y[j] ~ dist
y_hat[j] = rand(dist)
log_lik[j] = logpdf(dist, y[j])
end
return (mu=mu, tau=tau, eta=eta, theta=theta, y_hat=y_hat, log_lik=log_lik)
end
end
J = 8;
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0];
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0];
model_fun = school8(J, y, sigma);
model_fun_mv = school8_mv(J, y, sigma);
#model_fun() # correctly draws a NamedTuple
chns = sample(model_fun, NUTS(2000,0.8), 4000) # lacks theta, y_hat, and log_lik
chns_mv = sample(model_fun_mv, NUTS(2000,0.8), 4000) # lacks theta, y_hat, and log_lik
mixeddensity(chns,[:mu, :tau])
mixeddensity!(chns_mv,[:mu, :tau])
function get_nlogp(model,sm)
# Set up the model call, sample from the prior.
vi = Turing.VarInfo(model)
# Define a function to optimize.
function nlogp(sm)
spl = Turing.SampleFromPrior()
new_vi = Turing.VarInfo(vi, spl, sm)
a=model(new_vi, spl)
return(a)
end
return nlogp(sm)
end
model_fun_ppc = school8(J, y, sigma, true)
model_fun_mv_ppc = school8_mv(J, y, sigma, true)
mu_post = Array(chns[:mu])
tau_post = Array(chns[:tau])
eta_post = Array(chns[:eta])
mu_post_mv = Array(chns_mv[:mu])
tau_post_mv = Array(chns_mv[:tau])
eta_post_mv = Array(chns_mv[:eta])
get_nlogp(model_fun_ppc,vcat(mu_post[1],tau_post[1],eta_post[1,:]))
get_nlogp(model_fun_mv_ppc,vcat(mu_post_mv[1],tau_post_mv[1],eta_post_mv[1,:]))
```

thanks Seth for a swift reply! I’m *very* happy to know how to get generated quantities out of a Turing model!

I wonder if obtaining the log-likelihood is only half the battle. I would much prefer to not have to create my own LOO-PSIS function, and rely instead upon ArviZ.

I’ve been trying this using the Stan interface. But to my surprise, even when I change the name of the generated quantity to `log_likelihood`

(which ArviZ seems to expect), `loo`

and `waic`

seem not to detect it

```
using CmdStan
using ArviZ
set_cmdstan_home!(homedir() * "/cmdstan/")
J = 8
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]
schools = [
"Choate",
"Deerfield",
"Phillips Andover",
"Phillips Exeter",
"Hotchkiss",
"Lawrenceville",
"St. Paul's",
"Mt. Hermon"
];
schools_code = """
data {
int<lower=0> J;
real y[J];
real<lower=0> sigma[J];
}
parameters {
real mu;
real<lower=0> tau;
real theta[J];
}
model {
mu ~ normal(0, 5);
tau ~ cauchy(0, 5);
theta ~ normal(mu, tau);
y ~ normal(theta, sigma);
}
generated quantities {
vector[J] log_likelihood;
vector[J] y_hat;
for (j in 1:J) {
log_likelihood[j] = normal_lpdf(y[j] | theta[j], sigma[j]);
y_hat[j] = normal_rng(theta[j], sigma[j]);
}
}
"""
schools_dat = Dict("J" => J, "y" => y, "sigma" => sigma)
stan_model = Stanmodel(
model = schools_code,
nchains = 4,
num_warmup = 1000,
num_samples = 1000,
)
_, stan_chns, _ = stan(stan_model, schools_dat, summary = false);
stan_infdata = convert_to_inference_data(stan_chns)
waic(stan_infdata)
```

This last line gives the same error as previously. Is there something obvious that I’m not getting?

You’re quite welcome!

`convert_to_inference_data`

is a pretty feature-sparse conversion of a data type to an `InferenceData`

. In this case, it takes the data in `stan_chns`

and converts it. Stan’s output doesn’t differentiate between generated quantities and parameters and moreover between log likelihood and posterior predictions. If you want generated quantities to be treated specially in ArviZ, you’ll want to use a specialized conversion function.

In ArviZ, these start with `from_`

. `from_mcmcchains`

or `from_cmdstan`

will work for you. Check out the worked example using CmdStan on 8 schools in the quickstart for the signature.

1 Like