# Get MLE parameters (e.g., p-value, confidence intervals...) from a Turing model using Optim.jl

EDIT: There is now PR to add this feature to Turing

This might be a very silly question… but assuming the following simple linear model:

``````using Turing
using DataFrames
using LinearAlgebra

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0.6, 1, 1.4, 2, 2.8, 3, 3.3, 4, 4.6]

@model function lm(y, x)
# Set variance prior.
σ² ~ truncated(Normal(0, 100); lower=0)
intercept ~ Normal(0, sqrt(3))
coefficient ~ TDist(3)

# Calculate all the mu terms.
mu = intercept .+ x * coefficient
y ~ MvNormal(mu, σ² * LinearAlgebra.I)
end
``````

One can estimate the MLE parameters using Optim.jl:

``````using Optim

map_estimate = optimize(lm(y, x), MAP())
``````
``````ModeResult with maximized lp of -0.84
[0.01558797630961074, 0.025930064026905057, 0.4986792139399884]
``````

In the Turing examples, this is mentioned as a mean to find starting values. However, I was wondering if this feature could be used for other purposes, in particular to make non-Bayesian parameter estimations.

Is it possible to compute for this Turing model other indices traditionally associated with ML estimation, such as confidence intervals, p-values, etc.? Perhaps implementing a bootstrapping procedure? Thanks for any thoughts!

1 Like

If you wanted to get standard errors (which give you confidence intervals, t-stats, etc.) you can use the information matrix which we extended for this purpose. To get the maximum likelihood estimate, you can do

``````using Turing, Optim, StatsBase

# Estimate the MLE
opt = optimize(lm(y,x), MLE())

# Then, get the standard errors
infomat = stderror(opt)
``````

You can also get a standard statistical coefficient table using `coeftable(opt)`.

4 Likes

Also worth mentioning this post that showcases the use of `coeftable()` on the output of optimize.

For those interested, here is an attempt at getting a more complete version (with p-values and CIs via a z-approximation) (also because I have issues with coeftable doesn’t work)

``````function coeftable2(model)
mle = Optim.optimize(model, MLE())
params = DataFrame(
Name=coefnames(mle),
Coef=mle.values,
StdError=stderror(mle)
)
params[!, :z] = params[!, :Coef] ./ params[!, :StdError]
params[!, :p] = 2 * (1 .- cdf.(Normal(0, 1), abs.(params[!, :z])))
params[!, :CI_low] = params[!, :Coef] .+ quantile(Normal(0, 1), 0.025) .* params[!, :StdError]
params[!, :CI_high] = params[!, :Coef] .+ quantile(Normal(0, 1), 0.975) .* params[!, :StdError]
return params
end
``````

Which works like this:

``````using Turing
using DataFrames
using LinearAlgebra
using StatsBase
using Distributions

function generate_data(a, b, v, nr_samples)
x = float.(collect(1:nr_samples))
y = a .* x .+ b .+ randn(nr_samples) .* sqrt(v)
return x, y
end

x, y = generate_data(0.3, 25.0, 100.0, 40)

@model function model_lm(y, x)
# Set variance prior.
σ² ~ truncated(Normal(0, 100); lower=0)
intercept ~ Normal(0, sqrt(3))
coefficient ~ TDist(3)

# Calculate all the mu terms.
mu = intercept .+ x * coefficient
y ~ MvNormal(mu, σ² * LinearAlgebra.I)
end

# MLE
model = model_lm(y, x)
coeftable2(model)
``````
``````3×7 DataFrame
Row │ Name         Coef        StdError   z        p            CI_low    CI_high
│ Symbol       Float64     Float64    Float64  Float64      Float64   Float64
─────┼────────────────────────────────────────────────────────────────────────────────
1 │ σ²           118.147     26.4185    4.47214  7.74422e-6   66.3679   169.927
2 │ intercept     22.2225     3.50273   6.34435  2.23373e-10  15.3573    29.0878
3 │ coefficient    0.482927   0.148884  3.24364  0.00118011    0.19112    0.774734
``````

Update for those who stumble on this thread later – if you update to Turing 0.26.4, you can now call `StatsBase.coeftable(chain)`.

@DominiqueMakowski put out a lovely PR to fix this, please thank them!

3 Likes