Hi all, I’m not sure about the meaning of GLM.jl’s pretty-printed model output:

julia> model
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}
Base power mean (W) ~ 1 + mnemonic + APSR (s flag) + Is conditional + Barrel shift amount + Has barrel shift + Has immediate operand + mnemonic & Binary weight + mnemonic & Dest reg == source reg + Barrel shift amount & Has barrel shift
Coefficients:
────────────────────────────────────────────────────────────────────────────────────
                      Coef.   Std. Error        t  Pr(>|t|)    Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────────────────
(Intercept)       0.0920357   0.00542503    16.96    <1e-63    0.0814018    0.10267
mnemonic: add    -0.0673919   0.00549261   -12.27    <1e-33   -0.0781583   -0.0566255
mnemonic: and   -0.00238934   0.00649065    -0.37    0.7128   -0.015112     0.0103334
mnemonic: asr   -0.00307265   0.0124812     -0.25    0.8055   -0.0275377    0.0213924
mnemonic: b       0.0926901   0.00876247    10.58    <1e-25    0.0755143    0.109866
mnemonic: bfc   -0.00570581   0.008104      -0.70    0.4814   -0.0215909    0.0101793
mnemonic: bfi   -0.00730015   0.00703384    -1.04    0.2994   -0.0210876    0.00648728
mnemonic: bic   -0.00183137   0.00701267    -0.26    0.7940   -0.0155773    0.0119146
mnemonic: bl       0.150185   0.00876532    17.13    <1e-64    0.133004     0.167367

Unfortunately, the package documentation is practically non-existent. What are the null and alternative hypotheses for the t-tests in the output? Is the null hypothesis β = 0 or β ≠ 0? What I’m really interested in is: does the Pr(>|t|) column express the probability of the coefficient being null, or not null? Thanks!

The null hypothesis is that the coefficient is zero. This is standard in regression models so you should be able to find lots of references.

(Note that this is not the probability that the coefficient is nonzero: as usual in frequentist statistics, the p-value is defined as the probability of getting a coefficient with an absolute value at least as large as the one reported here, assuming the null hypothesis is true.)

Thanks! Let me see if I got it right (my statistics knowledge is a bit rusty): the p-value is the probability that the test statistic exceeds t purely by chance. I don’t remember exactly how t and the test statistic are calculated, but in this API:

Pr(>|t|) is the p-value, that is, the probability of a Type I error

the lower the Pr(>|t|), the lower the probability of error.

assuming the usual significance level of 0.05 (95% confidence), we can “accept” the coefficients with Pr(>|t|) lower than 0.05 as valid (i.e., different from 0, i.e., rejecting the null hypothesis)

The t-tests are Wald tests on the coefficients, with null hypothesis that the coefficient is equal to zero (as @nalimilan stated).

Pr(>|t|) is “just” the p-value (computed from the test statistic in the usual way).
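Concretely, the t statistic is the coefficient estimate divided by its standard error, and the two-sided p-value is Pr(>|t|) = 2·P(T ≥ |t|) with T following a t distribution on the residual degrees of freedom. A minimal sketch using Distributions.jl — the estimate, standard error, and degrees of freedom below are made-up illustration values, not taken from the model above:

```julia
using Distributions

# Made-up illustration values: a coefficient estimate, its standard
# error, and the residual degrees of freedom (n - number of parameters).
beta_hat = 0.15
se = 0.05
dof = 100

t = beta_hat / se                   # the t statistic: estimate / std. error
p = 2 * ccdf(TDist(dof), abs(t))    # two-sided Pr(>|t|)
```

Here `ccdf` is the complementary CDF, so `2 * ccdf(TDist(dof), abs(t))` is exactly the two-sided tail probability.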

The p-value is the probability, under the null hypothesis, of seeing a test statistic at least as extreme (hence Pr(>|t|)) as the observed one. It is not the probability of any type of error or of any hypothesis: that type of statement isn’t possible under the frequentist perspective. In the long run (as test replications go to infinity), you should see p < 0.05 in only about 5% of tests, assuming the null hypothesis is true. Note that this is a statement about long-run frequencies, not about any individual test or hypothesis. Moreover, it’s a statement under the null hypothesis, which is “challenging” for (at least) two reasons:

If you declare a result significant, then you reject the null hypothesis, so why are you talking about statements made under the assumption of the null hypothesis?

In many application domains, true effects are rarely exactly zero, so with sufficient power you’ll almost always get a “significant” result, i.e. be able to reject the null hypothesis. But that doesn’t tell you much about the practical significance (as opposed to statistical significance) of the result. This is rapidly veering off a Julia-specific question and into general stats territory, for which I would recommend other fora (e.g. CrossValidated).

(And all of this is ignoring questions about whether the statistical model is correctly specified, has its assumptions met, etc.)
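To make the long-run-frequency point concrete, here is a small simulation sketch: fit many regressions on data where the true slope really is zero, and count how often p < 0.05. The sample size, replication count, and seed are arbitrary choices, and the OLS algebra is hand-rolled for self-containedness rather than calling GLM.jl:

```julia
using Distributions, Random, Statistics

# Repeatedly regress pure noise on pure noise (so H0: slope = 0 is true)
# and record how often the two-sided p-value falls below α.
function null_rejection_rate(; n = 50, reps = 2000, α = 0.05)
    dof = n - 2
    hits = 0
    for _ in 1:reps
        x = randn(n)
        y = randn(n)                          # independent of x ⇒ H0 holds
        xc = x .- mean(x)
        bhat = sum(xc .* y) / sum(xc .^ 2)    # OLS slope estimate
        resid = y .- mean(y) .- bhat .* xc    # residuals of the fitted line
        se = sqrt(sum(resid .^ 2) / dof / sum(xc .^ 2))
        t = bhat / se
        p = 2 * ccdf(TDist(dof), abs(t))      # Pr(>|t|)
        hits += p < α
    end
    hits / reps
end

Random.seed!(1)
rate = null_rejection_rate()    # should come out near 0.05
```

The rejection rate hovers around 5% across the 2000 replications: that long-run frequency, not any statement about a single fitted model, is what the 0.05 threshold controls.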

Thanks! You’re right, this is going wildly off topic, I apologize. Just one thing:

Isn’t this the same as a Type I error? I mean, supposing I got 0.05 as the p-value, doesn’t that mean there is a 0.05 probability that H0 is true and yet I see data that are statistically significant, i.e. that lead me to reject H0? They are defined differently, but doesn’t the first definition imply the second…?

Not quite. Your significance threshold sets your nominal Type-I error rate, but it doesn’t tell you anything about the probability of having a Type-I error for a particular test. The general rule of thumb to understand this weirdness is: frequentists make probability statements about long-run frequencies, but any particular analysis is a single thing and not a repeated thing. You need repetitions to have long-run frequencies. Thus a single analysis doesn’t have an associated frequency and doesn’t have an associated probability.