Hi @PharmCat, it seems worth opening an issue on GLM to discuss this, which will catch the attention of the developers. In particular, R and Julia seem to deal differently with subj 8: R drops it, while Julia estimates a value for it. This changes the residual degrees of freedom from 16 to 15. I don’t think that’s the full story though, as simply skipping subj 8 in the analysis does not realign the results.
I don’t think there is a function in Julia to obtain the residual variance (there isn’t one in R either, as your example shows).
A quick note on presentation: it’s best to surround code with a block of triple backticks, and to include the `using CSV, DataFrames, GLM, StatsModels` part of the code. Also, `CSV.read("12248_2014_9661_MOESM1_ESM.txt", delim = '\t')` is the preferred syntax for reading DataFrames today.
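For readers on a recent CSV.jl, here is a minimal sketch of the same read. As far as I know, newer releases expect the sink type as the second argument, so the call differs slightly from the one quoted above. The temporary file and its column names (`Subj`, `Seq`, `Var`) are hypothetical stand-ins so that the snippet runs on its own; they are not the layout of the real dataset.

```julia
using CSV, DataFrames

# Hypothetical stand-in for the real data file: a tiny tab-delimited table
# written to a temporary path so the snippet is self-contained.
path = tempname() * ".txt"
write(path, "Subj\tSeq\tVar\n1\tAB\t1.5\n2\tBA\t2.5\n")

# Recent CSV.jl versions take the sink type (here DataFrame) as the second argument.
df = CSV.read(path, DataFrame; delim = '\t')
```

With the real file, only the path would change.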
Hello! Thank you for the explanation! In this example the design matrix is singular, and R uses QR decomposition with pivoting to get the coefficients. Also, Subj is nested in sequence here, but R calculates the df without the nesting being specified explicitly. So, should I open an issue on GitHub?
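To illustrate the point about singular design matrices, here is a rough sketch using only Julia's standard library. The toy matrix `X` and response `y` are made up for illustration and have nothing to do with the dataset in this thread. It uses LAPACK's column-norm pivoting, whereas R's `lm` relies on its own modified routine with a more limited pivoting strategy, so the column that ends up dropped may differ between the two.

```julia
using LinearAlgebra

# Hypothetical toy design matrix: an intercept plus two dummy columns that
# sum to the intercept, so the columns are linearly dependent (rank 2).
X = [1.0 1.0 0.0;
     1.0 1.0 0.0;
     1.0 0.0 1.0;
     1.0 0.0 1.0]
y = [1.0, 2.0, 3.0, 5.0]

# Column-pivoted QR (ColumnNorm() needs Julia 1.7+; older versions use Val(true)).
F = qr(X, ColumnNorm())

# Estimate the numerical rank from the diagonal of R.
tol = maximum(size(X)) * eps(Float64) * abs(F.R[1, 1])
r = count(i -> abs(F.R[i, i]) > tol, 1:size(F.R, 1))

# Fit using only the first r pivoted columns; the dropped coefficient stays
# NaN, similar in spirit to the NA that R reports for an aliased term.
keep = F.p[1:r]
beta = fill(NaN, size(X, 2))
beta[keep] = X[:, keep] \ y

# Residual degrees of freedom depend on the rank, not on the column count.
dof_resid = size(X, 1) - r
```

The last line is the part relevant to the df discrepancy: whether a redundant column is dropped or estimated changes the rank that is subtracted from the number of observations.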
Yes, I think so. It would be useful in the issue to mention that this dataset is from a collection specifically invented to present edge cases for assessing the robustness of statistical software implementations.
You can get the residual variance, a.k.a. the dispersion parameter, using the unexported function `dispersion`: `GLM.dispersion(lmobj.model, true)`. We should probably export it (and it already has a docstring): please file an issue in GLM.
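A small sketch of that call on made-up data, in case it helps. The data frame below is hypothetical. The snippet also assumes that a formula-based `lm` returns a wrapper with a `.model` field, as in the call above; that may not hold in every GLM version.

```julia
using DataFrames, GLM

# Hypothetical toy data, only to have something to fit.
df = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0],
               y = [1.1, 1.9, 3.2, 3.9, 5.1])
lmobj = lm(@formula(y ~ x), df)

# Unexported helper; passing `true` asks for the squared value (the variance)
# rather than its square root.
s2 = GLM.dispersion(lmobj.model, true)

# Cross-check: for a linear model the deviance is the residual sum of squares.
s2_manual = deviance(lmobj) / dof_residual(lmobj)
```

The two values should agree, which is a quick way to confirm that the helper returns the residual sum of squares divided by the residual degrees of freedom.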