Inconsistent ANOVA results between R and Julia

Hello everyone, I’m a new Julia user coming from R/Python. I’m trying to run an ANOVA in Julia, but I get a different result than in R.

The test data is extracted from Jamovi: test_data.csv - Pastebin.com

The R code is:

data <- read.csv('./data/test_data.csv')
data$cond <- as.factor(data$cond)
data$subj <- as.factor(data$subj)
anova(
  lm(formula = y ~ subj + cond + subj*cond, data = data)
)

The result is:

Analysis of Variance Table

Response: y
            Df Sum Sq Mean Sq F value    Pr(>F)
subj        49  13872  283.10 22.1259 < 2.2e-16 ***
cond         1    706  705.54 55.1421 1.466e-13 ***
subj:cond   49   3827   78.11  6.1046 < 2.2e-16 ***
Residuals 2900  37105   12.79
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Both Jamovi and plain R return the same result.

But when I run the ANOVA in Julia with the following code:

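using StatsModels, DataFrames, CSV, GLM, ANOVA
data = CSV.read("./data/test_data.csv")
# treat both predictors as categorical factors
categorical!(data, :cond)
categorical!(data, :subj)
# no contrasts given, so the default coding is used
anova(lm(@formula(y ~ cond + subj + cond*subj), data))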

It returns a quite different result:

4×6 DataFrame
│ Row │ Source      │ DF        │ SS        │ MSS       │ F         │ p           │
│     │ String      │ Abstract… │ Abstract… │ Abstract… │ Abstract… │ Abstract…   │
├─────┼─────────────┼───────────┼───────────┼───────────┼───────────┼─────────────┤
│ 1   │ cond        │ 1.0       │ 0.661981  │ 0.661981  │ 0.0517376 │ 0.820083    │
│ 2   │ subj        │ 49.0      │ 7980.08   │ 162.859   │ 12.7283   │ 9.31968e-90 │
│ 3   │ cond & subj │ 49.0      │ 3827.32   │ 78.1085   │ 6.10462   │ 2.22324e-35 │
│ 4   │ Residuals   │ 2900.0    │ 37105.4   │ 12.795    │ 0.0       │ 0.0         │

This is quite confusing, and I have no idea what is wrong with my code. I would be grateful for any help :grinning:.

The ANOVA README states:

Important: Make sure to use EffectsCoding on all your predictors, or results won’t be meaningful.

Did you try this?
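To see what that coding changes, here is a minimal sketch (my own illustration, not taken from the ANOVA README) that compares the contrast matrices StatsModels builds for a factor under the two codings. The three level names are hypothetical and only stand in for the real levels of cond or subj.

```julia
using StatsModels

# Hypothetical factor levels, for illustration only; the first one is the base level.
lvls = ["a", "b", "c"]

# DummyCoding (treatment contrasts) is what you get when no contrasts are passed.
# Rows are the levels a, b, c: [0 0; 1 0; 0 1]
dummy = StatsModels.ContrastsMatrix(DummyCoding(), lvls).matrix

# EffectsCoding (sum-to-zero contrasts) is what the README asks for.
# Rows are the levels a, b, c: [-1 -1; 1 0; 0 1]
effects = StatsModels.ContrastsMatrix(EffectsCoding(), lvls).matrix

println(dummy)
println(effects)

# Effects-coded columns sum to zero; dummy-coded columns do not.
println(sum(effects, dims=1))
println(sum(dummy, dims=1))
```

With DummyCoding the base level is a row of zeros, so in a model with an interaction the cond term is, as far as I understand it, evaluated at the base level of subj rather than averaged over all subjects. That would explain the tiny sum of squares for cond in your table. With EffectsCoding the columns sum to zero and the main effects are measured against the overall mean.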

Oops, sorry, I didn’t follow the correct tutorial… :anguished:

The modified command is:

anova(fit(LinearModel, @formula(y ~ cond + subj + cond*subj), data,
          contrasts = Dict(:cond => EffectsCoding(), :subj => EffectsCoding())))

The result is:

4×6 DataFrame
│ Row │ Source      │ DF        │ SS        │ MSS       │ F         │ p           │
│     │ String      │ Abstract… │ Abstract… │ Abstract… │ Abstract… │ Abstract…   │
├─────┼─────────────┼───────────┼───────────┼───────────┼───────────┼─────────────┤
│ 1   │ cond        │ 1.0       │ 705.541   │ 705.541   │ 55.1421   │ 1.46615e-13 │
│ 2   │ subj        │ 49.0      │ 13871.9   │ 283.1     │ 22.1259   │ 1.0026e-162 │
│ 3   │ cond & subj │ 49.0      │ 3827.32   │ 78.1085   │ 6.10462   │ 2.22324e-35 │
│ 4   │ Residuals   │ 2900.0    │ 37105.4   │ 12.795    │ 0.0       │ 0.0         │

Everything looks right now, thanks for your kind help!
