I’m not very familiar with StatsModels, so I don’t have a complete answer, but my expectation is that the equality comparison is not (solely) comparing the structural similarity of the models, and that random effects terms from two different formulas can’t be guaranteed equal (they are random, after all) even if they are structurally similar.
What are you trying to achieve by comparing formulas?
Thanks for your reply!
I am running a simulation with many formulas stored in a vector.
I want to delete the true model from the candidates if it is among them.
P.S. I could do this by converting each formula to a string, but I asked because the behavior is not intuitive ^^.
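For reference, the string-based workaround mentioned above could look roughly like this. It is only a sketch: the candidate formulas and the helper name same_formula are made up for illustration, and it assumes that two formulas written identically print identically.

```julia
using StatsModels

# Hypothetical candidates; in the real simulation these come from the stored vector.
candidates = [
    @formula(y ~ 1 + x),
    @formula(y ~ 1 + x + (1 | z3)),
]
true_model = @formula(y ~ 1 + x + (1 | z3))

# Hypothetical helper: compare the printed form instead of relying on == for formulas.
same_formula(a, b) = string(a) == string(b)

# Keep only the candidates that do not print the same as the true model.
remaining = filter(f -> !same_formula(f, true_model), candidates)
```

This compares the text of the formulas only, so two formulas that are equivalent but written differently (for example, with terms in another order) would still count as different.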
Are the formulas generated or manually created? If the latter, it might be more convenient to store them in something with a named-index kind of interface (e.g. Dict), with a meaningful name for each formula.
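A minimal sketch of the Dict idea, with names and formulas invented purely for illustration: the true model is dropped by its name, so no formula comparison is needed at all.

```julia
using StatsModels

# Hypothetical names and formulas, for illustration only.
models = Dict(
    "fixed_only" => @formula(y ~ 1 + x),
    "random_intercept" => @formula(y ~ 1 + x + (1 | z3)),
)

# Name of the model that generated the simulated data (assumed known).
true_name = "random_intercept"

# Drop the true model by key; the formulas themselves are never compared.
candidates = filter(p -> first(p) != true_name, models)
```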
I agree that it’s not very intuitive, and based on that my gut impression is that there might be better/more appropriate approaches than trying to directly compare formulas. Are you comparing nested models and/or doing a stepwise regression (e.g. with a likelihood ratio test)?
using StatsModels

f1 = @formula(y ~ 1 + x + (1 | z3))
dump(f1)
f2 = @formula(y ~ 1 + x + (1 | z3))
dump(f2)
It looks like the FunctionTerm is different in the two formulas.
The documentation (API documentation · StatsModels.jl) indicates that the fanon field of the FunctionTerm is a generated anonymous function. So I would guess that since the formulas are created twice, the FunctionTerms are generated twice too, and the comparison is then unable to detect that they are the same.
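The same effect can be reproduced in plain Julia, without StatsModels. This is only an illustration of the guess above, not the actual code the macro generates; the names g1 and g2 are made up.

```julia
# Two anonymous functions with identical source text.
g1 = z -> 1 | z
g2 = z -> 1 | z

# Each literal defines its own function type, and the fallback == is ===,
# so the two functions do not compare equal.
println(g1 == g2)        # false

# They still behave identically when called.
println(g1(2) == g2(2))  # true
```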
Yeah, the problem is that (1 | z3) is interpreted as a call to a custom function. It’s only after MixedModels calls apply_schema that such terms are detected as random effects terms.
@dave.f.kleinschmidt Maybe == should ignore the anonymous function fields and compare only the syntax?
Yeah, that’s what should happen… the method for the FunctionTerm itself is actually there already, but the PR for general == of terms got bogged down, so the formula term is checking ===. I’ll have a go at updating that PR (the person who opened it seems to have deleted their GitHub account, so I’ll have to open a new one): https://github.com/JuliaStats/StatsModels.jl/pull/241