Comparison formula with random effect term [StatsModels.jl]

Yonghee_Lee · September 20, 2021, 5:18pm

The following comparison between two formulas with a random effect term for MixedModels.jl gives false:

@formula(y ~ 1 + x + (1 | z3) ) == @formula(y ~ 1 + x + (1 | z3) )

However, comparing two formulas without a random effect term gives true, as expected.

@formula(y ~ 1 + x ) == @formula(y ~ 1 + x )

Is there any solution for this?

halleysfifthinc · September 20, 2021, 5:33pm

I’m not very familiar with StatsModels, so I don’t have a complete answer, but my expectation is that the equality comparison is not (solely) comparing the structural similarity of the models, and that random effects terms from 2 different formulas can’t be guaranteed equal (they are random, after all) even if they are structurally similar.

What are you trying to achieve by comparing formulas?

Yonghee_Lee · September 20, 2021, 5:50pm

Thanks for your reply!
I am running a simulation with many formulas stored in a vector.
I want to delete the true model from the candidates if it is among them.

PS. I could do this by converting the formulas to strings, but I asked because the behavior is not intuitive ^^.
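
The string workaround mentioned in the PS could look roughly like the sketch below. It assumes that the printed form of a formula depends only on its syntax, which seems to hold but is not something the thread confirms. The names `true_model`, `candidates`, and `remaining` are made up for illustration.

```julia
using StatsModels

# Hypothetical true model and candidate set, for illustration only.
true_model = @formula(y ~ 1 + x + (1 | z3))
candidates = [
    @formula(y ~ 1 + x),
    @formula(y ~ 1 + x + (1 | z3)),
    @formula(y ~ 1 + (1 | z3)),
]

# Compare the printed representations instead of the formula objects,
# so the comparison does not depend on how == treats the (1 | z3) term.
remaining = filter(f -> string(f) != string(true_model), candidates)
```

Here `remaining` should hold only the candidates whose printed form differs from that of the true model.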

halleysfifthinc · September 20, 2021, 6:25pm

Are the formulas generated or manually created? If the latter, it might be more convenient to store them in something with a named-index kind of interface (e.g. Dict) with meaningful names for each formula.

I agree that it’s not very intuitive, and based on that my gut impression is that there might be better/more appropriate approaches than trying to directly compare formulas. Are you comparing nested models and/or doing a stepwise regression (e.g. with a likelihood ratio test)?

Yonghee_Lee · September 20, 2021, 7:39pm

Thanks a lot.

I am now trying to use a Dict or a Tuple to make it work ^^.
The simulation is about model averaging in mixed models.

jzr · September 20, 2021, 7:41pm

I think that comparison should return true. I would file an issue at MixedModels.jl.

Eric · September 20, 2021, 11:13pm

Executing:

f1 = @formula(y ~ 1 + x + (1 | z3) )
dump(f1)
f2 = @formula(y ~ 1 + x + (1 | z3) )
dump(f2)

It looks like the FunctionTerm is different in the two formulas.
The StatsModels.jl API documentation indicates that the fanon of a FunctionTerm is a generated anonymous function. So I would guess that since the formulas are created twice, the FunctionTerms are generated twice too, and then the comparison is unable to detect that they are the same.
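
The guess above matches how anonymous functions behave in plain Julia. The following is a minimal sketch using only Base, not the StatsModels internals, and the names `f` and `g` are illustrative: two anonymous functions written with the same body are distinct objects, while the quoted expressions they come from compare equal by structure.

```julia
# Two anonymous functions with identical bodies are still distinct objects.
f = x -> x + 1
g = x -> x + 1
println(f == g)   # false: for functions, == falls back to ===
println(f == f)   # true: the very same object

# The quoted expressions, by contrast, are compared structurally.
println(:(1 | z3) == :(1 | z3))   # true
```

So a comparison that looks only at the original expression, and not at the generated function, would treat the two `(1 | z3)` terms as equal.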

nalimilan · September 21, 2021, 7:55am

Yeah the problem is that (1 | z3) is interpreted as being a call to a custom function. It’s only after MixedModels calls apply_schema that these are detected as being a random effects term.

@dave.f.kleinschmidt Maybe == should ignore anonymous function fields and only compare the syntax?

dave.f.kleinschmidt · September 21, 2021, 2:42pm

Yeah, that’s what should happen… the method for the FunctionTerm itself is actually there already, but the PR for general == of terms got bogged down, so the formula term is checking ===. I’ll have a go at updating that PR (the person who opened it seems to have deleted their GitHub account, so I’ll have to open a new one): https://github.com/JuliaStats/StatsModels.jl/pull/241

Yonghee_Lee · September 21, 2021, 3:05pm

Thanks so much

dave.f.kleinschmidt · September 21, 2021, 3:07pm

Thanks for the nudge, I’d forgotten about that PR.
