I am trying to extract the model formula from a fitted GLM object, with little success. In the dummy example below, I would like to get the following back, for example: Y ~ X
[i.e. the information stored inside @formula( … )]
using DataFrames, GLM, Distributions
# Simulate some fake data
data = DataFrame(X=[1,2,3,4,5,6,7,8,9,10], Y=[2,4,7,11,10,6,4,5,7,8])
# Fit GLM
m1 = fit(GeneralizedLinearModel,
         @formula(Y ~ X),
         data,
         Normal(),
         GLM.IdentityLink())
I would like to store the formula in another variable (e.g. mod_formula = Y ~ X), and then be able to pass this to another model programmatically (e.g. lm1 = fit(LinearModel, @formula(mod_formula))).
You can do formula(m1).
Note that you can also just store the formula as an object, e.g. f = @formula(y ~ x), and pass f to fit.
You can also construct a formula programmatically. See here.
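As a minimal sketch of the first two suggestions, using the data from the question: the formula is stored once in a variable, passed to the GLM fit, read back from the fitted model with formula(m1), and then reused for a linear model. The variable names f, f1 and lm1 are only illustrative.

```julia
using DataFrames, GLM, Distributions

# Same fake data as in the question
data = DataFrame(X = 1:10, Y = [2, 4, 7, 11, 10, 6, 4, 5, 7, 8])

# Store the formula once, as an ordinary object
f = @formula(Y ~ X)

# Pass the stored formula to the GLM fit
m1 = fit(GeneralizedLinearModel, f, data, Normal(), GLM.IdentityLink())

# Recover the formula from the fitted model
f1 = formula(m1)

# Reuse the stored formula for another model; no @formula call is needed here
lm1 = fit(LinearModel, f, data)
```

Because f is already a formula object, it is passed to fit directly rather than wrapped in @formula again.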
To expand a bit on @pdeffebach's answer: the best way to approach this is going to depend on what you want to do with that formula. If you want to fit another model to the same data (or to data with the same structure, e.g. all the columns have the same types and the same unique values for categorical columns), then using f1 = formula(m1) is great. However, formula(m1) gives you the formula after it has been matched to the original data, so if you have different types of data or different levels of categorical values, you'll want to generate the formula first and then hold onto the "non-concrete" form, like
f = @formula(Y ~ X)
m1 = fit(..., f, data, ...)
m2 = fit(..., f, data2, ...)
…or construct it programmatically, like
f = term(:Y) ~ term(:X)
...
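A rough sketch of the programmatic route, for the case where the column names are only known at run time. The names lhs_name and rhs_names are hypothetical, and the sketch assumes term is available after using GLM (through its re-export of StatsModels); if it is not, add using StatsModels.

```julia
using DataFrames, GLM

# Same fake data as in the question
data = DataFrame(X = 1:10, Y = [2, 4, 7, 11, 10, 6, 4, 5, 7, 8])

# Column names held as plain symbols (hypothetical variable names)
lhs_name = :Y
rhs_names = [:X]

# Build one term per predictor and join them with +;
# with a single predictor this is just term(:Y) ~ term(:X)
f = term(lhs_name) ~ foldl(+, term.(rhs_names))

# The constructed formula is passed to fit like any stored formula
lm1 = fit(LinearModel, f, data)
```

Adding more symbols to rhs_names extends the right-hand side without touching the rest of the code.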