I have a DataFrame and need to build a model where the predictors follow some naming scheme.
For the example data below, suppose the scheme is “every column whose name is not equal to y”:
using DataFrames
using GLM
data = DataFrame(y=[22.1, 20.1, 7.1, 9.1, 1000, 200],
                 x1=[1.1, 2.1, 3.1, 4.1, 10, 100.2],
                 x2=[1, 2, 3, 4.0, 11.2, 100.1])
Can anyone suggest a modification to Ex. 2 below that would make the model in Ex. 2 (ols2) equivalent to the one in Ex. 1 (ols1)?
Note that ols2 does not run as written; I’m looking for something of comparable terseness, if possible.
Ex. 1:
ols1 = GLM.lm(@formula(y ~ x1 + x2), data)
y ~ 1 + x1 + x2
Coefficients:
────────────────────────────────────────────────────────────────────────────
                Estimate  Std. Error   t value  Pr(>|t|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)      84.4784     4.76541   17.7274    0.0004    69.3127     99.644
x1             -745.249      8.33275  -89.4361     <1e-5  -771.767    -718.73
x2              747.144      8.34614   89.5196     <1e-5   720.582    773.705
────────────────────────────────────────────────────────────────────────────
Ex. 2:
preds = Symbol.(names(data)[findall(names(data) .!= "y")])
2-element Array{Symbol,1}:
 :x1
 :x2
ols2 = GLM.lm(@formula(y ~ preds), data)
ERROR: type NamedTuple has no field preds
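For reference, the closest I have come is building the formula programmatically from Term objects, as sketched below. This is only a sketch based on StatsModels’ term/~ API (I import StatsModels explicitly since I’m not sure whether GLM re-exports term), and I’m not certain it is the idiomatic or tersest route, hence the question.

using StatsModels  # for term(); may already be available via GLM

# Combine the predictor symbols into the right-hand side term(:x1) + term(:x2),
# then build the equivalent of @formula(y ~ x1 + x2) without the macro.
rhs = sum(term.(preds))
fm = term(:y) ~ rhs
ols2 = GLM.lm(fm, data)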