How to define a formula with a vector of string names for GLM.jl?

I have a DataFrame that I want to fit a GLM to, and I would like to build the formula (with hundreds of variables) programmatically instead of writing it out by hand. The best I can think of is to go through eval. Is there a better way?

vars = ["x1", "x2", "x3"]

terms = eval(Meta.parse("@formula(y ~ " * join(vars, " + ") * ")"))

glm(terms, data, Normal())  # data is positional and glm needs a family; Normal() is only a placeholder here

See "Constructing a formula programmatically" in the StatsModels.jl docs.

Not 100% sure it works with strings, but it definitely works with Symbols.
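For what it's worth, here is a minimal sketch of the Symbol route, assuming StatsModels.jl is loaded (GLM.jl re-exports it) and that the response column is called y as in the question. The names response, predictors and f are just illustrative:

```julia
using StatsModels

vars = ["x1", "x2", "x3"]

# Convert the strings to Symbols, then wrap each one in a term.
response   = term(:y)
predictors = term.(Symbol.(vars))

# `+` combines terms and `~` builds the formula, no macro or eval needed.
f = response ~ foldl(+, predictors)
```

The resulting f can be passed anywhere a formula built with @formula would go.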


If you have a string but need a symbol use:

julia> Symbol("x")
:x

See my answer here: Using all independent variables with @formula in a multiple linear model - #2 by nilshg

Basically what the preceding posts say: you want to construct Term objects from symbols. In the other thread I do this for all column names, names(df), but that easily generalizes to a vector holding whatever subset of columns you're interested in.
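Putting that together, a rough end-to-end sketch might look like the following. The toy data, the column names, and the choice of Normal() with IdentityLink() are all assumptions made up for illustration; swap in whatever family and link your model actually needs:

```julia
using DataFrames, GLM

# Toy data, invented for illustration only.
data = DataFrame(
    y  = [1.2, 1.9, 3.4, 3.8, 5.1, 5.7, 7.3, 7.9],
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],
    x3 = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
)

# Every column except the response; names(data) returns strings.
vars = setdiff(names(data), ["y"])

# Build the formula from Term objects instead of parsing a string.
f = term(:y) ~ foldl(+, term.(Symbol.(vars)))

# Family and link are assumptions; the intercept is added automatically.
model = glm(f, data, Normal(), IdentityLink())
```

Replacing the setdiff line with any hand-picked vector of column names gives the subset version.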
