Using GLM programmatically

I would like to run linear regressions in a loop, with the response and predictor names stored in variables. For example,

using DataFrames, GLM

data = DataFrame(X=[1,2,3], Y=[2,4,7])
var1 = :Y 
var2 = :X 
ols = lm(@formula($var1 ~ $var2), data)

I tried various approaches including the one proposed here, which no longer works.

Just add @eval in front

ols = lm(@eval(@formula($var1 ~ $var2)), data)

Thank you! I tried something similar:

ols = lm(@eval @formula($var1 ~ $var2) , data)

I didn’t realize the parentheses were necessary.

Ah, if you don’t add the parentheses, then @eval @formula($var1 ~ $var2) , data doesn’t make sense for @eval, because the , data part is captured by @eval as well.

Thanks for the explanation. At some point I should really learn how macros work.

Just to add that the recommended approach to programmatically generating formula terms is by constructing the Term objects directly, as referenced in the StatsModels API here: Modeling tabular data · StatsModels.jl
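
A minimal sketch of that approach, reusing the data from the question. It assumes that `term` is in scope through GLM's re-export of StatsModels, and it spells out the intercept with `term(1)` rather than relying on the default. Unlike `@eval`, which always evaluates in global scope, this also works when the variable names are local to a function. The loop runs over a single predictor only because the example data has one.

```julia
using DataFrames, GLM  # assumption: GLM re-exports StatsModels, which provides `term`

data = DataFrame(X=[1, 2, 3], Y=[2, 4, 7])
var1 = :Y
var2 = :X

# term(::Symbol) builds a Term and term(1) an intercept term;
# `~` and `+` combine them into a formula without using the @formula macro.
ols = lm(term(var1) ~ term(1) + term(var2), data)

# The same construction inside a loop, one regression per predictor name.
for rhs in [:X]
    model = lm(term(:Y) ~ term(1) + term(rhs), data)
    println(rhs, " => ", coef(model))
end
```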
