DataFrame column names into GLM as variable names

I would like to use GLM to estimate 500 regressions like

beta[1] = coef(lm(@formula(AAPL ~ SP500), Stocks))[2]
beta[2] = coef(lm(@formula(AMGN ~ SP500), Stocks))[2]
…
beta[i] = coef(lm(@formula(col_i ~ SP500), Stocks))[2]

where Stocks is a DataFrame with 501 columns. Column names are symbols, but I believe that GLM does not accept symbols (or strings) as variable names.

How can I transform my vector of 500 column names (symbols) so that GLM accepts them as variable names?

Thanks in advance.

Sounds a bit like

Anyway, you want:

https://juliastats.org/StatsModels.jl/stable/formula/#Constructing-a-formula-programmatically

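i.e. [coef(lm(term(i) ~ term(:SP500), Stocks))[2] for i in names(Stocks, Not(:SP500))]

(outside of @formula, the right-hand side also has to be wrapped in term, since a bare SP500 would be looked up as an ordinary Julia variable)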
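For anyone who wants to try this without the real data, here is a minimal sketch of the same idea on a toy table. The data are made up (two simulated stocks with known slopes of 1.5 and 0.7), and I am assuming that `term` is in scope via GLM, which as far as I know re-exports StatsModels; if it is not, add `using StatsModels`.

```julia
using DataFrames, GLM, Random

Random.seed!(1)
n = 200
market = randn(n)

# Hypothetical toy data standing in for the real 501-column Stocks table:
# each stock is a known multiple of the market return plus a little noise.
Stocks = DataFrame(
    SP500 = market,
    AAPL  = 1.5 .* market .+ 0.1 .* randn(n),
    AMGN  = 0.7 .* market .+ 0.1 .* randn(n),
)

# All column names except the regressor (a Vector{String} in recent DataFrames).
tickers = names(Stocks, Not(:SP500))

# Build each formula programmatically; term accepts a String or a Symbol.
# lm adds the intercept implicitly, so coefficient 2 is the slope on SP500.
betas = Dict(t => coef(lm(term(t) ~ term(:SP500), Stocks))[2] for t in tickers)
```

Keeping the result in a Dict keyed by column name makes it harder to mix up which beta belongs to which stock than with a plain vector; `betas["AAPL"]` should come out close to the simulated slope.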


Perfect! Thank you very much