Regression with variables in array

I apologize for reheating this old topic, but how do I deal with this when both variables come from a list? I tried using term, but could not get it to work. The only way I was able to get meaningful output was by using eval.

I am trying to use RegressionTables.jl to turn the output into a format that I can export, but to no avail so far.

for i in list
    for j in list_2
        calc = reg(DF1, @eval @formula($i ~ $j))
        print(calc)
    end
end

It’s not clear to me what list and list_2 are, and the fact that you are calling reg suggests that you are using some additional package (although as long as it relies on StatsModels you’ll be fine).

The case where both the left-hand side and the right-hand side variables come from some sort of iterator is a straightforward extension of what I posted above:

julia> using DataFrames, GLM

julia> df = DataFrame([:y1, :y2, :x1, :x2, :x3] .=> eachcol(rand(3, 5)))
3×5 DataFrame
 Row │ y1        y2        x1        x2         x3       
     │ Float64   Float64   Float64   Float64    Float64  
─────┼───────────────────────────────────────────────────
   1 │ 0.406401  0.974452  0.57843   0.553727   0.816638
   2 │ 0.429666  0.873084  0.808852  0.0404561  0.728702
   3 │ 0.998661  0.865624  0.269401  0.6146     0.630912

julia> lhs = ["y1", "y2"]; rhs = ["x1", "x2", "x3"];

julia> for y ∈ lhs
           for x ∈ rhs
               println(lm(term(y) ~ term(x), df))
           end
       end
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y1 ~ 1 + x1

Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)   1.22034    0.336678   3.62    0.1714   -3.05755    5.49824
x1           -1.10239    0.566025  -1.95    0.3020   -8.29441    6.08964
────────────────────────────────────────────────────────────────────────
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y1 ~ 1 + x2

Coefficients:
───────────────────────────────────────────────────────────────────────
                Coef.  Std. Error     t  Pr(>|t|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────
(Intercept)  0.374734    0.423848  0.88    0.5391   -5.01076    5.76023
x2           0.587803    0.886367  0.66    0.6272  -10.6746    11.8502
───────────────────────────────────────────────────────────────────────
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y1 ~ 1 + x3

Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)   2.96036     1.16501   2.54    0.2387   -11.8425    17.7632
x3           -3.23784     1.59728  -2.03    0.2918   -23.5332    17.0575
────────────────────────────────────────────────────────────────────────
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y2 ~ 1 + x1

Coefficients:
────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error     t  Pr(>|t|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)  0.886426     0.132183  6.71    0.0942  -0.793119    2.56597
x1           0.0325252    0.222227  0.15    0.9075  -2.79114     2.85619
────────────────────────────────────────────────────────────────────────
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y2 ~ 1 + x2

Coefficients:
─────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
(Intercept)  0.876624    0.0860848  10.18    0.0623  -0.217188    1.97044
x2           0.0689036   0.180024    0.38    0.7673  -2.21852     2.35632
─────────────────────────────────────────────────────────────────────────
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y2 ~ 1 + x3

Coefficients:
───────────────────────────────────────────────────────────────────────
                Coef.  Std. Error     t  Pr(>|t|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────
(Intercept)  0.4862      0.225802  2.15    0.2768   -2.38289    3.35529
x3           0.576478    0.309584  1.86    0.3137   -3.35716    4.51012
───────────────────────────────────────────────────────────────────────

As a general rule of thumb, if you find yourself using @eval you’re likely doing it wrong.
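
Since the goal is to export the results, here is a minimal sketch of how the same loop could collect the fitted models instead of printing them, and then hand them to RegressionTables.jl. The data and the names responses, predictors and models are made up for illustration. I am assuming that regtable accepts GLM models as positional arguments, as the RegressionTables.jl README shows. The keyword arguments for writing to a file differ between RegressionTables.jl versions, so only the default call is shown.

```julia
using DataFrames, GLM, RegressionTables

# Made-up data, only for illustration
df = DataFrame([:y1, :y2, :x1, :x2, :x3] .=> eachcol(rand(20, 5)))

responses  = ["y1", "y2"]        # left-hand side columns
predictors = ["x1", "x2", "x3"]  # right-hand side columns

# One fitted model per (response, predictor) pair, built without @eval
models = [lm(term(y) ~ term(x), df) for y in responses for x in predictors]

# Print all models side by side with the default settings
regtable(models...)
```

If you fit with reg rather than lm, the same term(y) ~ term(x) formula should work in place of the @eval @formula call, although I have not tested that here.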


Basically I have a dataframe with two blocks of columns, and each column in one block should be paired with each column in the other.

list and list_2 refer to those two blocks. reg is a command used in RegressionTables.jl which is based on GLM.jl.

Thank you very much, the performance is so much better with your solution.