I cannot figure out how to run a LinReg OnlineStat on my JuliaDB table. In particular, how do I tell the reduce function which variables are on the left-hand side and which are on the right-hand side? The following runs a regression of y on x, and z is ignored:
using JuliaDB, OnlineStats
t = table(@NT(x = randn(1000), y = randn(1000), z = rand(1000)))
reduce(LinReg(), t, select = (:x, :y, :z))
You can also use LinRegBuilder, which lets you fit any regression model on the data after a single pass.
o = reduce(LinRegBuilder(), t, select = (:x, :y, :z))
# y ~ x + z
coef(o, x=[1,3], y=2, bias=false)
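To make "any regression model after a single pass" concrete, here is a minimal plain-Julia sketch of the idea. It assumes that all a builder needs to keep is the accumulated cross-products of the selected columns (plus a constant column for the intercept). This is only an illustration, not OnlineStats' actual implementation, and `crossproducts` and `coef_from` are hypothetical helper names.

```julia
using LinearAlgebra, Random

Random.seed!(1)
n = 1000
x, y, z = randn(n), randn(n), rand(n)

# Single pass over the rows: accumulate A = D'D, where each row of D
# is (x, y, z, 1). The trailing 1 makes an intercept available later.
function crossproducts(cols...)
    p = length(cols) + 1
    A = zeros(p, p)
    for i in eachindex(cols[1])
        row = [Float64[c[i] for c in cols]; 1.0]
        A .+= row * row'
    end
    return A
end

A = crossproducts(x, y, z)

# Any regression among the columns can now be solved from A alone,
# via the normal equations. `xs` and `yidx` are column positions;
# `bias = true` appends the constant column to the predictors.
function coef_from(A, xs, yidx; bias = true)
    idx = bias ? [xs; size(A, 1)] : collect(xs)
    return A[idx, idx] \ A[idx, yidx]
end

β = coef_from(A, [1, 3], 2; bias = false)   # y ~ x + z, no intercept
```

The last line mirrors the `coef(o, x=[1,3], y=2, bias=false)` call above: columns 1 and 3 are the predictors and column 2 is the response. Changing the indices gives a different model without touching the data again.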
LinRegBuilder seems cool. Can it be used to create any arbitrary model compatible with StatsModels?
Can it handle categorical variables or dummy variables?
To fit any given term (dummy variable, interaction term, etc.), you need to select it explicitly. There’s no support (yet) for formulas.
Here’s an example of making an interaction term between :x and :z, which I’ll admit isn’t the cleanest syntax.
reduce(LinRegBuilder(), t, select=(:x, :y, :z, (:x, :z) => xz -> *(xz...)))
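In case the anonymous function looks cryptic, here is a small plain-Julia sketch of what it computes, assuming it is applied to each row's pair of (x, z) values. The names `interact` and `x_times_z` are made up for illustration.

```julia
# Splat the pair of values into `*`, i.e. multiply them together.
interact = xz -> *(xz...)

interact((2.0, 3.0))   # 6.0

# Applied row by row, this is the elementwise product of the two columns.
x = [1.0, 2.0, 3.0]
z = [0.5, 0.5, 2.0]
x_times_z = map(interact, zip(x, z))   # [0.5, 1.0, 6.0]
```

So the extra entry in `select` adds a fourth column equal to x .* z, which can then be used as a predictor by its position.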