I’m working on a project trying to use `@formula`

in a fairly novel way, for example, if I wanted to use `GLM.jl`

to fit a linear model with two predictors and their interaction I would use something like this:

```
f = @formula(y ~ 1 + x_1*x_2)
model = glm(f, data, Normal(), IdentityLink())
```

Assuming I have a vector `β`

prespecified, what I would like to do is be able to extract the regression equation out of `f`

so that I could get a function that behaved like:

```
function f!(y::T, x_1::T, x_2::T, β::Vector{T}) where {T<:Real}
y = β[1] + β[2] * x_1 + β[3] * x_2 + β[4] * x_1 * x_2
return y
end
```

I need to do this so that I can have a function in terms of the predictors and the regression coefficients `β`

that I can pass to automatic differentiation tools like `ForwardDiff`

. I’m not even sure if this is possible using `StatsModels`

and Terms 2.0. If anyone had any idea about how to implement this I would be excited to see!

2 Likes

That’s actually basically what the `predict`

function does, except that it’s not in place, and takes a table/row instead of positional arguments. If you want to do that then you can do something like

```
function f_factory(f::FormulaTerm)
function f!(y, xs::NamedTuple, β)
y[] = dot(modelcols(f, xs), β)
end
end
```

(note that if you want to make it “in place” you need to pass a `Ref(y)`

or something, you can’t assign to a `Real`

in place)

The specific thing you’re asking for is harder, but you can do it (with some restrictions) by “re-packing” the x values into a named tuple, something like

```
function f_factory(f::FormulaTerm)
function f!(y, xs::NamedTuple, β)
y[] = dot(modelcols(f, xs), β)
end
names = (StatsModels.termvars(f.rhs)..., )
f!(y, xs, β) = f!(y, NamedTuple{names}((xs..., )), β)
end
```

However, this will only work with continuous values, because categorical variables get converted into some kind of numeric form that can’t just be stuffed back into a named tuple and run through `modelcols`

again.

I should add, both of those solutions assume you can take the `xs`

as a tuple or a vector or something. If you really need the variadic form then you could do something like

```
function f_factory(f::FormulaTerm)
function f!(y, xs::NamedTuple, β)
y[] = dot(modelcols(f, xs), β)
end
names = (StatsModels.termvars(f.rhs)..., )
f!(y, β::Vector, xs...) = f!(y, NamedTuple{names}((xs..., )), β)
end
```

and I fiiiinnnally read the last bit of your post and saw that you want this to use with autodiff and so none of these solutions are likely to work :-/ Although I suppose there’s no harm in trying.

The other thing I’d suggest is to write a loop inside the closure that iterates over the terms and accumulates the results