# Is there a better DSL (domain-specific language) for defining a formula in linear models?

**xiaodai**#1

I have been an R user, and I can see where StatsModels’ `@formula` macro came from: R! `@formula(x ~ y - 1)`, where the `-1` means fit without an intercept, was unintuitive to me. Also, `a*b` is the same as `a + b + a&b`, i.e. individual and interaction effects. That was also unintuitive, because I actually just wanted `a*b`. I wonder if there’s a better language for a formula somewhere that someone can point me to? Maybe it’s implemented in Julia.
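For reference, here is a small sketch of how the two operators in question expand. It assumes a StatsModels release with the Terms 2.0 design (0.6 or later), and the variable names `y`, `a`, `b` are placeholders. `a*b` is shorthand for main effects plus the interaction, while `a&b` on its own asks for only the interaction, which for two continuous columns is their elementwise product:

```julia
using StatsModels

# `*` is expanded by the macro into main effects plus the interaction.
f_star     = @formula(y ~ a*b)
f_expanded = @formula(y ~ a + b + a&b)
same_rhs   = f_star.rhs == f_expanded.rhs   # both right-hand sides hold the same terms

# `&` on its own keeps only the interaction term.
f_inter = @formula(y ~ a&b)
only_interaction = f_inter.rhs isa InteractionTerm
```

So if only the product column is wanted, `a&b` may already be enough without leaving the DSL.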

**Tamas_Papp**#2

There is no need to rely on your intuition, just read the docs.

https://juliastats.github.io/StatsModels.jl/latest/formula/#Modeling-tabular-data-1

If you’re on master (post #71 Terms 2.0) you can wrap things in `identity` to block the special syntax (so `y ~ identity(a*b - 1)` will regress y against one less than the product of a and b).
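A minimal sketch of the `identity` trick, assuming a Terms 2.0 release of StatsModels and a made-up column table (a `NamedTuple` of vectors, which should be accepted like any Tables.jl table). No model context is passed to `apply_schema`, so nothing is added beyond what the formula asks for:

```julia
using StatsModels

# Toy column table; the values are invented for illustration.
data = (y = [1.0, 2.0, 3.0], a = [1.0, 2.0, 3.0], b = [4.0, 5.0, 6.0])

# Inside `identity(...)`, `*` and `-` are ordinary arithmetic, not formula syntax.
f = @formula(y ~ identity(a*b - 1))
f = apply_schema(f, schema(f, data))

# Response vector and predictor column(s) generated from the table.
resp, pred = modelcols(f, data)

# The single predictor column should be a*b - 1, computed row by row.
matches = vec(pred) == data.a .* data.b .- 1
```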

As a more general comment, there’s always a tension between supporting the DSL features that people who are super familiar with R expect and avoiding the ones that others find really unintuitive. I personally hate the “include intercept by default, use 0 or -1 to block” thing, but many people would be surprised if that was removed. In #71 the compromise I came up with is that subtypes of `AbstractStatisticalModel` use the “classical” behavior, but others use the “obvious” behavior (only get what you explicitly ask for, at least as far as an intercept/constant column goes).
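A sketch of that difference, under two assumptions: that the installed StatsModels is a Terms 2.0 release, and that the context type is available as `StatisticalModel` (the name I believe released versions export, so check it against your version). The data are invented:

```julia
using StatsModels

# Toy column table; the values are invented for illustration.
data = (y = [1.0, 2.0, 3.0], a = [0.5, 1.5, 2.5])
f = @formula(y ~ a)
sch = schema(f, data)

# No model context: only the terms written in the formula are used.
_, X_plain = modelcols(apply_schema(f, sch), data)

# StatisticalModel context (assumed name): the "classical" implicit intercept is added.
_, X_model = modelcols(apply_schema(f, sch, StatisticalModel), data)

ncols = (size(X_plain, 2), size(X_model, 2))
```

Under these assumptions the first matrix has just the `a` column, and the second gains a column of ones.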

And at an even more general level, one of the primary goals of #71 Terms 2.0 Son of Terms was to make the formula DSL something people could customize and build on top of, instead of a straight clone of the R formula DSL. You could even go as far as writing your own macro that doesn’t do any of the special syntax (it is actually just `*` that’s handled at the macro level) but still returns terms and gets all the other benefits of the DSL.

**xiaodai**#5

@dave.f.kleinschmidt thank you for your understanding and reasonable suggestions. Developing an alternative macro sounds like an interesting option.

**piever**#6

I’ve come to appreciate when there is a simple way to do things that does not require macros, and for that I think a big upgrade to StatsModels has been the possibility to cleanly construct a formula in an explicit, programmatic way: https://juliastats.github.io/StatsModels.jl/latest/formula/#Constructing-a-formula-programatically-1
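As a rough illustration of what that looks like (assuming a Terms 2.0 release; the column names are placeholders), terms are built with `term` and combined with the ordinary Julia operators `+`, `&` and `~`, with no macro involved:

```julia
using StatsModels

# term(:name) builds a Term; +, & and ~ are plain Julia operators on terms here.
f_manual = term(:y) ~ term(:a) + term(:b) + term(:a) & term(:b)

# The same formula written with the macro, for comparison.
f_macro = @formula(y ~ a + b + a&b)
same = f_manual == f_macro   # both describe the same terms

# Build a right-hand side from a list of column names held in a variable.
preds = [:a, :b, :c]
f_sum = term(:y) ~ sum(term.(preds))
```

The last form is handy when the predictor names are only known at run time, which is awkward to express through `@formula`.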

In terms of simplifying the “macro-free” API, maybe it’d be useful to have a vararg `interaction` function to create interaction terms and a `fullinteraction` function for interaction terms with also partial interactions (names could probably be improved). Something like:

```
julia> using StatsModels, IterTools
julia> interaction() = ConstantTerm(1) # product of 0 terms
interaction (generic function with 1 method)
julia> interaction(args...) = mapreduce(term, &, args)
interaction (generic function with 2 methods)
julia> function fullinteraction(args...)
           itr = Iterators.filter(!isempty, IterTools.subsets(args))
           mapreduce(v -> interaction(v...), +, itr)
       end
fullinteraction (generic function with 1 method)
julia> interaction(:a, :b, :c)
a(unknown) & b(unknown) & c(unknown)
julia> fullinteraction(:a, :b, :c)
a(unknown)
b(unknown)
a(unknown) & b(unknown)
c(unknown)
a(unknown) & c(unknown)
b(unknown) & c(unknown)
a(unknown) & b(unknown) & c(unknown)
```