I want to build a StatsModels formula from a string. This is the closest I’ve gotten.

```
text = "y ~ x"
StatsModels.terms!(StatsModels.sort_terms!(StatsModels.parse!(Meta.parse(text))))
# :(($(Expr(:escape, :~)))(Term(:y), Term(:x)))
```

Could you explain why it’s useful to build from a string?

Since StatsModels.jl shares some syntax with other languages, I can use the same formula file from multiple tools.

If your formulas are very basic (just multiple linear regression), you can just split your string and construct terms from that:

```
julia> using GLM

julia> f = "y ~ x1 + x2"
"y ~ x1 + x2"

julia> y, xs = strip.(split(f, "~"))  # strip whitespace so it doesn't end up in the term names
2-element Vector{SubString{String}}:
 "y"
 "x1 + x2"

julia> term(y) ~ sum(term.(strip.(split(xs, "+"))))
FormulaTerm
Response:
  y(unknown)
Predictors:
  x1(unknown)
  x2(unknown)

Apart from that, you’re basically looking to do what the `@formula` macro does, I suppose, and that’s what you’ve got already: StatsModels.jl/formula.jl at master · JuliaStats/StatsModels.jl · GitHub

@dave.f.kleinschmidt Is there a way I can turn this `text` into a `FormulaTerm`?

```
text = "y ~ x + z"
StatsModels.terms!(StatsModels.sort_terms!(StatsModels.parse!(Meta.parse(text))))
# :(($(Expr(:escape, :~)))(Term(:y), ($(Expr(:escape, :+)))(Term(:x), Term(:z))))
```

You can probably `eval` what you’ve got there, but is it absolutely necessary to be working from a string? In general, if you’re trying to do something with a string that involves calling some function depending on the contents of the string, you’re not going to be able to do it without `eval` somewhere, so you’re probably better off doing something like

```
@eval(@formula($(Meta.parse(text))))
```

(much as I hate to say it)

What’s going on here is that `Meta.parse` is converting your string into a Julia `Expr`, the `$(...)` is inserting that into the expression starting with `@formula`, and then `@eval` is evaluating the whole thing. It’s basically as if you’d typed `@formula y ~ x + z` into the REPL.
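To make those steps concrete, here is a minimal sketch that inspects what `Meta.parse` returns and then wraps the `@eval` call in a helper. The name `formula_from_string` is made up for illustration, and the sketch assumes StatsModels is installed. Because the string ends up being evaluated as Julia code, this is probably only appropriate for input you trust.

```julia
using StatsModels

text = "y ~ x + z"

# Meta.parse turns the string into an ordinary call expression:
# the head is :call and the first argument is the function name, :~.
ex = Meta.parse(text)
ex.head      # :call
ex.args[1]   # :~

# Hypothetical helper (not part of StatsModels): splice the parsed
# expression into an @formula call and evaluate it in the current module.
formula_from_string(s::AbstractString) = @eval @formula($(Meta.parse(s)))

f = formula_from_string(text)
f isa FormulaTerm   # true
```

Keeping the `@eval` inside one small function like this at least confines it to a single, easy-to-find place.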

It’s generally a dangerous idea to use `@eval` in scripts since it can lead to performance gotchas unless you’re VERY careful, but in this case I don’t see a way around it. The usual advice we give to people trying to construct a formula on the fly is to wrap their term symbols in `Term`s and combine them with `+`, `&`, and `~`, but if you have to be able to handle ANY formula that’s valid in R that won’t work (short of writing your own parser basically).
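For reference, the `Term`-based approach might look something like the sketch below. The column names `:y`, `:x`, and `:z` are placeholders, and it assumes StatsModels is loaded; no string parsing or `@eval` is involved.

```julia
using StatsModels

# Column names known only at run time (placeholder values for illustration).
response = :y
predictors = [:x, :z]

# Main effects: equivalent to Term(:x) + Term(:z).
main_effects = sum(Term.(predictors))

# Add an x & z interaction and put the response on the left-hand side.
# `&` binds tighter than `+`, which binds tighter than `~`.
f = Term(response) ~ main_effects + Term(:x) & Term(:z)
```

The result should be the same kind of `FormulaTerm` that `@formula(y ~ x + z + x & z)` produces, so it can be passed to a model-fitting function in the usual way.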