formula = 'c1 ~ c2 + c3 +c4 +c2&c4'
import patsy
y, X = patsy.dmatrices(formula, df)
This would give me X, the design matrix of the regression–i.e. columns corresponding to c2, c3, c4, and c2.*c4.
How can I get this matrix in Julia? I could probably do this manually, but if I had a large number of regressors and interactions, it might get tricky, so I’d like a systematic approach.
The reason I’m asking is that I eventually want to do counterfactuals for what would happen if I change c2 to 0 or to 1. You could throw this in as a bonus, but I think once I get the first part, I can figure that out.