Suggestions for the design of Survey.jl?

Nice to hear that you are working on a Julian API for this package!

My general recommendation would be to try to move as much as possible from using svy-prefixed functions to using generic Julia functions dispatching on Survey.jl objects. For example, svyglm could be replaced with a special glm method when data is a survey design object.
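To make the dispatch idea concrete, here is a minimal sketch that uses only the standard library. `ToyDesign` is a made-up stand-in, not Survey.jl's actual design type, and I use `mean` rather than `glm` only to keep the example free of package dependencies. The same pattern would apply to adding a method to a generic fitting function.

```julia
using Statistics

# Hypothetical stand-in for a survey design object (NOT Survey.jl's real type).
struct ToyDesign
    y::Vector{Float64}        # variable of interest
    weights::Vector{Float64}  # sampling weights
end

# Add a method to the generic `mean` instead of defining a separate `svymean`.
Statistics.mean(d::ToyDesign) = sum(d.weights .* d.y) / sum(d.weights)

d = ToyDesign([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
m = mean(d)  # weighted mean, dispatched on the design type
```

Users then keep calling the function they already know, and the design object decides which estimator runs.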

I also mentioned some ideas about replacing svyby with combine(groupby(...), ...) at https://github.com/xKDR/Survey.jl/issues/4.
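As a rough illustration of the shape this could take, here is a sketch on a plain DataFrame with an explicit weight column. The column names and the `wmean` helper are made up for the example. How a survey design object would plug into groupby is exactly the open design question, so this only shows the DataFrames side.

```julia
using DataFrames

df = DataFrame(region = ["a", "a", "b", "b"],
               income = [10.0, 20.0, 30.0, 50.0],
               w      = [1.0, 3.0, 1.0, 1.0])

# Hypothetical helper: weighted mean of y with weights w.
wmean(y, w) = sum(w .* y) / sum(w)

# Per-group weighted mean, in place of a dedicated svyby-style function.
by_region = combine(groupby(df, :region), [:income, :w] => wmean => :mean_income)
```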

One area where you can probably improve on the R package quite easily is model fitting, at least for designs with replicate weights. Thanks to the StatisticalModel/RegressionModel interface from StatsAPI, you could support any custom model type defined in a package: call fit on it once per set of replicate weights, then compute standard errors for the coefficients automatically from the coefficients obtained with each set. IIUC this would offer the features of svrepmisc, but also extend automatically to any new model family, without Survey.jl having to support it explicitly. This is an area where R is often lacking (everything is hardcoded, making it hard to extend to new use cases).
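Here is a minimal sketch of that refit-per-replicate idea, again with only the standard library. `replicate_se` and `wls` are hypothetical names. In a real implementation I would expect the closure to wrap something like coef(fit(...)) from a StatsAPI-compatible package, but how weights get passed to fit varies between packages, so I leave that as an assumption. The `scale` factor depends on the replication method; the example uses delete-one jackknife weights, for which it is (n - 1) / n.

```julia
using LinearAlgebra

# Weighted least squares, standing in for "any model that accepts weights".
wls(X, y, w) = (X' * Diagonal(w) * X) \ (X' * (w .* y))

# `fitcoef` maps a weight vector to a coefficient vector.
# Each column of `repweights` is one set of replicate weights.
function replicate_se(fitcoef, w, repweights; scale)
    θ = fitcoef(w)                                      # full-sample estimate
    reps = [fitcoef(rw) for rw in eachcol(repweights)]  # one refit per replicate
    v = scale .* sum((θr .- θ) .^ 2 for θr in reps)     # replicate variance
    return θ, sqrt.(v)
end

X = [ones(5) [1.0, 2.0, 3.0, 4.0, 5.0]]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
n = length(y)
w = ones(n)

# Delete-one jackknife: replicate r drops unit r and rescales the rest by n/(n-1).
repw = [i == r ? 0.0 : n / (n - 1) for i in 1:n, r in 1:n]

θ, se = replicate_se(rw -> wls(X, y, rw), w, repw; scale = (n - 1) / n)
```

Nothing in `replicate_se` knows about least squares, which is the point: any model that can be refit with a different weight vector gets standard errors for free.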
