Using Survey/Inverse Probability Weights in Regression


Hi, I hope the answer is not too obvious as I am new to Julia.

I want to do some regressions with a weighted sample. At the moment I use the GLM package and I saw there are ProbabilityWeights, but I haven’t figured out how to use them together. Can anybody help?


GLM offers limited support for weights at the moment. Basic usage looks like this:

using DataFrames, GLM
# Toy data: response y, predictor x, and integer weights w
df = DataFrame(y = rand(1:10, 10),
               x = rand(10),
               w = rand(1:10, 10))
glm(@formula(y ~ x), df, Normal(), IdentityLink()) # OLS
glm(@formula(y ~ x), df, Normal(), IdentityLink(),
    wts = float.(df.w)) # WLS; wts expects a floating-point vector


Thanks, it works smoothly.

Compared to R’s survey package, I noticed smaller standard errors. Do you have any idea why?


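GLM interprets weights as analytic (a.k.a. inverse variance) weights, like R’s glm. If you pass it sampling (a.k.a. inverse probability) weights, the coefficient estimates are still the weighted least-squares ones, but the standard errors will be incorrect, because they come from a variance model in which each weight is the precision of its observation rather than the number of population units it represents.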
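
To make the difference concrete, here is a minimal sketch in plain Julia (only LinearAlgebra and Random, no GLM call) that computes the weighted least-squares coefficients and then two sets of standard errors: a sandwich (HC0-style) estimate, which is the usual starting point for sampling weights, and a model-based estimate that treats the weights as inverse variances. The simulated data and all variable names are made up for illustration, and the sketch assumes independent observations with no strata or clusters, so it is not a replacement for R's survey package.

```julia
using LinearAlgebra, Random

Random.seed!(1)
n = 200
x = rand(n)
w = float.(rand(1:10, n))          # hypothetical sampling weights
y = 1 .+ 2 .* x .+ randn(n)        # simulated response

X = [ones(n) x]                    # design matrix with an intercept column
W = Diagonal(w)

# Weighted least-squares point estimate (the same under either interpretation of w)
bread = inv(Symmetric(X' * W * X))
beta = bread * (X' * W * y)

# Sandwich (HC0-style) covariance: each observation contributes its weighted score
r = y .- X * beta
meat = X' * Diagonal((w .* r) .^ 2) * X
se_robust = sqrt.(diag(bread * meat * bread))

# Model-based covariance that treats w as inverse-variance (analytic) weights
sigma2 = sum(w .* r .^ 2) / (n - size(X, 2))
se_model = sqrt.(diag(sigma2 .* bread))

println("coefficients:       ", beta)
println("sandwich std. err.: ", se_robust)
println("model std. err.:    ", se_model)
```

The two sets of standard errors share the same coefficients and differ only in the covariance formula, which is why a weighted GLM fit and a survey-style fit can agree on the estimates and still disagree on the uncertainty.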