Using Survey/Inverse Probability Weights in Regression


#1

Hi, I hope the answer is not too obvious as I am new to Julia.

I want to do some regressions with a weighted sample. At the moment I use the GLM package and I saw there are ProbabilityWeights, but I haven’t figured out how to use them together. Can anybody help?


#2

GLM offers limited support for weights at the moment. Basic usage looks like this:

using DataFrames, GLM, Random
Random.seed!(0)  # fix the seed so the example is reproducible
df = DataFrame(y = rand(1:10, 10),
               x = rand(10),
               w = rand(1:10, 10))
glm(@formula(y ~ x), df, Normal(), IdentityLink()) # OLS
glm(@formula(y ~ x), df, Normal(), IdentityLink(),
    wts = float.(df.w)) # WLS

#3

Thanks, it works smoothly.

Compared to R’s survey package, I noticed smaller standard errors. Do you have any idea why?


#4

GLM interprets weights as analytic (a.k.a. inverse variance) weights, like R’s glm. If you pass it sampling (a.k.a. inverse probability) weights, the coefficient estimates are unaffected, but the standard errors will be incorrect because they do not account for the sampling design.
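
If you need standard errors that treat the weights as sampling weights, one option is to compute a sandwich (linearization) variance by hand. The following is only a rough sketch, not something GLM provides: `pweight_ols` is a hypothetical helper name, it uses nothing beyond the `LinearAlgebra` standard library, and it assumes a simple single-stage design with no strata or clusters. The `n / (n - 1)` correction is my own choice here and may not match what R’s survey package reports exactly.

```julia
using LinearAlgebra

# Hypothetical helper: weighted least squares with sandwich standard errors,
# treating `w` as sampling (inverse probability) weights.
# Assumes a single-stage design without strata or clusters.
function pweight_ols(X::AbstractMatrix, y::AbstractVector, w::AbstractVector)
    XtW = X' * Diagonal(w)
    bread = inv(Symmetric(XtW * X))      # (X'WX)^-1
    beta = bread * (XtW * y)             # same point estimates as WLS
    resid = y - X * beta
    scores = X .* (w .* resid)           # row i is w_i * e_i * x_i
    meat = scores' * scores              # sum of outer products of the scores
    n = length(y)
    V = (n / (n - 1)) * bread * meat * bread
    return beta, sqrt.(diag(V))
end

# Small made-up example; the first column of X is the intercept.
X = hcat(ones(5), [1.0, 2.0, 3.0, 4.0, 5.0])
y = [1.1, 1.9, 3.2, 3.9, 5.1]
w = [1.0, 2.0, 1.0, 3.0, 2.0]
beta, se = pweight_ols(X, y, w)
```

With a data frame like the one in the earlier post, you would build the design matrix as `hcat(ones(nrow(df)), df.x)` and pass `float.(df.w)` as the weights. Note that rescaling all weights by a constant leaves both the coefficients and these standard errors unchanged, which is the behaviour you would expect from sampling weights.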