Have confusion regarding Linear Regression

Traditional linear regression, as was noted above, yields a point estimator. If the errors are normally distributed, the small-sample distribution of the estimator is also normal. With non-normal errors, the asymptotic distribution is still normal under the usual regularity conditions, but the small-sample distribution generally is not. Bootstrapping is sometimes used to explore the small-sample distribution in this situation, and it can then make sense to represent the bootstrap replications as a chain. Here’s a code example that does that:

using Plots, Distributions, Statistics

# simple iid bootstrap: resample the rows of data with replacement
function bootstrap(data)
    n = size(data, 1)
    idx = rand(1:n, n)   # n row indices drawn uniformly with replacement
    return data[idx, :]
end

n = 50
reps = 1000
x = [ones(n) randn(n)]
β = [2.,-1.]
ϵ = rand(Chisq(3.),n) .- 3.0  # Chisq(3) has mean 3, so subtracting 3 centers the errors at zero
y = x*β + ϵ
data = [y x]
bs = zeros(reps,2)
for i = 1:reps
    d = bootstrap(data)
    bs[i,:] = d[:,2:end] \ d[:,1]  # OLS on the resampled data: columns 2:end are x, column 1 is y
end
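# plot the bootstrap draws of the slope as a chain
plot(bs[:,2], labels=false)
# 5% and 95% quantiles of the draws bound a 90% percentile interval
q05 = quantile(bs[:,2], 0.05)
q95 = quantile(bs[:,2], 0.95)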
hline!([q05], labels="q05")
hline!([q95], labels="q95")
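
To see how far the small-sample picture is from the asymptotic one, the bootstrap percentile interval for the slope can be put next to the interval implied by the normal approximation. The following is only a rough sketch under my own assumptions: the helper name `bootstrap_ols`, the seed, and the hard-coded critical value 1.645 are illustrative choices, and the chi-square errors are built from squared normals so that only standard libraries are needed.

```julia
using Random, Statistics, LinearAlgebra

Random.seed!(1234)  # arbitrary seed, only so the run is reproducible

# Pairs bootstrap of the OLS coefficients (hypothetical helper, not from any package).
# Returns a reps × k matrix with one row of coefficients per replication.
function bootstrap_ols(y, x; reps = 1000)
    n, k = size(x)
    draws = zeros(reps, k)
    for r in 1:reps
        idx = rand(1:n, n)                 # resample rows with replacement
        draws[r, :] = x[idx, :] \ y[idx]   # OLS on the resampled rows
    end
    return draws
end

n = 50
x = [ones(n) randn(n)]
β = [2.0, -1.0]
# chi-square(3) errors as a sum of three squared standard normals, centered at zero
ϵ = vec(sum(abs2, randn(n, 3), dims = 2)) .- 3.0
y = x * β + ϵ

# OLS estimate and classical standard errors
b = x \ y
resid = y - x * b
s2 = sum(abs2, resid) / (n - size(x, 2))
se = sqrt.(diag(s2 * inv(x' * x)))

# 90% interval for the slope from the normal approximation (0.95 normal quantile ≈ 1.645)
z = 1.645
ci_normal = (b[2] - z * se[2], b[2] + z * se[2])

# 90% percentile interval for the slope from the bootstrap draws
draws = bootstrap_ols(y, x)
ci_boot = (quantile(draws[:, 2], 0.05), quantile(draws[:, 2], 0.95))

println("normal approximation: ", ci_normal)
println("bootstrap percentile: ", ci_boot)
```

The normal interval is symmetric around the estimate by construction, while the percentile interval does not have to be. Any asymmetry in `ci_boot` is one informal sign of how the small-sample distribution differs from its normal limit.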