Hi

I’m trying to perform linear regression on a small dataset (this is just sample data; the real dataset will be much larger), but I’m running into a problem with the matrix not being positive definite. That may well be the case, but I wouldn’t know, since I have no idea what that means. However, if I perform the same linear regression in R, it does not complain. So how can I work around this in Julia?

Julia:

```
julia> using GLM
julia> using CSV
julia> d = CSV.read("test.csv")
10×6 DataFrame
│ Row │ x1 │ x2 │ x3 │ x4 │ x5 │ y │
├─────┼───────┼─────────┼─────────┼────┼───────┼─────────┤
│ 1 │ 187.7 │ 84.1768 │ 82.4464 │ 10 │ 76.81 │ 2714.95 │
│ 2 │ 187.7 │ 87.5767 │ 84.3167 │ 12 │ 76.81 │ 2754.11 │
│ 3 │ 187.7 │ 91.9267 │ 86.0667 │ 14 │ 76.81 │ 4480.88 │
│ 4 │ 187.7 │ 91.5833 │ 89.1167 │ 16 │ 76.81 │ 4760.54 │
│ 5 │ 187.7 │ 86.915 │ 92.35 │ 18 │ 76.81 │ 5205.43 │
│ 6 │ 178.4 │ 106.757 │ 81.1379 │ 10 │ 78.97 │ 2598.67 │
│ 7 │ 178.4 │ 108.783 │ 82.9833 │ 12 │ 78.97 │ 4295.96 │
│ 8 │ 178.4 │ 103.832 │ 86.5667 │ 14 │ 78.97 │ 4647.4 │
│ 9 │ 178.4 │ 96.5583 │ 90.2203 │ 16 │ 78.97 │ 4300.89 │
│ 10 │ 178.4 │ 93.7767 │ 92.9333 │ 18 │ 78.97 │ 4254.61 │
julia> lm(@formula(y ~ x1 + x2 + x3 + x4 + x5), d)
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
...
```

R:

```
> d = read.csv("test.csv")
> lm(y ~ x1 + x2 + x3 + x4 + x5, d)
Call:
lm(formula = y ~ x1 + x2 + x3 + x4 + x5, data = d)
Coefficients:
(Intercept)           x1           x2           x3           x4           x5
   -9234.73        91.71        64.45      -206.79       590.43           NA
```
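
One possible diagnostic, offered only as a minimal sketch: R reports `NA` for `x5`, which is what it does when a column is a linear combination of the others. In the sample above, `x5` is 76.81 whenever `x1` is 187.7 and 78.97 whenever `x1` is 178.4, so together with the intercept the two columns look collinear. The snippet below assumes only Julia's standard `LinearAlgebra` library and the values copied from the table. The names `X`, `s` and `ratio` are made up for illustration, and `X` is only an approximation of the design matrix that `lm` would build.

```julia
using LinearAlgebra

# x1 and x5 copied from the sample table above
x1 = [fill(187.7, 5); fill(178.4, 5)]
x5 = [fill(76.81, 5); fill(78.97, 5)]

# Intercept column plus the two predictors, roughly what y ~ x1 + x5 would build
X = hcat(ones(length(x1)), x1, x5)

# Singular values are returned in descending order
s = svdvals(X)

# A ratio near machine precision suggests the columns are linearly dependent
ratio = s[end] / s[1]
println("smallest/largest singular value: ", ratio)
```

If the ratio comes out tiny, `X'X` is singular rather than positive definite, which would be consistent with the Cholesky failure in the error message.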