Hello, everybody.

I have encountered a problem doing multiple regression using GLM.

Here’s my dataset `data` (it originally comes from https://www.sciencedirect.com/science/article/pii/S1359511305001121):

x1 | x2 | x3 | x4 | x5 | b | y |
---|---|---|---|---|---|---|
-1 | -1 | -1 | -1 | -1 | 1 | 9.137 |
1 | -1 | -1 | -1 | -1 | 1 | 11.996 |
-1 | 1 | -1 | -1 | -1 | 1 | 8.887 |
1 | 1 | -1 | -1 | -1 | 1 | 12.405 |
-1 | -1 | 1 | -1 | -1 | 1 | 11.274 |
1 | -1 | 1 | -1 | -1 | 1 | 10.511 |
-1 | 1 | 1 | -1 | -1 | 1 | 11.113 |
1 | 1 | 1 | -1 | -1 | 1 | 11.764 |
-1 | -1 | -1 | 1 | -1 | 1 | 9.482 |
1 | -1 | -1 | 1 | -1 | 1 | 10.683 |
-1 | 1 | -1 | 1 | -1 | 1 | 8.277 |
1 | 1 | -1 | 1 | -1 | 1 | 11.243 |
-1 | -1 | 1 | 1 | -1 | 1 | 8.188 |
1 | -1 | 1 | 1 | -1 | 1 | 10.219 |
-1 | 1 | 1 | 1 | -1 | 1 | 8.88 |
1 | 1 | 1 | 1 | -1 | 1 | 9.309 |
-1 | -1 | -1 | -1 | 1 | 1 | 10.428 |
1 | -1 | -1 | -1 | 1 | 1 | 11.118 |
-1 | 1 | -1 | -1 | 1 | 1 | 11.242 |
1 | 1 | -1 | -1 | 1 | 1 | 11.636 |
-1 | -1 | 1 | -1 | 1 | 1 | 8.448 |
1 | -1 | 1 | -1 | 1 | 1 | 11.078 |
-1 | 1 | 1 | -1 | 1 | 1 | 11.285 |
1 | 1 | 1 | -1 | 1 | 1 | 11.976 |
-1 | -1 | -1 | 1 | 1 | 1 | 8.233 |
1 | -1 | -1 | 1 | 1 | 1 | 8.537 |
-1 | 1 | -1 | 1 | 1 | 1 | 9.014 |
1 | 1 | -1 | 1 | 1 | 1 | 7.791 |
-1 | -1 | 1 | 1 | 1 | 1 | 7.022 |
1 | -1 | 1 | 1 | 1 | 1 | 9.74 |
-1 | 1 | 1 | 1 | 1 | 1 | 9.051 |
1 | 1 | 1 | 1 | 1 | 1 | 8.709 |
0 | 0 | 0 | 0 | 0 | 1 | 11.249 |
0 | 0 | 0 | 0 | 0 | 1 | 9.827 |
0 | 0 | 0 | 0 | 0 | 1 | 10.43 |
0 | 0 | 0 | 0 | 0 | 1 | 11.726 |
0 | 0 | 0 | 0 | 0 | 1 | 10.689 |
0 | 0 | 0 | 0 | 0 | 1 | 11.033 |
0 | 0 | 0 | 0 | 0 | 1 | 9.57 |
0 | 0 | 0 | 0 | 0 | 1 | 11.596 |
-2 | 0 | 0 | 0 | 0 | 2 | 6.288 |
2 | 0 | 0 | 0 | 0 | 2 | 11.463 |
0 | -2 | 0 | 0 | 0 | 2 | 6.809 |
0 | 2 | 0 | 0 | 0 | 2 | 11.982 |
0 | 0 | -2 | 0 | 0 | 2 | 8.485 |
0 | 0 | 2 | 0 | 0 | 2 | 8.105 |
0 | 0 | 0 | -2 | 0 | 2 | 11.25 |
0 | 0 | 0 | 2 | 0 | 2 | 6.294 |
0 | 0 | 0 | 0 | -2 | 2 | 9.742 |
0 | 0 | 0 | 0 | 2 | 2 | 10.389 |
0 | 0 | 0 | 0 | 0 | 2 | 11.115 |
0 | 0 | 0 | 0 | 0 | 2 | 9.132 |
0 | 0 | 0 | 0 | 0 | 2 | 11.492 |
0 | 0 | 0 | 0 | 0 | 2 | 10.861 |

I tried to fit the data to a model including first-order, second-order, and interaction terms (`b` is a block term), but it failed:

```
julia> using GLM
julia> fm = @formula(y ~ 1 + b + (x1 + x2 + x3 + x4 + x5)*(x1 + x2 + x3 + x4 + x5))
Formula: y ~ 1 + b + x1 + x2 + x3 + x4 + x5 + x1 & x1 + x1 & x2 + x1 & x3 + x1 & x4 + x1 & x5 + x2 & x2 + x2 & x3 + x2 & x4 + x2 & x5 + x3 & x3 + x3 & x4 + x3 & x5 + x4 & x4 + x4 & x5 + x5 & x5
julia> ols = lm(fm, data)
PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
[1] checkpositivedefinite at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\factorization.jl:11 [inlined]
[2] #cholesky!#92(::Bool, ::Function, ::LinearAlgebra.Hermitian{Float64,Array{Float64,2}}, ::Val{false}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\cholesky.jl:153
[3] cholesky! at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\cholesky.jl:152 [inlined] (repeats 2 times)
[4] DensePredChol(::Array{Float64,2}, ::Bool) at D:\Apps\juliapkg\packages\GLM\0c65q\src\linpred.jl:107
[5] cholpred at D:\Apps\juliapkg\packages\GLM\0c65q\src\linpred.jl:117 [inlined]
[6] fit(::Type{LinearModel}, ::Array{Float64,2}, ::Array{Float64,1}, ::Bool) at D:\Apps\juliapkg\packages\GLM\0c65q\src\lm.jl:136
[7] #fit#36(::Dict{Any,Any}, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{LinearModel}, ::Formula, ::DataFrames.DataFrame, ::Bool) at D:\Apps\juliapkg\packages\StatsModels\AYB2E\src\statsmodel.jl:72
[8] fit at D:\Apps\juliapkg\packages\StatsModels\AYB2E\src\statsmodel.jl:66 [inlined]
[9] lm at D:\Apps\juliapkg\packages\GLM\0c65q\src\lm.jl:146 [inlined] (repeats 2 times)
[10] top-level scope at In[33]:1
```
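Since it is the Cholesky factorization that fails, my guess is that the design matrix GLM builds from this formula is rank-deficient. As a sanity check I built the 22 columns of the quadratic model by hand and inspected the rank (a rough sketch; the `data.b` / `data[!, ...]` column access assumes a recent DataFrames version):

```
julia> using DataFrames, LinearAlgebra

julia> # Hand-built design matrix: intercept, block, 5 linear terms,
       # then all products xi*xj for i <= j (squares + two-way interactions).
       xs = [data[!, Symbol("x", i)] for i in 1:5];

julia> cols = Any[ones(nrow(data)), data.b];

julia> append!(cols, xs);

julia> append!(cols, [xs[i] .* xs[j] for i in 1:5 for j in i:5]);

julia> X = reduce(hcat, cols);

julia> size(X)    # should be (54, 22)

julia> rank(X)    # 22 would mean the data themselves support the full model
```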

Using the same data, MATLAB fits the model without complaint.

```
>> ols = fitlm(data, 'y ~ b + (x1+x2+x3+x4+x5)^2')
ols =
Linear regression model:
y ~ 1 + b + x1*x2 + x1*x3 + x2*x3 + x1*x4 + x2*x4 + x3*x4 + x1*x5 + x2*x5 + x3*x5 + x4*x5 + x1^2 + x2^2 + x3^2 + x4^2 + x5^2
Estimated Coefficients:
                   Estimate      SE          tStat       pValue
                   _________     _______     ________    __________
    (Intercept)       11.804     0.59186       19.944    1.2949e-19
    x1                0.7276     0.17085       4.2586    0.00016848
    x2               0.42085     0.17085       2.4632      0.019334
    x3              -0.05755     0.17085     -0.33684       0.73844
    x4               -0.9958     0.17085      -5.8284     1.785e-06
    x5              -0.16915     0.17085     -0.99003       0.32959
    b               -0.83382     0.34642       -2.407       0.02203
    x1:x2           -0.14331     0.19102     -0.75025       0.45859
    x1:x3           -0.08325     0.19102     -0.43582        0.6659
    x2:x3            0.14769     0.19102      0.77315       0.44511
    x1:x4          -0.080813     0.19102     -0.42306       0.67508
    x2:x4           -0.19213     0.19102      -1.0058       0.32207
    x3:x4          -0.085688     0.19102     -0.44858       0.65676
    x1:x5           -0.21969     0.19102      -1.1501       0.25864
    x2:x5             0.1785     0.19102      0.93445       0.35707
    x3:x5          0.0050625     0.19102     0.026502       0.97902
    x4:x5           -0.25963     0.19102      -1.3591        0.1836
    x1^2            -0.26392     0.18673      -1.4134       0.16719
    x2^2            -0.13392     0.18673     -0.71719       0.47846
    x3^2            -0.40904     0.18673      -2.1906      0.035881
    x4^2            -0.28979     0.18673       -1.552       0.13051
    x5^2            0.033581     0.18673      0.17984       0.85841
Number of observations: 54, Error degrees of freedom: 32
Root Mean Squared Error: 1.08
R-squared: 0.715, Adjusted R-Squared: 0.528
F-statistic vs. constant model: 3.82, p-value = 0.000332
```
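One workaround I tried (though I would still like to understand the original error) is to add the squared terms to the data frame as their own columns, so the formula only ever sees distinct variable names and plain `&` interactions; my hypothesis is that `x1 & x1` is not producing a genuine x1² column. The `x1sq`-style names are my own, and the `data[!, ...]` assignment again assumes a recent DataFrames:

```
julia> # Explicit squared columns x1sq .. x5sq, so @formula never needs x & x.
       for i in 1:5
           data[!, Symbol("x", i, "sq")] = data[!, Symbol("x", i)] .^ 2
       end

julia> fm2 = @formula(y ~ 1 + b + x1 + x2 + x3 + x4 + x5 +
                      x1sq + x2sq + x3sq + x4sq + x5sq +
                      x1&x2 + x1&x3 + x1&x4 + x1&x5 +
                      x2&x3 + x2&x4 + x2&x5 +
                      x3&x4 + x3&x5 + x4&x5);

julia> ols = lm(fm2, data)
```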

Did I do something wrong?