Problem using GLM

The following is throwing an error:

using DataFrames, GLM
data = DataFrame(X=1:1:length(event[1]), Y=event[1])
ols = lm(@formula(Y ~ X), data)

The cause appears to be related to the types of the data I define as X and Y:

typeof(X)   StepRange{Int64,Int64}

typeof(Y)   Vector{Any}

What am I missing?

First, you don’t need the middle :1; 1:length(event[1]) is enough. If the issue is that GLM doesn’t like a StepRange, you can use X=collect(1:length(event[1])) in the second line instead, to materialize an array from the range.

The problem seems to be deeper:

Why does your array have a shape of 2? They need to be vectors (Array{Float64,1}).

Yeah, what is typeof(event)? Your LHS variable needs to be a vector, not a matrix. Somehow event[1] is producing a matrix for you. If it truly is a vector that has ended up stored as a matrix, you can just wrap it in vec: vec(event[1])
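A minimal sketch of the vec suggestion, assuming event[1] really is an n×1 matrix. The thread never shows the actual data, so y_matrix below is a hypothetical stand-in:

```julia
# Hypothetical stand-in for event[1]: a 3×1 Matrix instead of a Vector.
y_matrix = reshape([1.0, 2.0, 3.0], 3, 1)

size(y_matrix)   # (3, 1): two dimensions

# vec reshapes the data into a one-dimensional Vector without copying it.
y = vec(y_matrix)

size(y)          # (3,): one dimension
```

The resulting y could then be used as the Y column when constructing the DataFrame.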


Just to add that you don’t have to collect the range when constructing the DataFrame, as this happens automatically:

julia> using GLM, DataFrames

julia> df = DataFrame(x = 1:10, y = 0.5 .* (1:10) .+ rand.())
10×2 DataFrame
│ Row │ x     │ y        │
│     │ Int64 │ Float64  │
├─────┼───────┼──────────┤
│ 1   │ 1     │ 0.874401 │
│ 2   │ 2     │ 1.07089  │
│ 3   │ 3     │ 2.08023  │
│ 4   │ 4     │ 2.89661  │
│ 5   │ 5     │ 3.22647  │
│ 6   │ 6     │ 3.0337   │
│ 7   │ 7     │ 3.81535  │
│ 8   │ 8     │ 4.19433  │
│ 9   │ 9     │ 4.69071  │
│ 10  │ 10    │ 5.7383   │

julia> df.x == collect(1:10)
true

julia> lm(@formula(y ~ x), df)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}

y ~ 1 + x

Coefficients:
────────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   t value  Pr(>|t|)   Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)  0.420504   0.220686    1.90544    0.0932  -0.0883982   0.929406
x            0.498472   0.0355667  14.0151     <1e-6    0.416455    0.580489
────────────────────────────────────────────────────────────────────────────

My problem was solved by converting event[1] to a vector of floats:
convert(Array{Float64,1}, event[1])
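For anyone hitting the same thing, here is a small sketch of what that conversion does, assuming the values are all numeric but stored in a Vector{Any} (as typeof(Y) above suggests). y_any is a hypothetical stand-in for event[1]:

```julia
# Hypothetical stand-in for event[1]: numbers held in an untyped container.
y_any = Any[1, 2.5, 3]

# Same call as above; Vector{Float64} is an alias for Array{Float64,1}.
y = convert(Vector{Float64}, y_any)

# Broadcasting the Float64 constructor gives an equal result.
y_alt = Float64.(y_any)

eltype(y)    # Float64
y == y_alt   # true
```

Either call throws an error if an element cannot be converted to Float64 (for example a String or missing), so non-numeric entries would need to be handled first.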