Simple linear regression question

#1

Hello!
I apologize if this is a simple question. I’m new to Julia, and I may have missed something in the StatsModels and TimeSeries documentation.

I have a DataFrame data2 with a Time column (yyyy-mm-dd) and two data columns, Y1 and Y2:

julia> data2
26×3 DataFrame
│ Row │ Time       │ Y1      │ Y2      │
│     │ Date⍰      │ Float64 │ Float64 │
├─────┼────────────┼─────────┼─────────┤
│ 1   │ 2016-12-01 │ 72.0    │ 19.0    │
│ 2   │ 2017-01-01 │ 21.0    │ 63.0    │
...

I want to fit a linear approximation to each column, with its own slope and intercept:

Y1 = a1*Time + b1 and Y2 = a2*Time + b2

How can I do this?
Thank you for your answers.

#2

Something like this?

using DataFrames, GLM

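# Two noisy series whose spread grows with the index i
y₁ = [rand()*0.5i for i ∈ 1:1_000]
y₂ = [rand()*1.5i for i ∈ 1:1_000]

df = DataFrame(t = 1:1_000, y₁ = y₁, y₂ = y₂)

# Fit each series against t separately; each model has its own intercept and slope
lm(@formula(y₁ ~ t), df)
lm(@formula(y₂ ~ t), df)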
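
To read a and b off a fitted model, GLM's coef returns the coefficients with the intercept first and the slope second. A minimal sketch on made-up data (the names t, y, and model are only for illustration, not from the original question):

```julia
using DataFrames, GLM

# Made-up data: y = 3t + 1 plus a small deterministic wiggle (illustration only)
t = collect(1.0:20.0)
df = DataFrame(t = t, y = 3.0 .* t .+ 1.0 .+ 0.1 .* sin.(t))

model = lm(@formula(y ~ t), df)

# coef returns [intercept, slope], i.e. b and a in y = a*t + b
b, a = coef(model)
println("a = ", a, ", b = ", b)
```

The fitted a and b should land close to 3 and 1 here, since the wiggle is small compared with the trend.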
#3

Hello @nilshg
Thank you for your response and example.

I tried

julia> lm(@formula(Y1 ~ Time), data2)

and got this error:

Error showing value of type StatsModels.DataFrameRegressionModel{LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}:
ERROR: ArgumentError: FDist: the condition ν1 > zero(ν1) && ν2 > zero(ν2) is not satisfied.
#4

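I think you just need to convert your Date column to a numeric variable, e.g. data2.Time2 = datetime2rata.(data2.Time) (after using Dates), and then fit lm(@formula(Y1 ~ Time2), data2). By default, non-numeric variables are treated as categorical, so every date becomes its own level. With 26 rows and 26 distinct dates that most likely leaves no residual degrees of freedom, which is what the FDist error is complaining about.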
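
Here is a rough sketch of that conversion end to end. The data2 built below is a hypothetical stand-in with made-up values, not the original data, and it assumes the Time column has no missing entries (if it does, something like dropmissing(data2, :Time) would probably be needed first). Subtracting the first day is optional; it just makes the intercept the value at the first date instead of at Rata Die day 0.

```julia
using Dates, DataFrames, GLM

# Hypothetical stand-in for data2: monthly dates with made-up values
dates = collect(Date(2016, 12, 1):Month(1):Date(2018, 1, 1))
n = length(dates)
elapsed = datetime2rata.(dates) .- datetime2rata(dates[1])   # days since the first date
wiggle = 0.5 .* (-1) .^ (1:n)                                 # small deterministic noise
data2 = DataFrame(Time = dates,
                  Y1 = 2.0 .* elapsed .+ 5.0 .+ wiggle,
                  Y2 = -1.0 .* elapsed .+ 40.0 .+ wiggle)

# Numeric time axis: Rata Die day numbers, shifted so the first date is day 0
data2.Time2 = datetime2rata.(data2.Time) .- datetime2rata(data2.Time[1])

# One model per column; Time2 is numeric, so it is treated as continuous
m1 = lm(@formula(Y1 ~ Time2), data2)
m2 = lm(@formula(Y2 ~ Time2), data2)

b1, a1 = coef(m1)   # intercept, slope (slope is per day)
b2, a2 = coef(m2)
```

Note that the slopes come out in units of Y per day, because Time2 counts days.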