GLM.jl DataFrame not defined error

I’m getting a DataFrame not defined error when attempting to use the interval keyword argument in the predict function from GLM.jl.


using DataFrames, GLM

data = DataFrame(X=[1,2,3], Y=[2,4,7])
ols = lm(@formula(Y ~ X), data)
new_data = DataFrame(X=[5])

# Without the interval kwarg

julia> predict(ols, new_data)
1-element Array{Union{Missing, Float64},1}:

# With the interval kwarg

julia> predict(ols, new_data, interval=:prediction)
ERROR: UndefVarError: DataFrame not defined
 [1] _return_predictions(::NamedTuple{(:prediction, :lower, :upper),Tuple{Array{Float64,1},Array{Float64,2},Array{Float64,2}}}, ::BitArray{1}, ::Int64) at C:\Users\mthel\.julia\packages\StatsModels\Kz7By\src\statsmodel.jl:153
 [2] #predict#74(::Base.Iterators.Pairs{Symbol,Symbol,Tuple{Symbol},NamedTuple{(:interval,),Tuple{Symbol}}}, ::typeof(predict), ::StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}, ::DataFrame) at C:\Users\mthel\.julia\packages\StatsModels\Kz7By\src\statsmodel.jl:166
 [3] (::getfield(StatsBase, Symbol("#kw##predict")))(::NamedTuple{(:interval,),Tuple{Symbol}}, ::typeof(predict), ::StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}, ::DataFrame) at .\none:0
 [4] top-level scope at REPL[11]:1

Am I missing something or should I open an issue on GitHub?

Looks like a bug in StatsModels.jl to me. StatsModels never imports DataFrames; its Project.toml lists DataFrames only as an extra testing dependency, so the name DataFrame is undefined inside the package even though you loaded DataFrames in your own session. A simple fix would be to add using DataFrames or import DataFrames.DataFrame to StatsModels.jl or statsmodel.jl and change the Project.toml accordingly. I don’t know enough about the intention behind the code to know whether that would be the best fix.
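To illustrate why loading DataFrames in the REPL doesn't help, here is a minimal sketch of the scoping rule involved. It doesn't use the real packages: MiniFrames, Frame, Broken and Fixed are hypothetical stand-ins I made up for DataFrames, DataFrame, the unfixed StatsModels and a version with the import added.

```julia
# Hypothetical stand-in for DataFrames: a module that exports a table type.
module MiniFrames
export Frame
struct Frame
    columns::Vector{Vector{Float64}}
end
end

# Stand-in for the unfixed package: it uses Frame but never imports it.
module Broken
wrap(columns) = Frame(columns)
end

# Stand-in for the suggested fix: import the name inside the module.
module Fixed
using ..MiniFrames: Frame
wrap(columns) = Frame(columns)
end

# Loading the module at top level, as `using DataFrames` does in the REPL,
# does not make the name visible inside Broken.
using .MiniFrames

# Capture the error instead of letting it abort the script.
broken_result = try
    Broken.wrap([[1.0, 2.0]])
catch err
    err
end

fixed_result = Fixed.wrap([[1.0, 2.0]])

println(broken_result isa UndefVarError)
println(fixed_result isa Frame)
```

Each module resolves names in its own namespace, so the call only fails when the code path that mentions the missing name actually runs. That would match what you saw: predict without interval works, and only the interval branch hits the undefined name.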

Yes, this was a bug. It has been fixed, and the fix was released in version 0.6.6.
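If you want to pick up the fix, something like the following should do it. This is a sketch that assumes the 0.6.6 release mentioned above is a StatsModels release (the package where the bug was located) and that nothing else in your environment holds the package back.

```julia
using Pkg

# Assumption: the fixed release (0.6.6) is a StatsModels release.
Pkg.update("StatsModels")   # move to the newest version the environment allows
Pkg.status("StatsModels")   # print the installed version to confirm it is at least 0.6.6
```

After restarting Julia, the predict call with interval=:prediction from the original post should be worth retrying.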