Hi, I’m trying out MLJ to do decision tree regression where the inputs are categorical, with levels that are either strings or ints. Below is the schema of the input (origin and destination are H3 hashes).
┌─────────────┬───────────────────┬────────────────────────────────────┐
│ names       │ scitypes          │ types                              │
├─────────────┼───────────────────┼────────────────────────────────────┤
│ dow         │ OrderedFactor{7}  │ CategoricalValue{Int64, UInt32}    │
│ start_month │ OrderedFactor{2}  │ CategoricalValue{Int64, UInt32}    │
│ start_hour  │ OrderedFactor{24} │ CategoricalValue{Int64, UInt32}    │
│ origin      │ Multiclass{8}     │ CategoricalValue{String15, UInt32} │
│ destination │ Multiclass{7}     │ CategoricalValue{String15, UInt32} │
└─────────────┴───────────────────┴────────────────────────────────────┘
But when I run `evaluate(DecisionTreeRegressor(), X, y)` on it, I get the following error:
┌ Error: Problem fitting the machine machine(DecisionTreeRegressor(max_depth = 0, …), …).
└ @ MLJBase C:\Users\test\.julia\packages\MLJBase\fEiP2\src\machines.jl:682
[ Info: Running type checks...
[ Info: Type checks okay.
MethodError: Cannot `convert` an object of type
CategoricalArrays.CategoricalValue{Int64,UInt32{}} to an object of type
CategoricalArrays.CategoricalValue{Union{Int64, String15},UInt32{}}
Closest candidates are:
convert(::Type{T}, ::T) where T
@ Base Base.jl:64
(::Type{T})(::T) where T<:CategoricalArrays.CategoricalValue
@ CategoricalArrays C:\Users\test\.julia\packages\CategoricalArrays\0yLZN\src\value.jl:95
(::Type{CategoricalArrays.CategoricalValue{T, R}} where {T<:Union{AbstractChar, AbstractString, Number}, R<:Integer})(::Any, ::Any)
@ CategoricalArrays C:\Users\test\.julia\packages\CategoricalArrays\0yLZN\src\typedefs.jl:80
It seems that because I have columns whose levels are strings and columns whose levels are Ints, MLJ fuses them into `Union{String,Int64}`, but for some reason `CategoricalArrays.CategoricalValue{Int64, ...}` cannot be converted to `CategoricalArrays.CategoricalValue{Union{Int64,String}, ...}`. In fact, I reduced this to a minimal example and confirmed the behavior:
julia> convert(CategoricalArrays.CategoricalValue{Union{String,Int64}, UInt32}, X.dow[1])
ERROR: MethodError: Cannot `convert` an object of type
CategoricalArrays.CategoricalValue{Int64,UInt32{}} to an object of type
CategoricalArrays.CategoricalValue{Union{Int64, String},UInt32{}}
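For anyone who wants to try this without my data, here is a rough sketch that uses only CategoricalArrays. The `dow` vector is a made-up stand-in for `X.dow`, and I am assuming the default `UInt32` reference type, so treat it as an approximation of my setup rather than an exact copy:

```julia
using CategoricalArrays

# Hypothetical stand-in for X.dow: an ordered categorical vector with Int levels
dow = categorical([1, 2, 3], ordered=true)
x = dow[1]  # a single categorical value taken from the vector

# Target type taken from the error message: the level type is a Union
T = CategoricalValue{Union{String, Int64}, UInt32}

# Attempt the conversion and capture either the converted value or the exception
outcome = try
    convert(T, x)
catch e
    e
end

println(typeof(x))
println(typeof(outcome))
```

In my session, the `convert` call is the step that throws the `MethodError` shown above.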
Any advice on what I should be doing?
I’m using Julia 1.9.3 and MLJ v0.20.1.