MethodError: no method matching DMatrix

Hi All,

After fixing my one-hot encoding issue, I’ve split my data into train and test sets, but I am struggling to get XGBoost’s DMatrix to run. All of my variables are numeric except my label, which is categorical; I also tested the label as numeric to see whether that was the issue and got the same error. (At least in R, the label needed to be categorical when making a DMatrix.)

using DataFrames, Random  # nrow and Not come from DataFrames, shuffle from Random

# train and test set
function partitionTrainTest(data, at = 0.7)
    n = nrow(data)
    idx = shuffle(1:n)
    train_idx = view(idx, 1:floor(Int, at*n))
    test_idx = view(idx, (floor(Int, at*n)+1):n)
    data[train_idx,:], data[test_idx,:]
end

train,test = partitionTrainTest(df, 0.7) # 70% train

train_X = train[:, Not(1)]
train_Y = train[:, 1]
test_X = test[:, Not(1)]
test_Y = test[:, 1]
#train_Y = DataFrame(label = train_Y)
#test_Y = DataFrame(label = test_Y)

dtrain = DMatrix(train_X , label = train_Y)

I’ve tried passing the label as both a DataFrame and an array; neither works, and both give the same error.

dtrain = DMatrix(train_X, label = train_Y)
ERROR: MethodError: no method matching DMatrix(::DataFrame; label=CategoricalArrays.CategoricalValue{Float64, UInt32}[0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0])

I assume I need to somehow turn the dataframe into a matrix but I am not sure exactly how…

Here I figured out how to turn this into a matrix but I still end up with an error:

train_X = Matrix{Union{Missing, Float32}}(train[:, Not(1)])
train_Y = convert(Array{Float32}, train[:, :Business_Type])

test_X = Matrix{Union{Missing, Float32}}(test[:, Not(1)])
test_Y = convert(Array{Float32}, test[:, :Business_Type])

# DMatrix
dtrain = DMatrix(train_X, label = train_Y)

And here is the error:

ERROR: MethodError: no method matching DMatrix(::Matrix{Union{Missing, Float64}}; label=Float32[1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0  …  0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0])
Closest candidates are:
  DMatrix(::Matrix{var"#s56"} where var"#s56"<:Real)

I’ve also tried both Float32 and Float64, and neither works (same error)…

train_X = Array{Union{Missing, Float64}}(train[:, Not(1)]) #448193×34 Matrix{Union{Missing, Float64}}
train_Y = Array{Union{Missing, Float64}}(train[:, :Business_Type]) #448193-element Vector{Union{Missing, Float64}}

Error:

ERROR: MethodError: no method matching DMatrix(::Matrix{Union{Missing, Float64}}; label=Union{Missing, Float64}[0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0  …  0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0])
Closest candidates are:
  DMatrix(::Matrix{var"#s56"} where var"#s56"<:Real)

Finally solved it by digging into the DMatrix source!

   function DMatrix(data::Matrix{<:Real}, transposed::Bool = false, missing = NaN32;
                              kwargs...)
        handle = nothing
        if !transposed
            handle = XGDMatrixCreateFromMat(convert(Matrix{Float32}, data),
                                            convert(Float32, missing))
        else
            handle = XGDMatrixCreateFromMatT(convert(Matrix{Float32}, data),
                                             convert(Float32, missing))
        end

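The key part of the underlying function is that it expects a Matrix{<:Real}, that is, a matrix whose element type is a subtype of Real (which a Union{Missing, Float64} element type is not), rather than specifically Float32. That is not what I’ve seen in the few examples online, such as: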

testArray = convert(Array{Float32}, test[:, [:annual_usage, :bracket_pricing, :min_order_quantity, :quantity, :year, :month, :day, :dayWeek, :supplier]])

That snippet is from the Starter Code, and it doesn’t work any longer!
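
For anyone hitting the same MethodError, here is a minimal sketch of one way to get a matrix that matches the Matrix{<:Real} signature quoted above. It uses only Base Julia; X_raw and y_raw are made-up stand-ins for the converted DataFrame columns, and the final DMatrix call is left commented out because it assumes the same XGBoost.jl version as the constructor shown in this thread.

```julia
# Stand-ins for Matrix(train[:, Not(1)]) and the label column when the
# columns allow missing values (hypothetical data, not from the post)
X_raw = Union{Missing, Float64}[1.0 2.0; missing 4.0; 5.0 6.0]
y_raw = Union{Missing, Float64}[0.0, 1.0, 0.0]

# The constructor dispatches on Matrix{<:Real}; an element type that still
# admits Missing is not a subtype of Real, hence the MethodError
eltype(X_raw) <: Real   # false
Float32 <: Real         # true

# Encode missing entries as NaN32 (the default `missing` marker in the
# constructor quoted above) and narrow everything else to Float32
X = map(x -> ismissing(x) ? NaN32 : Float32(x), X_raw)

# This sketch assumes the labels contain no missing values;
# Float32.(...) throws if any do
y = Float32.(y_raw)

# With XGBoost loaded, the call from the post should then dispatch:
# dtrain = DMatrix(X, label = y)
```

The point of the explicit conversion is that the element type, not just the values, decides whether the method matches: a matrix that happens to contain no missing values but is still typed Union{Missing, Float64} is rejected all the same.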