Hello!
I built a simple XGBoost model with the pipeline below, fitted it, saved it, and then tried to restore it. After restoring it (even in the same notebook), I get an error when I call predict. Any idea how to solve this?
XGBC = @load XGBoostClassifier
xgb = XGBC()
ohe = OneHotEncoder()
# Pipeline OneHotEncoder > XGBoost
xgb_pipe = ohe |> xgb
# Setting Target and Features tables:
y, X = unpack(df, ==(:y_label), col->true)
train, test = partition(1:length(y), 0.1, shuffle=true)
xgbm = machine(xgb_pipe, X, y, cache=false)
fit!(xgbm, rows=train, verbosity=0)
MLJ.save("mach_xgb_pipe.jls", xgbm)
# Restoring the machine and using it for predictions:
mach_restored = machine("mach_xgb_pipe.jls")
yhat = predict_mode(mach_restored, X[test, :])
Error message:
Error: Failed to apply the operation `predict` to the machine machine(:xg_boost_classifier, …), which receives it's data arguments from one or more nodes in a learning network. Possibly, one of these nodes is delivering data that is incompatible with the machine's model.
│ Model (xg_boost_classifier):
│ input_scitype = Unknown
│ target_scitype =Unknown
│ output_scitype =Unknown
│
│ Incoming data:
│ arg of predict scitype
│ -------------------------------------------
│ Node @818 → :one_hot_encoder Table{AbstractVector{Continuous}}
│
│ Learning network sources:
│ source scitype
│ -------------------------------------------
│ Source @791 Table{Union{AbstractVector{Continuous}, AbstractVector{Multiclass{10}}, AbstractVector{Multiclass{2}}, AbstractVector{Multiclass{89}}, AbstractVector{Multiclass{6}}}}
│ Source @496 AbstractVector{OrderedFactor{2}}
└ @ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\composition\learning_networks\nodes.jl:153
XGBoostError: (caller: XGBoosterPredictFromDMatrix)
[14:39:55] /workspace/srcdir/xgboost/src/c_api/c_api.cc:1059: Booster has not been initialized or has already been disposed.
Stacktrace:
[1] _apply(y_plus::Tuple{Node{Machine{Symbol, true}}, Machine{Symbol, true}}, input::DataFrame; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\composition\learning_networks\nodes.jl:159
[2] _apply
@ C:\Users\User\.julia\packages\MLJBase\mIaqI\src\composition\learning_networks\nodes.jl:144 [inlined]
[3] (::Node{Machine{Symbol, true}})(Xnew::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\composition\learning_networks\nodes.jl:140
[4] output_and_report(signature::MLJBase.Signature{NamedTuple{(:predict, :transform), Tuple{Node{Machine{Symbol, true}}, Node{Machine{Symbol, true}}}}}, operation::Symbol, Xnew::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\composition\learning_networks\signatures.jl:374
[5] predict(model::MLJBase.ProbabilisticPipeline{NamedTuple{(:one_hot_encoder, :xg_boost_classifier), Tuple{Unsupervised, Probabilistic}}, MLJModelInterface.predict}, fitresult::MLJBase.Signature{NamedTuple{(:predict, :transform), Tuple{Node{Machine{Symbol, true}}, Node{Machine{Symbol, true}}}}}, Xnew::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\operations.jl:191
[6] predict(mach::Machine{MLJBase.ProbabilisticPipeline{NamedTuple{(:one_hot_encoder, :xg_boost_classifier), Tuple{Unsupervised, Probabilistic}}, MLJModelInterface.predict}, false}, Xraw::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\operations.jl:133
[7] predict
@ C:\Users\User\.julia\packages\MLJTuning\drqMP\src\tuned_models.jl:795 [inlined]
[8] predict_mode(m::MLJTuning.ProbabilisticTunedModel{Grid, MLJBase.ProbabilisticPipeline{NamedTuple{(:one_hot_encoder, :xg_boost_classifier), Tuple{Unsupervised, Probabilistic}}, MLJModelInterface.predict}}, fitresult::Machine{MLJBase.ProbabilisticPipeline{NamedTuple{(:one_hot_encoder, :xg_boost_classifier), Tuple{Unsupervised, Probabilistic}}, MLJModelInterface.predict}, false}, Xnew::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\interface\model_api.jl:11
[9] predict_mode(mach::Machine{MLJTuning.ProbabilisticTunedModel{Grid, MLJBase.ProbabilisticPipeline{NamedTuple{(:one_hot_encoder, :xg_boost_classifier), Tuple{Unsupervised, Probabilistic}}, MLJModelInterface.predict}}, false}, Xraw::DataFrame)
@ MLJBase C:\Users\User\.julia\packages\MLJBase\mIaqI\src\operations.jl:133
[10] top-level scope
@ In[113]:5
Side note: the original machine (before saving) works fine when I call predict / predict_mode on it directly.
I also tried serializing with JLSO, without success:
using JLSO
smach = serializable(mach_tuned_xgb_pipe)
JLSO.save("machine_serialized.jlso", :machine => smach)
loaded_mach = JLSO.load("machine_serialized.jlso")[:machine]
restore!(loaded_mach)
yhat = predict_mode(loaded_mach, X[test,:])
Thanks for helping!