I have used MLJ for XGBoost in the past, and was able to explore feature importances as follows:
using MLJ, DataFrames
@load XGBoostRegressor pkg=XGBoost
X = DataFrame(rand(200, 5), :auto)  # continuous features
y = rand(200)  # continuous target, as required by a regressor
model = XGBoostRegressor()
xgb = machine(model, X, y) |> fit!
y_hat = MLJ.predict(xgb, X)
f = fitted_params(xgb)
r = report(xgb)
# the report had all the following fields I could access:
begin
gains = [i.gain for i in r[1]]
covers = [i.cover for i in r[1]]
freqs = [i.freq for i in r[1]]
feats = [i.fname for i in r[1]]
end
However, this no longer works because it’s no longer part of report(mach):
fieldnames(typeof(r))
> (:features,)
How can I now access the gain, cover and frequency of the different features?
PS: This is straightforward using the original XGBoost package, but I like using MLJ:
using XGBoost
b = xgboost((X, y))
c = importancereport(b)
In the last breaking release of MLJXGBoostInterface those particular access points were indeed removed. However, MLJ now has a generic `feature_importances` accessor function you can call on machines wrapping supported models, and the MLJXGBoostInterface models are now supported.
Unfortunately, I just discovered a minor bug, so at the moment only the classifier works. Here’s the workflow in that case:
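A minimal sketch of that classifier workflow, assuming current MLJ and MLJXGBoostInterface releases (the `feature_importances` accessor is part of the MLJ API; the data setup just mirrors the MWE above):

```julia
using MLJ, DataFrames

# Load the classifier; @load returns the model type:
XGBC = @load XGBoostClassifier pkg=XGBoost verbosity=0

X = DataFrame(rand(200, 5), :auto)             # continuous features
y = coerce(rand(["a", "b"], 200), Multiclass)  # finite (categorical) target

mach = machine(XGBC(), X, y) |> fit!

# Returns a vector of `feature => importance` pairs, one per feature:
fi = feature_importances(mach)
```

Note that the classifier needs a `Multiclass` (or `OrderedFactor`) target, hence the `coerce` call.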
Sorry for reopening this… It seems like the model XGBoostClassifier is not supported anymore by `feature_importances` (calling `feature_importances(mach)` on the MWE from the solution of this post returns `nothing`). What is the current recommended way of accessing this?
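For reference, one way to check whether a given model actually advertises importances is the `reports_feature_importances` model trait (assuming that trait name, which current MLJ exports):

```julia
using MLJ

XGBC = @load XGBoostClassifier pkg=XGBoost verbosity=0

# `true` means fitted machines of this model should support `feature_importances`:
reports_feature_importances(XGBC())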
Also, how do I keep myself updated on these changes? I feel like I’m missing something obvious here!