MLJ for XGBoost - extracting feature gain

I have used MLJ with XGBoost in the past, and was able to explore the feature importances as follows:

using MLJ, DataFrames
XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost

X = DataFrame(rand((0, 1), (200, 5)), :auto)
y = rand(200)  # regressor expects a Continuous target

model = XGBoostRegressor()
xgb = machine(model, X, y) |> fit!
y_hat = MLJ.predict(xgb, X)

f = fitted_params(xgb)
r = report(xgb)

# the report had all the following fields I could access:
begin
    gains = [i.gain for i in r[1]]
    covers = [i.cover for i in r[1]]
    freqs = [i.freq for i in r[1]]
    feats = [i.fname for i in r[1]]
end

However, this no longer works, because the feature-importance data is no longer part of report(mach):

julia> fieldnames(typeof(r))
(:features,)

How can I now access the gain, cover and frequency of the different features?

PS: This is straightforward using the original XGBoost package, but I like using MLJ:

using XGBoost
b = xgboost((X, y))
c = importancereport(b)

@Ivan Thanks for reporting this.

In the last breaking release of MLJXGBoostInterface those particular access points were indeed removed. However, MLJ now has a generic feature_importances accessor function you can call on machines wrapping supported models, and the MLJXGBoostInterface models are now among them.

Unfortunately, I just discovered a minor bug, so only the classifier currently works. Here’s the workflow in that case:

using MLJ
XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost
X, y = @load_iris

model = XGBoostClassifier()
mach = machine(model, X, y) |> fit!

julia> feature_importances(mach)
4-element Vector{Pair{Symbol, Float32}}:
 :petal_length => 2.991818
  :petal_width => 1.3149351
  :sepal_width => 0.072732545
 :sepal_length => 0.042442977
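
The accessor returns a plain Vector{Pair{Symbol, Float32}}, so no MLJ machinery is needed to work with it afterwards. A minimal sketch (the values below are stand-ins copied from the output above, not freshly computed):

```julia
# feature_importances(mach) returns Symbol => Float32 pairs,
# sorted from most to least important.
importances = [:petal_length => 2.991818f0,
               :petal_width  => 1.3149351f0,
               :sepal_width  => 0.072732545f0,
               :sepal_length => 0.042442977f0]

lookup = Dict(importances)        # per-feature lookup table
top_feature = first(importances)  # highest-importance pair
```

From here you can do lookup[:petal_width], or pass the pairs straight to a plotting recipe.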

@Ivan When this is merged (within the next hour), you can update MLJXGBoostInterface to pick up the bug fix.
