MLJ for XGBoost - extracting feature gain

I have used MLJ for XGBoost in the past, and was able to explore the feature importances doing the following:

using MLJ, DataFrames
@load XGBoostRegressor pkg=XGBoost
X = DataFrame(rand((0, 1), (200, 5)), :auto)
y = rand([true, false], 200)


model = MLJXGBoostInterface.XGBoostRegressor()
xgb = machine(model, X, y) |> fit!
y_hat = MLJ.predict(xgb, X)

f = fitted_params(xgb)
r = report(xgb)

# the report had all the following fields I could access:
begin
    gains = [i.gain for i in r[1]]
    covers = [i.cover for i in r[1]]
    freqs = [i.freq for i in r[1]]
    feats = [i.fname for i in r[1]]
end

However, this no longer works, because those fields are no longer part of report(mach):

fieldnames(typeof(r))
> (:features,)

How can I now access the gain, cover and frequency of the different features?

PS: This is straightforward using the original XGBoost package, but I like using MLJ:

using XGBoost
b = xgboost((X, y))
c = importancereport(b)

@Ivan Thanks for reporting this.

In the last breaking release of MLJXGBoostInterface those particular access points were indeed removed. However, MLJ now has a generic feature_importances accessor function that you can call on machines wrapping supported models, and the MLJXGBoostInterface models are now supported.

Unfortunately, I just discovered a minor bug, so only the classifier currently works. Here’s the workflow in that case:

using MLJ
XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost
X, y = @load_iris

model = XGBoostClassifier()
mach = machine(model, X, y) |> fit!

julia> feature_importances(mach)
4-element Vector{Pair{Symbol, Float32}}:
 :petal_length => 2.991818
  :petal_width => 1.3149351
  :sepal_width => 0.072732545
 :sepal_length => 0.042442977
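
Since the result is a plain vector of feature => importance pairs, it can be post-processed with ordinary Julia. Here is a minimal sketch, assuming you want a ranking plus each feature's share of the total. The numbers are copied from the output above rather than recomputed, and the names imps, ranked and shares are only illustrative:

```julia
# Importances copied from the output above (Float32, as returned there).
imps = [
    :petal_length => 2.991818f0,
    :petal_width  => 1.3149351f0,
    :sepal_width  => 0.072732545f0,
    :sepal_length => 0.042442977f0,
]

# Sort descending by importance; the order of the returned pairs is not
# assumed here, so sort explicitly.
ranked = sort(imps; by = last, rev = true)

# Normalize so that the shares sum to 1.
total = sum(last, ranked)
shares = [feat => imp / total for (feat, imp) in ranked]

# Print one feature per line with its rounded share.
for (feat, share) in shares
    println(rpad(string(feat), 14), round(share; digits = 3))
end
```

With a fitted machine you would replace the literal vector by imps = feature_importances(mach). This only covers whichever importance measure the interface reports, not the separate gain, cover and frequency columns from the original question.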

@Ivan When this is merged (next hour) then you can update MLJXGBoostInterface to correct the bug.

Sorry for reopening this… It seems the XGBoostClassifier model is no longer supported by feature_importances: calling feature_importances(mach) on the MWE from the solution in this post returns nothing. What is the currently recommended way of accessing this?

Also, how do I keep myself updated on these changes? I feel like I’m missing something obvious here!

Mmm. That’s strange. The code in my previous comment is working fine for me with the latest versions:

(jl_62DF8o) pkg> st
Status `/private/var/folders/4n/gvbmlhdc8xj973001s6vdyw00000gq/T/jl_62DF8o/Project.toml`
  [a7f614a8] MLJBase v1.7.0
  [54119dfa] MLJXGBoostInterface v0.3.11

Can I suggest you open an issue at MLJXGBoostInterface.jl, with a minimal working example and the output of using Pkg; Pkg.status()? Thanks.
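
For the issue report, something like the following could collect the relevant version information in one go. This is only a sketch using the Pkg standard library. The helper installed_versions is a hypothetical name introduced here, and the two package names are simply the ones shown in the status output above:

```julia
using Pkg

# Hypothetical helper: map package name => version for the requested
# packages found in the active environment (direct or indirect deps).
function installed_versions(names)
    found = Dict{String, Any}()
    for info in values(Pkg.dependencies())
        if info.name in names
            found[info.name] = info.version
        end
    end
    return found
end

wanted = ["MLJBase", "MLJXGBoostInterface"]
versions = installed_versions(wanted)

# Print each requested package, flagging any that are missing.
for name in wanted
    println(name, " => ", get(versions, name, "not in this environment"))
end
```

Comparing these against the versions listed above should show whether an outdated package is involved. As far as I know, feature_importances(mach) returns nothing when the wrapped model does not declare support, which, if I recall the trait name correctly, can be checked with reports_feature_importances(model).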