How do I tune a pipeline in MLJ?

ablaom · March 4, 2024, 8:07pm

A Slack user has asked the question in the title. More generally, how does one tune hyperparameters that are nested in a composed model?

One obtains composed models, for example, when applying Stack, EnsembleModel, IteratedModel, BinaryThresholdPredictor, BalancedModel, TunedModel and other model “wrappers”.

ablaom · March 4, 2024, 8:08pm

Here’s an example addressing the question:

using Pkg  # needed for the `Pkg.` calls below
Pkg.activate(temp=true)
Pkg.add(["MLJ", "MLJXGBoostInterface"])
using MLJ

X, y = @load_reduced_ames;

# notice `X` has mixed feature types:
schema(X)

# vertically split data:
(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, multi=true)

XGBoostRegressor = @load XGBoostRegressor
pipe = ContinuousEncoder() |> XGBoostRegressor()

propertynames(pipe)
# (:continuous_encoder, :xg_boost_regressor, :cache)

# range for a nested hyperparameter:
r = range(pipe, :(xg_boost_regressor.max_depth), lower=3, upper=10)

# self-tuning pipeline:
tmodel = TunedModel(
    pipe,
    resampling=CV(nfolds=5),
    tuning=Grid(resolution=10),
    measure = l2,
    range=r,
)

# training:
mach = machine(tmodel, Xtrain, ytrain) |> fit!

# inspect optimal parameter:
best_pipe = report(mach).best_model
best_pipe.xg_boost_regressor.max_depth
# 5

# predict:
predict(mach, Xtest)
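
The same nested-name syntax should extend to more than one hyperparameter. Below is a minimal sketch, not part of the original answer, that tunes `max_depth` and `eta` together using `RandomSearch` instead of `Grid`. The bounds for `eta`, the value `n=20`, the `rng=123` seed, and the variable names `r_depth`, `r_eta`, `best` and `yhat` are illustrative assumptions, and it assumes MLJ and MLJXGBoostInterface are installed as above.

```julia
using MLJ  # assumes the environment set up above

X, y = @load_reduced_ames
(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, multi=true)

XGBoostRegressor = @load XGBoostRegressor
pipe = ContinuousEncoder() |> XGBoostRegressor()

# one range per nested hyperparameter (the bounds are illustrative guesses):
r_depth = range(pipe, :(xg_boost_regressor.max_depth), lower=3, upper=10)
r_eta = range(pipe, :(xg_boost_regressor.eta), lower=0.01, upper=0.3, scale=:log)

# self-tuning pipeline, sampling 20 random points from the two ranges:
tmodel = TunedModel(
    pipe,
    resampling=CV(nfolds=5),
    tuning=RandomSearch(rng=123),
    n=20,
    measure=l2,
    range=[r_depth, r_eta],
)

mach = machine(tmodel, Xtrain, ytrain) |> fit!

# inspect both optimal values:
best = report(mach).best_model
@show best.xg_boost_regressor.max_depth best.xg_boost_regressor.eta

# out-of-sample loss on the holdout set:
yhat = predict(mach, Xtest)
@show l2(yhat, ytest)
```

Random search is used here only because a full grid over two parameters grows quickly. Passing `tuning=Grid(resolution=10)` with the same vector of ranges should also work.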
