I’m trying to use a stacked model to predict an outcome, but I’m running into an unusual performance issue. I’m not sure whether it’s caused by a mistake in my own code or whether it’s simply to be expected, but fitting seems to get stuck on the Stack model itself, which seems odd to me. I would expect most of the computation to be spent fitting the tree and spline regressors; instead, it appears to stall while fitting the relatively simple linear regressor that stacks the models together:
tree = EvoTreeRegressor(nrounds=128, nbins=128, eta=0.02)
spline = EvoSplineRegressor(nrounds=32, eta=0.08, L2=.025)
knnr = KNNRegressor(K=6, weights=ReciprocalRank())
cont_kwargs = (
    lower=exp(0), upper=exp(.5), scale=:log
)
ranges = [
    mlj.range(tree, :max_depth; values=3:9),
    mlj.range(tree, :lambda; cont_kwargs...),
]
# Tune tree hyperparameters
tree = mljt.TunedModel(
    model=tree,
    range=ranges,
    tuning=mljt.Grid(),
    resampling=mlj.CV(nfolds=12)
)
@views X_obs = data[:, Not([:Name, :LogWeight])]
# Preprocessing pipeline: encode features as continuous, then standardize
clean = mlj.ContinuousEncoder(drop_last=true) |>
    mlj.Standardizer()
# clean = mlj.machine(clean, X_obs) |> mlj.fit!
# stck = mlj.transform(clean, X_obs)
stck = clean |> mlj.Stack(; metalearner=LinearRegressor(), tree, knnr, spline)
mach = mlj.machine(stck, X_obs, data.LogWeight) |> mlj.fit!