[ANN] EvoTrees.jl v0.18 release

The work behind the latest EvoTrees.jl release came mainly from SciML’s small grant project to improve GPU performance: Improve training performance of GPU backend Project Extension by AdityaPandeyCN · Pull Request #174 · SciML/sciml.ai · GitHub

Refactor of GPU training backend

  • Computations are now almost entirely done through KernelAbstractions.jl. The objective is to eventually have full support for AMD / ROCm devices in addition to the currently supported NVIDIA / CUDA ones (a small sketch of the kernel style follows this list).
  • Significant performance increase, notably for larger max depth: training time now grows close to linearly with depth.
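For readers unfamiliar with KernelAbstractions.jl, here is a minimal sketch of what a device-agnostic kernel looks like. This is not EvoTrees’ actual training kernel, just an illustration of the style that lets the same code target CPU, CUDA, and eventually ROCm backends:

```julia
using KernelAbstractions

# Toy device-agnostic kernel (not EvoTrees' actual code): y .+= a .* x
@kernel function axpy_kernel!(y, a, x)
    i = @index(Global)
    @inbounds y[i] += a * x[i]
end

x = rand(Float32, 1024)
y = zeros(Float32, 1024)
backend = get_backend(y)  # CPU() for Arrays; CUDABackend() / ROCBackend() for GPU arrays
axpy_kernel!(backend)(y, 2f0, x; ndrange = length(y))
KernelAbstractions.synchronize(backend)
```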

Breaking change: improved reproducibility

  • Training returns exactly the same fitted model for a given learner (e.g. EvoTreeRegressor).
  • Reproducibility is respected for both CPU and GPU. However, results may differ between CPU and GPU, i.e. reproducibility is guaranteed only within the same device type.
  • The learner / model constructor (e.g. EvoTreeRegressor) now has a seed::Int argument to set the random seed; the legacy rng kwarg is now ignored (see the sketch after this list).
  • The internal random generator is now Xoshiro (previously MersenneTwister with rng::Int).
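As a quick illustration of the new seed kwarg: the dummy data and the fit_evotree / callable-model predictions below follow the usual keyword-style API and are only a sketch, not part of the release notes.

```julia
using EvoTrees

# Dummy data for illustration only
x_train = randn(1_000, 5)
y_train = randn(1_000)

config = EvoTreeRegressor(nrounds = 100, max_depth = 5, seed = 123)

m1 = fit_evotree(config; x_train, y_train)
m2 = fit_evotree(config; x_train, y_train)

# Same learner + same seed + same device type => identical fitted models,
# so predictions match exactly.
@assert m1(x_train) == m2(x_train)
```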

Added node weight information in fitted trees

  • The training weight reaching each of the split/leaf nodes is now stored in the fitted trees. This is accessible via model.trees[i].w for the i-th tree in the fitted model. This is notably intended to support SHAP value computations (see the snippet below).
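For example, once a model has been fitted (as in the sketch above), the node weights can be inspected directly:

```julia
# Assumes `m1` is a fitted model as in the earlier sketch.
tree = m1.trees[1]      # first tree of the ensemble
tree.w                  # training weight reaching each split/leaf node of that tree
```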

Very cool! It looks like EvoTrees now consistently beats XGBoost. Do you know how it compares to CatBoost for speed/OOTB accuracy? My impression is that these days CatBoost is the gold standard for boosted decision trees.

I’m hoping I’ll soon be able to actually follow through on my project that could benefit from EvoTrees. The improved reproducibility will be a help there, and TreeSHAP would be very nice to have :slightly_smiling_face:


I’ve seen conflicting evidence; https://arxiv.org/pdf/2408.14817v1 shows the biggest lead of CatBoost over XGB (Table 4).

But there are also benchmarks that look like they’re about the same:


I’ve maintained some basic tabular benchmarks here: GitHub - Evovest/MLBenchmarks.jl: ML models benchmarks on public dataset
While I’m aware of the good praise for CatBoost, I haven’t seen it outperform on my problems of interest. It can also depend on the extent to which the hyper-params were properly tuned. XGB, LightGBM and CatBoost can all be of interest, though they remain very similar algos.
Note that oblivious trees are supported, but I’ve only seen them underperform compared to the default binary mode.

TreeSHAP timing remains to be seen. An external contributor has been looking at it; we may push to complete that feature in case he isn’t able to finish it.


Interesting. The thing I find most striking about https://arxiv.org/pdf/2506.16791 isn’t actually the CatBoost performance (which is reported favorably), but how close the untuned and tuned performance is:

This does line up with the impression I have of CatBoost doing a particularly good job with defaults.

Interesting. It does seem rather problem-dependent. I notice that in your benchmarks it’s broadly equivalent, with the exception of Boston, where the MSE seems markedly improved for CatBoost.

Interesting, this seems similar to the question I just asked over on GitHub (about ordered boosting).

:heart:
