Anyone has an implementation of generic boosting?

Tomas_Pevny · February 6, 2021, 11:37am

Dear All,

I would like to ask, as the title suggest, if anyone has a general implementation of boosting algorithm? I did some search and found that it is usually tightly coupled with a base learners being decision trees. But Boosting is a general meta-algorithm which assumes that the underlying base learner can fit possibly weighted samples and perform prediction on them.

I have recently started to wonder, why general people believes that NNs sucks on tabular datasets and Boosted decision trees shines. I came to conclusion that boosting might be the culprit, since single tree sucks as well. Since I would like to know, if I am right or wrong, I would like to test (and also would like to be right).

Due to my lack of time (you can also read it laziness), I would like to ideally hook already existing implementation (my six years old implementation is in matlab).

Well, thanks for answers and opinions on the matter of learning tabular data.

Tomas

tlienart · February 6, 2021, 11:55am

There’s two unmaintained libs that might provide you with a decent starting point:

GitHub - svs14/GradientBoost.jl: Gradient boosting framework for Julia.
GitHub - rakeshvar/AnyBoost.jl: A julia based machine learning package for boosting any loss, activation and constraint.

I don’t have experience with either so ymmv

PS: I think your assertion that “NNs suck on tabular datasets” is not up to date. Packages like e.g. AutoGluonTabular seem to suggest otherwise (though it blends NNs with other things).

Tomas_Pevny · February 6, 2021, 12:09pm

Thanks for links and correction of my knowlege. I was hoping that someone will point me to updated state of the art.

Tomas_Pevny · February 6, 2021, 1:24pm

So I read the AutoGluonTabular, and it is not a model based purely on Neural Networks, but they use whatever model scikit learn offers, and an ensembling strategy seems to be a very important part of the solution.

Thanks a lot @tlienart for pointing me to this direction (more pointers are welcomed).

Tomas_Pevny · February 6, 2021, 6:20pm

I have fixed the GradientBoost, such that tests (almost) pass on 1.6. The only trouble is clashing of fit! and predict which I do not know, where they are defined.
The fixed library is here
https://github.com/pevnak/GradientBoost.jl

I will try to contact the owner.

tlienart · May 17, 2022, 12:48pm

I cloned your repo at d7fe4df to see if I could help with fit and predict but when trying to run the tests, most do not pass (with errors like Util or GBBaseLearner or ML not defined). Are you working on a separate branch?

Summary

fwiw I’m on 1.7 but I doubt that changes much here.

(GradientBoost) pkg> status
     Project GradientBoost v0.1.0
      Status `~/Desktop/tjd/GradientBoost.jl/Project.toml`
  [864edb3b] DataStructures v0.18.12
  [38e38edf] GLM v1.7.0
  [7f8f8fb0] LearnBase v0.4.1
  [30fc2ffe] LossFunctions v0.7.2
  [9920b226] MLDataPattern v0.5.5
  [872c559c] NNlib v0.8.5
  [429524aa] Optim v1.7.0
  [f2b01f46] Roots v2.0.1
  [a759f4b9] TimerOutputs v0.5.19
  [9a3f8284] Random
  [10745b16] Statistics

misha_mikhasenko · December 16, 2024, 11:32am

Hi guys, @tlienart, @Tomas_Pevny.
How far did you go with resurrection of the GBDT?

The DecisionTree.jl is great, but I’m so much surprised no finding general implementation of the boosted trees in Julia

Tomas_Pevny · December 16, 2024, 1:34pm

I sort of hacked the original implementation for my purposes, but always got tired. But to be honest, I gut it out, because I do not care about using GBDT, I wanted to Boost neural networks.

Topic		Replies	Views
AnyBoost.jl Package Announcements machine-learning	1	819	June 27, 2019
New machine learning package, Julia implementation of XGBoost Machine Learning	32	9624	March 10, 2020
Boosted trees implementation feedback Machine Learning	0	427	June 26, 2019
[ANN] EvoLinear.jl for Linear Boosting Package Announcements machine-learning	7	723	September 20, 2022
Flux or Knet? General Usage question	7	2331	June 3, 2021

Anyone has an implementation of generic boosting?

Related topics