Parallel Random Forest

I also had some trouble compiling XGBoost on the PC but finally got it working with the latest version of Julia (1.0.1). Below is a link explaining how to use XGBoost with Julia on a PC. The steps should differ only slightly on Linux.

Thanks @microgold. I am able to run it now.

Bernhard,

I am considering using Parallel Random Forest in Julia on Amazon Web Services for research purposes. Through the use of its macros, can Julia send out tasks to multiple nodes, each with a different number of cores, and sidestep Python altogether?

Thanks,

Paul H.

Random forests, as an ensemble method, are easier to parallelize than non-ensemble methods, because the final prediction is a simple aggregation (a sum or average) over independently trained trees; this is the "bagging", or bootstrap aggregation, step. Each prediction the randomForests_predict model provides is an aggregate over the trees fitted on the bagged samples, so combining aggregates that were computed separately (nested aggregation) should be fine, provided the implementation is in line with the theory. E.g. you could do this as a map-reduce, where each worker trains on the full data or on a sample of the rows drawn with replacement. So parallelizing on the "outside" should be possible.
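To make the map-reduce idea concrete, here is a minimal sketch in pure Julia using only the standard library (Distributed, Random, Statistics), so no Python is involved. It is not taken from any random forest package: `fit_stump`, `predict_stump`, `fit_forest` and `predict_forest` are hypothetical names, and the one-split "stump" is only a stand-in for a real tree learner. The map step fits one stump per task on a bootstrap sample of the rows; the reduce step averages the per-stump predictions.

```julia
using Distributed

# Local workers for this sketch. On a cluster, addprocs also accepts
# (host, count) tuples such as addprocs([("node1", 4), ("node2", 16)]),
# so each node can contribute a different number of workers.
# The host names here are placeholders.
nprocs() == 1 && addprocs(2)

@everywhere begin
    using Random, Statistics

    # Hypothetical stand-in for a tree learner: a one-split regression
    # stump fitted on a bootstrap sample of the rows.
    function fit_stump(X, y, seed)
        rng = MersenneTwister(seed)
        n = length(y)
        rows = rand(rng, 1:n, n)        # sample rows with replacement
        j = rand(rng, 1:size(X, 2))     # pick one feature at random
        xs, ys = X[rows, j], y[rows]
        t = median(xs)                  # split point
        left = xs .<= t
        fallback = mean(ys)
        l = any(left) ? mean(ys[left]) : fallback
        r = all(left) ? fallback : mean(ys[.!left])
        return (feature = j, threshold = t, left = l, right = r)
    end

    predict_stump(s, X) =
        [x <= s.threshold ? s.left : s.right for x in X[:, s.feature]]
end

# Map: each task fits one stump on some worker; pmap collects the results.
fit_forest(X, y; n_trees = 20) =
    pmap(seed -> fit_stump(X, y, seed), 1:n_trees)

# Reduce: average the per-stump predictions.
predict_forest(forest, X) =
    sum(predict_stump(s, X) for s in forest) ./ length(forest)

# Toy data, made up for illustration only.
X = rand(MersenneTwister(1), 200, 3)
y = 2 .* X[:, 1] .+ 0.1 .* randn(MersenneTwister(2), 200)

forest = fit_forest(X, y)
yhat = predict_forest(forest, X)
```

Because the forest is just a collection of independently fitted learners, forests built on different nodes can be concatenated with `vcat` before predicting, which is the nested aggregation mentioned above. With a real package you would swap the stump for that package's tree-building call.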
