Hello,
Could anyone give me advice on the best way to parallelize a custom decision tree or forest algorithm on CPU / GPU?
I am currently building my own from scratch to experiment with some of my ideas for multi-class / output scenarios.
The available decision tree packages aren’t exactly aligned with what I want, hence the need to write it from scratch. The code I wrote is not parallelized and uses for-loops to build the tree (since I am not familiar with recursion), and the data is stored in a Julia dictionary, which is causing me problems when I try to parallelize it.
Note: I have only been using Julia for 2 months; before that I had only ever used Python.
Any help would be appreciated.
Thanks
Example Code
mutable struct Tree
    name
    max_depth
    min_samples
    data
end
function build_tree(tree::Tree, X, y)
    for depth in 0:tree.max_depth
        # parallelize creating nodes per depth
        # Choices
        # Threads.@threads
        # Threads.@spawn
        # Distributed.@distributed
        # CUDA.jl ?
        for node in 1:2^depth
            # create nodes
        end
    end
end
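
To make the question more concrete, here is a minimal sketch of the `Threads.@threads` option from the list above. It is only an illustration under some assumptions: the `Node` type, its fields, and the `build_levels` function are hypothetical and not part of my actual code, and the per-node work is a placeholder rather than a real split search. The idea it shows is storing each depth level in a preallocated `Vector` indexed by node number instead of a shared dictionary, so that each thread writes only to its own slot.

```julia
using Base.Threads

# Hypothetical node type for illustration; the fields are assumptions.
struct Node
    depth::Int
    index::Int      # position within its depth level, 1-based
    nsamples::Int   # placeholder for whatever a real node would store
end

# Build all nodes level by level. Each level is a preallocated Vector,
# so every iteration writes to a distinct slot and no lock is needed.
function build_levels(max_depth::Int, nsamples::Int)
    levels = Vector{Vector{Node}}(undef, max_depth + 1)
    for depth in 0:max_depth            # a level depends on the previous one: kept serial
        nodes = Vector{Node}(undef, 2^depth)
        @threads for i in 1:2^depth     # nodes within one level are independent
            # placeholder work: a real tree would search for the best split here
            nodes[i] = Node(depth, i, nsamples ÷ 2^depth)
        end
        levels[depth + 1] = nodes
    end
    return levels
end

levels = build_levels(3, 80)
println(length(levels[end]))   # number of nodes at the deepest level
```

This only uses more than one thread if Julia is started with several, for example `julia --threads=auto`; `Threads.nthreads()` reports how many are available. As far as I understand, concurrent writes to a plain `Dict` from multiple threads are not safe, which may be the source of my problems, whereas writing to distinct indices of a preallocated array is.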