How to get reproducible results from classification models? Example using DecisionTree

rio · March 24, 2020, 3:12pm

Hi!

I would like to train a classifier (e.g. a random forest) and get the same results every time I train and run the model. My first attempt was to set the seed of the random number generator, like this:

# example from https://github.com/bensadeghi/DecisionTree.jl
using DecisionTree
using Random  # needed for Random.seed! below
features, labels = load_data("iris")
features = float.(features)
labels = string.(labels)

# train random forest classifier
# using 2 random features, 10 trees, 0.5 portion of samples per tree, and a maximum tree depth of 6

Random.seed!(1234) # My attempt here!

model = build_forest(labels, features, 2, 10, 0.5, 6)
println(model)

n_folds=3; n_subfeatures=2
accuracy = nfoldCV_forest(labels, features, n_folds, n_subfeatures)

Unfortunately, the resulting model seems to be slightly different each time I run the code, and so does the accuracy. I am using DecisionTree v0.10.1.

Please, how could I get reproducible results?

Thank you in advance
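
For reference, here is a minimal sketch of what seeding is expected to guarantee, using only the `Random` standard library. It shows that reseeding the global generator, or creating two explicit generators from the same seed, reproduces the same draws within a single task. It does not show how DecisionTree consumes random numbers internally. If the package draws from per-thread generators, or expects its own `rng` argument (an assumption to check against the documentation for the installed version), then seeding the global generator on the main thread may not be enough.

```julia
using Random

# Reseeding the global RNG with the same seed reproduces the same draws.
Random.seed!(1234)
a = rand(3)
Random.seed!(1234)
b = rand(3)
@assert a == b

# Two explicit RNG objects created from the same seed also agree.
rng1 = MersenneTwister(1234)
rng2 = MersenneTwister(1234)
x = rand(rng1, 1:100, 5)
y = rand(rng2, 1:100, 5)
@assert x == y

# Draws taken without reseeding continue the stream, so they differ from `a`.
c = rand(3)
@assert c != a
```

An explicit generator such as `MersenneTwister(1234)` is easier to reason about than global state, because it can be handed to exactly the code that should be reproducible.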
