Dear all,
I am using the function “xgboost” from the R package “xgboost” and the
Julia package “XGBoost.jl”, which both interface the XGBoost library.
Maybe I missed something,
but I observed surprising outputs from XGBoost.jl.
When I fit a single tree with xgboost (Julia), there is
no variability when I re-run the fitting on the same data,
even though I use only fractions of the data rows and columns for
each run (subsample, colsample_bytree), and even for each node of
the tree (colsample_bynode).
Since these fraction selections are random,
I expected to observe different results between runs
(note: I do observe this variability, as expected, with xgboost under R).
Below is a reproducible example:
using XGBoost
n, p, m = 100, 100, 20
X = rand(n, p) ; y = rand(n) ;
Xnew = rand(m, p) ; ynew = rand(m) ;
Fit of a single tree:
fm = xgboost(X, 1; label = y,
booster = :gbtree,
tree_method = :auto,
num_parallel_tree = 1,
subsample = .6,
colsample_bytree = .8,
colsample_bynode = 1/3,
max_depth = 6,
min_child_weight = 1,
eta = 1,
verbosity = 0)
Output:
pred = XGBoost.predict(fm, Xnew) ;
sum((ynew .- pred).^2) / length(ynew) # MSEP
[1] train-rmse:0.237486
0.6847792450571403
When I re-run the fitting, I get the same result:
fm = xgboost(X, 1; label = y, ...)  # same arguments as before
pred = XGBoost.predict(fm, Xnew) ;
sum((ynew .- pred).^2) / length(ynew) # MSEP
[1] train-rmse:0.237486
0.6847792450571403
I played with the argument “seed_per_iteration”, but this changed nothing:
still no variability between runs. Did I miss something?
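As a workaround check, one could try forcing a different random state on each run by passing the XGBoost library's documented “seed” parameter explicitly (this is a sketch; it assumes XGBoost.jl forwards extra keyword arguments to the library as booster parameters):

using XGBoost
# Sketch: pass a different `seed` to each run so the subsampling
# draws should differ; assumes the keyword reaches the library's
# `seed` parameter.
for s in (1, 2)
    fm = xgboost(X, 1; label = y,
        booster = :gbtree,
        subsample = .6,
        colsample_bytree = .8,
        colsample_bynode = 1/3,
        max_depth = 6,
        eta = 1,
        seed = s,          # different RNG seed per run
        verbosity = 0)
    pred = XGBoost.predict(fm, Xnew)
    println(sum((ynew .- pred).^2) / length(ynew))  # MSEP, should now vary
end

If the MSEP values still coincide with different seeds, the seed is presumably not reaching the library at all.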
Actually, I have the same problem when I fit random forests or XGBoost models
with Julia “xgboost”: I don’t observe variability between runs on the same
data, while I do observe variability with R xgboost (I used the
same XGBoost parameterization under R and Julia):
fm = xgboost(X, 10; label = y,
booster = :gbtree,
tree_method = :auto,
num_parallel_tree = 1,
subsample = .8,
colsample_bytree = .8,
colsample_bynode = 1/3,
max_depth = 6,
min_child_weight = 1,
eta = .3,
verbosity = 0)
pred = XGBoost.predict(fm, Xnew) ;
sum((ynew .- pred).^2) / length(ynew) # MSEP
[1] train-rmse:0.231707
[2] train-rmse:0.184608
[3] train-rmse:0.156864
[4] train-rmse:0.134460
[5] train-rmse:0.115025
[6] train-rmse:0.094434
[7] train-rmse:0.078340
[8] train-rmse:0.069557
[9] train-rmse:0.060034
[10] train-rmse:0.050421
0.425254884669798
When I re-run the code above, no variability in the results is observed.
[Another problem I have is that “verbosity = 0” does not
suppress the printed information (from the doc, it should,
if I understood correctly). As shown in the example above,
the information for every boosting round is printed anyway (but this is also the case under R …).
Does somebody know how to get a silent run?]
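Until the cause is clear, a crude workaround in plain Julia (assuming the round-by-round lines go to the process's standard output or error streams, and using Julia ≥ 1.7, where redirecting to devnull is supported) is to redirect both streams around the call:

# Sketch: silence any stream-level printing during training by
# redirecting stdout and stderr to the null device (Julia >= 1.7).
fm = redirect_stdout(devnull) do
    redirect_stderr(devnull) do
        xgboost(X, 1; label = y, verbosity = 0)  # plus the other parameters
    end
end

Because redirect_stdout/redirect_stderr swap the OS-level file descriptors, this should also capture output written directly by the underlying C library.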
Thanks for any help.