Thank you for your response @ablaom
I should have described my problem more clearly. What I am looking for is as follows:
In R, as mentioned in the main question, a range of parameters is automatically created by running the 2 lines of code. To be more specific, for random forest classification in R, a grid of values for n_subfeatures (or mtry in R) is automatically created by caret.
In MLJ, if I want to tune a RandomForest, I will have to do something like:
using MLJ
RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree
rf_model = RandomForestClassifier()
range_rf = range(rf_model, :n_subfeatures, values=[2,6,10])
self_tuning_rf = TunedModel(model=rf_model, resampling=CV(nfolds=10),
    repeats=5, tuning=Grid(), range=range_rf, measure=[accuracy, kappa])
rf = machine(self_tuning_rf, X, y)
MLJ.fit!(rf, rows=train)
The problem here is that the TunedModel function expects a range. If I run it without a range, I get the following error:
julia> self_tuning_svm = TunedModel(model=rf_model, resampling=CV(nfolds=10),
repeats=5, tuning=Grid(), measure=accuracy)
ERROR: LoadError: ArgumentError: You need to specify `range=...`, unless `tuning=Explicit` and and `models=...` is specified instead.
Stacktrace:
...
All in all, what I want is some sort of implementation where I can run the TunedModel function without passing anything to the range argument, and it automatically chooses one or more parameters to tune depending on the model (like caret chooses mtry for random forest and cp for decision tree) and creates a grid based on the type of problem (probabilistic) and the dataset that is passed (number of features, number of rows, data schema, etc.), as caret does. I hope that is clear.
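To make the request concrete, here is a rough sketch of the kind of default I have in mind for n_subfeatures. The helper name default_subfeature_grid and its len keyword are hypothetical (nothing like this exists in MLJ as far as I know), and the rule of spreading values evenly between 2 and the number of features is only loosely inspired by what caret does for mtry, not a reproduction of its exact logic.

```julia
# Hypothetical helper, not part of MLJ: build a default grid of values for
# n_subfeatures from the number of features `p` in the dataset.
function default_subfeature_grid(p::Integer; len::Integer=3)
    p >= 1 || throw(ArgumentError("p must be at least 1"))
    len >= 1 || throw(ArgumentError("len must be at least 1"))
    # With one or two features there is nothing to spread a grid over.
    p <= 2 && return [Int(p)]
    # A single grid point: fall back to the common sqrt(p) heuristic.
    len == 1 && return [max(floor(Int, sqrt(p)), 1)]
    # Evenly spaced values from 2 to p, floored to integers, duplicates dropped.
    return unique(floor.(Int, range(2, p; length=len)))
end

# Possible use with the code above (assumes X is a table and rf_model exists):
# p = length(schema(X).names)
# range_rf = range(rf_model, :n_subfeatures, values=default_subfeature_grid(p))
```

For a dataset with 10 features this gives the same values I typed by hand above, [2, 6, 10]. Ideally each model would carry a default like this so that TunedModel could fall back on it when no range is given.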