Here's what I had in mind, applied to the Iris data:
using AutoMLPipeline, DataFrames
# Get the list of available scikit-learn learners and sort it case-insensitively.
sk = AutoMLPipeline.SKLearners.learner_dict |> keys |> collect;
sk = sk |> x -> sort(x, lt = (x, y) -> lowercase(x) < lowercase(y));
iris = AutoMLPipeline.Utils.getiris();
X = iris[:,1:4];
Y = iris[:,end] |> Vector;
#
learners = DataFrame()
for m in sk
    # Wrap each learner in a one-element pipeline and cross-validate it.
    learner = SKLearner(m)
    pcmc = AutoMLPipeline.@pipeline learner
    println(learner.name)
    mean, sd, _ = crossvalidate(pcmc, X, Y, "accuracy_score", 10)
    global learners = vcat(learners, DataFrame(name=learner.name, mean=mean, sd=sd))
end;
@show learners;
This gives the scores (mean, sd) for 49 models. Incompatible models conveniently output NaN.
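If you only want to rank the models that actually ran, one option (a minimal sketch, assuming the learners DataFrame produced by the loop above and the DataFrames package already loaded) is to drop the NaN rows and sort by mean accuracy:
# Keep only learners that produced a valid score, then rank them by mean accuracy.
compatible = filter(row -> !isnan(row.mean), learners);
ranked = sort(compatible, :mean, rev=true);
first(ranked, 10)   # top 10 by mean cross-validated accuracy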
Following your suggestions, here is code to extract the regression and classification models.
m_reg = sk[occursin.("Regressor", sk)];
m_reg = m_reg ∪ sk[occursin.("Regression", sk)];
m_reg = m_reg ∪ ["SVR", "ElasticNet", "Ridge", "RidgeCV", "BayesianRidge",
                 "KernelRidge", "Lars", "Lasso", "LassoLars"];
m_cl = sk[occursin.("Classifier", sk)];
m_cl = m_cl ∪ sk[occursin.("NB", sk)];
m_cl = m_cl ∪ sk[occursin.("SVC", sk)];
m_cl = m_cl ∪ ["LDA", "QDA"];
# Together these cover 47 of the 49 models; the two not captured are
# "OrthogonalMatchingPursuit" and "NearestCentroid".
I really like your elegant & minimalist use of pipelines.
The Julia community (and the world) would be a better place if there were a way to merge your package with MLJ… c'est la vie…