The `evaluate` and `evaluate!` methods in MLJ can accept a vector of metric functions. However, it doesn't appear that you can evaluate metrics based on probabilistic predictions (e.g. AUC) and metrics based on deterministic predictions (e.g. accuracy) at the same time. Here's an MWE:
using DataFrames
using RDatasets
using MLJ
using MLJLinearModels

# Reduce iris to a binary problem by dropping the "virginica" class:
iris = dataset("datasets", "iris")
df = filter(r -> r.Species != "virginica", iris)
y = droplevels!(copy(df.Species))  # drop the now-unused "virginica" level
X = select(df, Not(:Species))

model = LogisticClassifier(penalty=:none)
logistic_machine = machine(model, X, y)
holdout = Holdout(shuffle=true, rng=1)

# auc works on probabilistic predictions (the default operation, predict):
logistic_auc = evaluate!(
    logistic_machine,
    resampling = holdout,
    measure = auc
)

# accuracy needs point predictions, hence operation = predict_mode:
logistic_accuracy = evaluate!(
    logistic_machine,
    resampling = holdout,
    operation = predict_mode,
    measure = accuracy
)
Does anyone know if it's possible to evaluate `auc` and `accuracy` at the same time without having to run `evaluate!` twice?
In MLJ the `accuracy` measure is only defined for deterministic classifiers. You could define a custom accuracy measure that works on probabilistic classifiers using the code below.
custom_accuracy(yhat, y) = accuracy(mode.(yhat), y)  # reduce probabilistic predictions to their modes
MLJ.reports_each_observation(::typeof(custom_accuracy)) = false
MLJ.supports_weights(::typeof(custom_accuracy)) = true
MLJ.orientation(::typeof(custom_accuracy)) = :score
MLJ.is_feature_dependent(::typeof(custom_accuracy)) = false  # a Bool, not the Symbol :false
MLJ.prediction_type(::typeof(custom_accuracy)) = :probabilistic
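(To sanity-check the new measure outside of `evaluate!`, assuming `logistic_machine` has been trained with `fit!`, `custom_accuracy(predict(logistic_machine, X), y)` should agree with `accuracy(predict_mode(logistic_machine, X), y)`.)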
Then you could do
logistic_auc_accuracy = evaluate!(
    logistic_machine,
    resampling = holdout,
    measure = [auc, custom_accuracy]
)
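If I remember the field names correctly, the object returned by `evaluate!` has a `measurement` field with one aggregated value per measure, in the order given, so `logistic_auc_accuracy.measurement` should hold the `auc` and `custom_accuracy` results.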
Awesome, thanks @samuel_okon! That's a good solution. Though since MLJ measures have a `prediction_type` trait, it seems like it might be possible to extend `evaluate` to accept measures for different prediction types and have `evaluate` automatically run each of the necessary types of prediction. Maybe I'll make a PR for that.
That sounds nice. But the problem is that these measures were defined for either `Deterministic` or `Probabilistic` classifiers, not both. Also, applying a `Probabilistic` measure to `Deterministic` outputs won't be well defined, since a vector of `UnivariateFinite` is needed.
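To make that concrete, here is a quick illustration reusing the machine and data from the MWE above:

fit!(logistic_machine)
yhat_prob = predict(logistic_machine, X)       # vector of UnivariateFinite distributions
yhat_mode = predict_mode(logistic_machine, X)  # vector of plain class labels
pdf.(yhat_prob, "setosa")                      # per-observation probability of "setosa"

`accuracy` expects something like `yhat_mode`, while `auc` needs the distributions in `yhat_prob`.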
Yeah, I was thinking that if `measure = [auc, accuracy]`, then maybe internally `evaluate` could run both `predict` and `predict_mode` to get the two separate types of prediction. Then it could use the proper type of prediction for each metric.
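As a rough sketch of that idea (`evaluate_mixed` is a hypothetical helper, not anything in MLJ, and it skips resampling and fitting for brevity):

# Hypothetical helper: compute each prediction type once, then route every
# measure to the kind of prediction it declares via its trait.
function evaluate_mixed(mach, measures, Xtest, ytest)
    yhat_prob = predict(mach, Xtest)        # probabilistic predictions
    yhat_mode = predict_mode(mach, Xtest)   # deterministic (point) predictions
    map(measures) do m
        yhat = MLJ.prediction_type(m) == :probabilistic ? yhat_prob : yhat_mode
        m(yhat, ytest)
    end
end

evaluate_mixed(logistic_machine, [auc, accuracy], X, y)  # one set of predictions, both metrics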