Hi all, I’m still getting used to ML and Julia, and I’m wondering how to use MLJ’s API properly. Specifically, I’m not sure what arguments MLJ.roc_curve expects. Here is the code I use to train and evaluate a model:
using DataFrames, CSV, MLJ, DecisionTree, MLDataUtils, Pipe
df = DataFrame(CSV.File("./data.csv"))
df = @pipe (DataFrames.transform(df, [:DGN, :PRE6] .=> x -> parse.(UInt32, SubString.(x, Ref(4))), renamecols=false) |>
DataFrames.transform(_, :PRE14 => x -> parse.(UInt32, SubString.(x, Ref(3:4))), renamecols=false))
y, X = unpack(df[!, Not("id")], ==(:Risk1Yr); rng=123);
(X_train, y_train), (X_val, y_val), (X_test, y_test) = stratifiedobs((X, y), p=(0.7, 0.2));
rf = RandomForestClassifier()
DecisionTree.fit!(rf, Matrix(X_train), y_train)
preds = DecisionTree.predict_proba(rf, Matrix(X_val))
MLJ.roc_curve(preds[:,2], y_val)
I get the following error:
MethodError: no method matching pdf(::Float64, ::Bool)
Closest candidates are:
  pdf(!Matched::Distributions.Logistic, ::Real) at ~/.julia/packages/Distributions/Vkexj/src/univariate/continuous/logistic.jl:81
  pdf(!Matched::KernelDensity.BivariateKDE, ::Any, !Matched::Any) at ~/.julia/packages/KernelDensity/bNBAQ/src/interp.jl:32
  pdf(!Matched::Distributions.Truncated, ::Real) at ~/.julia/packages/Distributions/Vkexj/src/truncate.jl:133
  ...
Stacktrace:
 [1] _broadcast_getindex_evalf
   @ ./broadcast.jl:670 [inlined]
 [2] _broadcast_getindex
   @ ./broadcast.jl:643 [inlined]
 [3] getindex
   @ ./broadcast.jl:597 [inlined]
 [4] copy
   @ ./broadcast.jl:899 [inlined]
 [5] materialize
   @ ./broadcast.jl:860 [inlined]
 [6] roc_curve(ŷm::Vector{Float64}, ym::SubArray{Bool, 1, Vector{Bool}, Tuple{Vector{Int64}}, false})
   @ MLJBase ~/.julia/packages/MLJBase/U4Dis/src/measures/roc.jl:48
 [7] top-level scope
   @ ~/code/julia/mltest/test_mlj.ipynb:17
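Reading the trace, roc_curve seems to broadcast pdf over its first argument, so my guess is that it wants a vector of distributions rather than raw Float64 probabilities. Below is a minimal sketch of what I think it might expect. It rests on assumptions I have not confirmed against my MLJBase version: that UnivariateFinite accepts a support vector plus an n×2 probability matrix with a pool keyword, that the columns returned by predict_proba are ordered false, true, and that the ground truth should be categorical. The preds and y_val values here are made-up stand-ins, not my real data.

```julia
using MLJ

# Made-up stand-ins: `preds` mimics the n×2 matrix from predict_proba
# (columns assumed to be ordered false, true); `y_val` is a Bool vector.
preds = [0.9 0.1; 0.2 0.8; 0.6 0.4; 0.3 0.7]
y_val = [false, true, false, true]

# Assumption: roc_curve wants categorical ground truth, not a plain Bool vector.
y_cat = categorical(y_val)

# Assumption: this builds one UnivariateFinite distribution per row of `preds`,
# sharing the categorical pool of `y_cat`.
ŷ = UnivariateFinite([false, true], preds, pool=y_cat)

# If the guess is right, this call should no longer hit the pdf MethodError.
curve = roc_curve(ŷ, y_cat)
```

If this is roughly right, is wrapping the probabilities by hand the intended route, or should I be going through an MLJ machine so that predict returns these distributions directly?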
Any advice?