Function to Calculate Kappa

Rahul · December 9, 2021, 7:06pm

Is there a function to calculate Kappa in MLJ? If not, I wrote one (shown below) and am looking for a more efficient implementation.

function kappa(predictions, original)
    # Confusion matrix 
    confmat = MLJ.confusion_matrix(mode.(predictions), original)
    confmat = confmat.mat
    z = (TP = confmat[1,1], FP = confmat[1,2], FN = confmat[2,1], TN = confmat[2,2])
    
    # Calculate kappa
    length_y = length(original)
    
    p_0 = (z.TN + z.TP)/length_y
    p_e_1 = (z.TP + z.FP) * (z.TP + z.FN)/(length_y^2)
    p_e_2 = (z.TN + z.FP) * (z.TN + z.FN)/(length_y^2)
    p_e = p_e_1 + p_e_2
    κ = (p_0 - p_e)/(1 - p_e)
    
    return κ
end

Thanks in advance!

goerch · December 9, 2021, 7:12pm

From a quick look at their documentation, it seems they don’t have an MLJ.confusion_matrix! method, so you are probably out of luck (or you could copy that method to do an in-place update, how ugly;).

ablaom · December 12, 2021, 9:30pm

Thanks for the suggestion. Noted here: https://github.com/JuliaAI/MLJBase.jl/issues/689

Rahul · January 5, 2022, 5:09pm

@ablaom Just an FYI, I wrote a function for multi-class.

function kappa(yhat, y)
    # Get confusion matrix
    try
        confmat = MLJ.confusion_matrix(mode.(yhat), y) #probabilistic
    catch
        confmat = MLJ.confusion_matrix(yhat, y) #deterministic
    end
    confmat = confmat.mat

    # sizes
    c = size(confmat)[1] # number of classes
    m = sum(confmat) # number of instances

    # relative observed agreement
    diags = [confmat[i, i] for i in 1:c]
    p_0 = sum(diags)/m

    # probability of agreement due to chance
    # for each class, this would be: (# positive predictions)/(# instances) * (# positive observed)/(# instances)
    p_e = 0
    for i in 1:c
        p_e_i = sum(confmat[i, j] for j in 1:c) * sum(confmat[j, i] for j in 1:c)/m^2
        p_e += p_e_i
    end

    # Kappa calculation
    κ = (p_0 - p_e)/(1 - p_e)

    return κ
end
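
The explicit loops over rows and columns above can be collapsed into array reductions. Here is a minimal sketch, assuming the confusion matrix is already available as a plain square matrix of counts (for example, the `.mat` field used above). The name `kappa_from_counts` is hypothetical and is not an MLJ function.

```julia
using LinearAlgebra: tr, dot

# Cohen's kappa from a square matrix of counts.
# Hypothetical helper for illustration; not part of MLJ.
function kappa_from_counts(counts::AbstractMatrix{<:Real})
    m = sum(counts)                          # number of instances
    p_0 = tr(counts) / m                     # observed agreement (diagonal mass)
    row_totals = vec(sum(counts, dims=2))    # totals per row
    col_totals = vec(sum(counts, dims=1))    # totals per column
    p_e = dot(row_totals, col_totals) / m^2  # agreement expected by chance
    return (p_0 - p_e) / (1 - p_e)
end

# Here p_0 = 35/50 = 0.7 and p_e = (25*30 + 25*20)/50^2 = 0.5, so kappa ≈ 0.4
kappa_from_counts([20 5; 10 15])
```

Note that the result is `NaN` when `p_e` equals 1 (every prediction and every observation in a single class), which is the same behaviour as the loop version.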

ablaom · January 5, 2022, 10:57pm

Noted, thanks. The try/catch block is discouraged because it is slow. Kappa is a deterministic measure and would be implemented as such in MLJ. Users can compute the mode themselves if they want to apply it to probabilistic predictions. In any case, MLJ’s evaluate! apparatus lets you specify deterministic measures where predictions are probabilistic; it automatically calls the model’s predict_mode method instead of predict before passing the result on to the measure.
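
To illustrate the deterministic form described here, the following is a rough sketch of a kappa that accepts only point predictions, so no try/catch is needed. It is written in plain Julia with no MLJ calls, and the name `kappa_deterministic` is hypothetical. It assumes both arguments are equal-length vectors of hashable class labels; for probabilistic predictions the caller would pass `mode.(yhat)`, as in the posts above.

```julia
# Hypothetical deterministic kappa: both arguments are vectors of class labels.
# Illustration only; this is not MLJ's implementation.
function kappa_deterministic(yhat::AbstractVector, y::AbstractVector)
    length(yhat) == length(y) ||
        throw(DimensionMismatch("yhat and y must have equal length"))

    # Map every label seen in either vector to a row/column index
    classes = unique(vcat(yhat, y))
    index = Dict(c => i for (i, c) in enumerate(classes))

    # Contingency counts: rows are predicted labels, columns are observed labels
    counts = zeros(Int, length(classes), length(classes))
    for (p, t) in zip(yhat, y)
        counts[index[p], index[t]] += 1
    end

    m = length(y)
    p_0 = sum(counts[i, i] for i in eachindex(classes)) / m
    p_e = sum(sum(counts[i, :]) * sum(counts[:, i]) for i in eachindex(classes)) / m^2
    return (p_0 - p_e) / (1 - p_e)
end

yhat = ["a", "a", "b", "b", "a", "b"]
y    = ["a", "b", "b", "b", "a", "a"]
kappa_deterministic(yhat, y)   # p_0 = 4/6, p_e = 0.5, so kappa ≈ 1/3
```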

ablaom · January 5, 2022, 11:04pm

BTW, a PR is welcome, if you are happy to include tests. Apart from the user guidelines for measures, there are these guidelines for adding new measures.

The new code would live here.

Further detail added at the issue: https://github.com/JuliaAI/MLJBase.jl/issues/689#issuecomment-1006151903
