Function to Calculate Kappa

Is there a function to calculate Cohen's kappa in MLJ? If not, I wrote one (shown below) and am looking for a more efficient implementation.

function kappa(predictions, original)
    # Confusion matrix 
    confmat = MLJ.confusion_matrix(mode.(predictions), original)
    confmat = confmat.mat
    z = (TP = confmat[1,1], FP = confmat[1,2], FN = confmat[2,1], TN = confmat[2,2])
    
    # Calculate kappa
    length_y = length(original)
    
    p_0 = (z.TN + z.TP)/length_y
    p_e_1 = (z.TP + z.FP) * (z.TP + z.FN)/(length_y^2)
    p_e_2 = (z.TN + z.FP) * (z.TN + z.FN)/(length_y^2)
    p_e = p_e_1 + p_e_2
    κ = (p_0 - p_e)/(1 - p_e)
    
    return κ
end
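
As a sanity check on the formula, the same calculation can be sketched against a plain 2×2 matrix of counts, with no MLJ dependency. The name `kappa_binary` is hypothetical, and the layout (rows are predicted classes, columns are observed classes) is an assumption of this sketch, not something taken from MLJ.

```julia
# Hypothetical helper: Cohen's kappa from a 2x2 matrix of counts.
# Assumed layout: rows = predicted class, columns = observed class.
function kappa_binary(counts::AbstractMatrix{<:Real})
    size(counts) == (2, 2) || throw(ArgumentError("expected a 2x2 matrix"))
    n = sum(counts)

    # Observed agreement: share of instances on the diagonal.
    p_0 = (counts[1, 1] + counts[2, 2]) / n

    # Chance agreement: for each class, (row total) * (column total) / n^2.
    p_e = sum(sum(counts[i, :]) * sum(counts[:, i]) for i in 1:2) / n^2

    return (p_0 - p_e) / (1 - p_e)
end

counts = [20 5; 10 15]
kappa_binary(counts)
```

For these counts, p_0 = 35/50 = 0.7 and p_e = (25*30 + 25*20)/50^2 = 0.5, so kappa comes out at about 0.4.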

Thanks in advance!


From a quick look at their documentation, it seems they don’t have an MLJ.confusion_matrix! method, so you are probably out of luck (or you could copy that method and make it update in place, how ugly ;).


Thanks for the suggestion. Noted here: "Add Cohen's kappa to performance measures" (Issue #689, JuliaAI/MLJBase.jl on GitHub).


@ablaom Just an FYI, I wrote a function for multi-class.

function kappa(yhat, y)
    # Get confusion matrix
    try
        confmat = MLJ.confusion_matrix(mode.(yhat), y) #probabilistic
    catch
        confmat = MLJ.confusion_matrix(yhat, y) # deterministic
    end
    confmat = confmat.mat

    # sizes
    c = size(confmat)[1] # number of classes
    m = sum(confmat) # number of instances

    # relative observed agreement
    diags = [confmat[i, i] for i in 1:c]
    p_0 = sum(diags)/m

    # probability of agreement due to chance
    # for each class, this would be: (# positive predictions)/(# instances) * (# positive observed)/(# instances)
    p_e = 0.0 # start as a Float64 so the accumulator keeps a single type
    for i in 1:c
        p_e_i = sum(confmat[i, j] for j in 1:c) * sum(confmat[j, i] for j in 1:c)/m^2
        p_e += p_e_i
    end

    # Kappa calculation
    κ = (p_0 - p_e)/(1 - p_e)

    return κ
end

Noted, thanks. The try/catch block is discouraged because it is slow. Kappa is a deterministic measure and would be implemented as such in MLJ. Users can compute the mode themselves if they want to apply it to probabilistic predictions. In any case, MLJ’s evaluate! apparatus lets you specify deterministic measures where predictions are probabilistic: it automatically calls the model’s predict_mode method instead of predict before passing the result on to the measure.
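
To illustrate the deterministic form, here is a minimal sketch that takes point predictions only and has no try/catch. It is not the MLJ implementation: the name `cohen_kappa` is hypothetical, and it builds its own count matrix from the label vectors so that it runs without MLJ. For probabilistic predictions, the caller would apply `mode.(yhat)` before calling it.

```julia
using LinearAlgebra: tr

# Hypothetical sketch: Cohen's kappa for point (deterministic) predictions.
# `yhat` and `y` are vectors of class labels of equal length.
function cohen_kappa(yhat::AbstractVector, y::AbstractVector)
    length(yhat) == length(y) ||
        throw(DimensionMismatch("yhat and y must have equal length"))

    # Map each class label seen in either vector to a matrix index.
    classes = sort(unique(vcat(yhat, y)))
    index = Dict(c => i for (i, c) in enumerate(classes))
    k = length(classes)

    # Count matrix: rows = predicted class, columns = observed class.
    counts = zeros(Int, k, k)
    for (p, t) in zip(yhat, y)
        counts[index[p], index[t]] += 1
    end

    m = sum(counts)
    p_0 = tr(counts) / m  # relative observed agreement

    # Chance agreement: sum over classes of (row total) * (column total) / m^2.
    p_e = sum(sum(counts, dims=2) .* sum(counts, dims=1)') / m^2

    # Note: returns NaN when p_e == 1 (only one class present).
    return (p_0 - p_e) / (1 - p_e)
end

yhat = ["a", "b", "c", "a", "b", "c"]
y    = ["a", "b", "c", "b", "b", "a"]
cohen_kappa(yhat, y)
```

In this example, 4 of 6 predictions agree (p_0 = 2/3) and the chance agreement is p_e = 12/36 = 1/3, giving a kappa of about 0.5.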

BTW, a PR is welcome, if you are happy to include tests. Apart from the user guidelines for measures, there are these guidelines for adding new measures.

The new code would live here.

Further detail added at the issue: "Add Cohen's kappa to performance measures" (Issue #689, JuliaAI/MLJBase.jl on GitHub).
