MLJ: selecting rows and columns for training in evaluate! for kernel regression

Hi everyone. This is my first post here; I've asked this on Stack Overflow but have had no luck. Perhaps someone can point me in the right direction.

I want to implement a kernel ridge regression that also works within MLJ. Moreover, I want to have the option to use either feature vectors or a precomputed kernel matrix, as in Python's scikit-learn.

When I run this code

using MLJBase, MLJModelInterface
using LinearAlgebra

const MMI = MLJModelInterface

MMI.@mlj_model mutable struct KRRModel <: MLJModelInterface.Deterministic
    mu::Float64 = 1::(_ > 0)
    kernel::String = "linear"
end

function MMI.fit(m::KRRModel, verbosity::Int, K, y)
    K = MLJBase.matrix(K)              # coerce the input to a plain matrix
    fitresult = inv(K + m.mu*I)*y      # dual coefficients: (K + mu*I)^(-1) y
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end

   
N = 10
K = randn(N, N)
K = K*K'                     # make K a symmetric positive semidefinite kernel matrix
a = randn(N)
y = K*a + 0.2*randn(N)
m = KRRModel()
kregressor = machine(m, K, y)
cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)
evaluate!(kregressor, resampling=cv, measure=rms, verbosity=1)

the evaluate! function evaluates the machine on different subsets of rows of K. By the representer theorem, the kernel ridge regression solution is a linear combination of kernel evaluations at the training samples only, so fitting needs just the square block K[train_rows, train_rows] rather than K[train_rows, :].
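To make that concrete, here's a small stand-alone sketch of what I'd like the resampling to produce (train_idx and test_idx stand for whatever indices the resampling strategy chooses):

using LinearAlgebra

N = 10
K = randn(N, N); K = K*K'      # symmetric PSD kernel matrix
y = randn(N)
mu = 1.0

train_idx, test_idx = 1:7, 8:10

# fitting only needs the square training block K[train_idx, train_idx]
coef = (K[train_idx, train_idx] + mu*I) \ y[train_idx]

# predicting on the held-out rows needs K[test_idx, train_idx]
yhat = K[test_idx, train_idx] * coef

With row-only subsampling, fit instead receives the non-square K[train_idx, :], which doesn't fit this scheme.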

To indicate that I'm passing a precomputed kernel matrix, I'd set m.kernel = "". How do I make evaluate! select the columns as well as the rows, so that fit receives this smaller square matrix, when m.kernel == ""?

This is my first time using MLJ and I’d like to make as few modifications as possible.

1 Like

I’m not too familiar with kernel ridge regression, but perhaps you can do something like this:

function MMI.fit(m::KRRModel, verbosity::Int, K, y)
    if m.kernel == ""
        # K = ... construct the smaller matrix here
    else
        K = MLJBase.matrix(K)
    end
    fitresult = inv(K+m.mu*I)*y
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end

Thanks, this could work if I knew the indices of the chosen rows that are passed to fit!

The smaller matrix is built by selecting the same subset of indices for both rows and columns. That is, if K is 10×10 and train_idx = [1, 4, 6], then the smaller matrix is the 3×3 K[train_idx, train_idx].
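For a linear kernel this is easy to sanity-check: selecting the same indices for rows and columns of K = X*X' gives exactly the kernel built from the selected rows of X:

X = randn(10, 3)
K = X*X'
train_idx = [1, 4, 6]
K[train_idx, train_idx] ≈ X[train_idx, :]*X[train_idx, :]'   # true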

Hence, since I need the indices, I wonder whether I should define a new sampler (or whatever function passes the data to fit) to obtain the smaller square matrix to be used in training.

Paging @ablaom.

@Pere Thanks for your interest in MLJ.

The intended use of evaluate! is to estimate the generalisation error associated with some supervised learning model by subsampling observations, as in cross-validation, which is the common use-case. I'm afraid there is no natural way for evaluate! to do feature subsampling.

https://alan-turing-institute.github.io/MLJ.jl/dev/evaluating_model_performance/

FYI: There is a version of kernel regression implementing the MLJ model interface, namely kernel partial least squares regression from the package https://github.com/lalvim/PartialLeastSquaresRegressor.jl .

2 Likes

That said, if all you need is to generate (train, test) pairs of indices according to some resampling strategy, such as resampling=CV(nfolds=3), you can call MLJ.MLJBase.train_test_pairs, as in

julia> resampling=CV(nfolds=3)
CV(
    nfolds = 3,
    shuffle = false,
    rng = Random._GLOBAL_RNG()) @377

julia> MLJBase.train_test_pairs(resampling, 1:10)
3-element Array{Tuple{Array{Int64,1},UnitRange{Int64}},1}:
 ([5, 6, 7, 8, 9, 10], 1:4)
 ([1, 2, 3, 4, 8, 9, 10], 5:7)
 ([1, 2, 3, 4, 5, 6, 7], 8:10)

I see, thanks for the help! I guess I’ll do the train/test loop as usual.
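Something like this rough sketch is what I have in mind (using MLJBase.train_test_pairs as suggested, with the per-fold RMS computed by hand):

using MLJBase, LinearAlgebra, Statistics

N = 100
K = randn(N, 3); K = K*K'
y = K*randn(N) + 0.1*randn(N)
mu = 0.1

errors = Float64[]
for (train, test) in MLJBase.train_test_pairs(CV(nfolds=6), 1:N)
    coef = (K[train, train] + mu*I) \ y[train]       # fit on the square training block
    yhat = K[test, train] * coef                     # predict the held-out rows
    push!(errors, sqrt(mean((yhat .- y[test]).^2)))  # per-fold rms
end
mean(errors)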

I’d actually already checked the partial LS package, but it calculates the kernel matrix each time the train/predict functions are called, something I want to avoid by using a precomputed matrix.

In case anyone comes across this: for now I've achieved what I wanted by setting the complete kernel matrix as a model parameter, which is OK for me since I'm using fairly small matrices. The smaller kernel matrix is then built inside fit/predict as @CameronBieganek suggested; the indices of the chosen rows/columns are passed in what would otherwise be the matrix of features. Here's the code for the kernel regressor:

using MLJModelInterface
using Random, LinearAlgebra
using MLJBase      # brings machine, CV, evaluate! and rms into scope for the test code below

const MMI = MLJModelInterface

MMI.@mlj_model mutable struct KRRModel <: MLJModelInterface.Deterministic
    mu::Float64 = 1::(_ > 0)
    kernel::String = ""
    kernel_matrix::Array{Float64,2} = zeros(1,1)
end

function MMI.fit(m::KRRModel, verbosity::Int, x, y)
    x = x[:]                                # the "features" are just observation indices
    K = m.kernel_matrix[x, x]               # square training block of the precomputed kernel
    fitresult = (x, inv(K + m.mu*I)*y)      # keep the training indices together with the coefficients
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end

function MMI.predict(m::KRRModel, fitresult, xnew)
    x = xnew[:]                             # indices of the observations to predict
    samples, coef = fitresult
    K = m.kernel_matrix[x, samples]         # cross-kernel between new and training indices
    return K*coef
end

function MMI.clean!(m::KRRModel)
    return 1
end

And the code to test it

N = 1000
svar = 0
K = randn(N,3)
K = K*K'
a = randn(N)
y = K*a + svar*randn(N)

cv = CV(; nfolds=6,  shuffle=nothing, rng=nothing)

m = KRRModel(;mu=0.1e-8, kernel="", kernel_matrix = K)
x = MLJBase.matrix(collect(1:N)[:,:])   # "feature" matrix holding the row indices 1:N
kregressor = machine(m,x,y)
evaluate!(kregressor, resampling=cv, measure=rms, verbosity=1)

2 Likes

Nice. FYI clean! (if it is overloaded, which is optional) should return a string (a message explaining what has been mutated to enforce invariants for the struct). The fallback returns "".
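For example, a hand-rolled version for this model might look something like this (just a sketch; note that @mlj_model already generates a clean! from the field constraints, so you'd only write one yourself for custom behaviour):

function MMI.clean!(m::KRRModel)
    message = ""
    if m.mu <= 0
        message *= "Parameter `mu` must be positive. Resetting to 1. "
        m.mu = 1
    end
    return message
end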

2 Likes