Hi everyone. This is my first post here; I asked this on Stack Overflow but had no luck. Perhaps someone can point me in the right direction.
I want to implement a kernel ridge regression that also works within MLJ. Moreover, I want the option to use either feature vectors or a precomputed kernel matrix, as in Python's scikit-learn.
When I run this code
using MLJ, MLJModelInterface
using LinearAlgebra
import MLJBase
const MMI = MLJModelInterface
MMI.@mlj_model mutable struct KRRModel <: MLJModelInterface.Deterministic
    mu::Float64 = 1::(_ > 0)
    kernel::String = "linear"
end
function MMI.fit(m::KRRModel, verbosity::Int, K, y)
    K = MLJBase.matrix(K)
    fitresult = inv(K + m.mu*I)*y   # coefficients alpha = (K + mu*I)^(-1) y, one per training sample
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end
N = 10
K = randn(N,N)
K = K*K' # make K symmetric positive semidefinite, i.e. a valid kernel matrix
a = randn(N)
y = K*a + 0.2*randn(N)
m = KRRModel()
kregressor = machine(m,K,y)
cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)
evaluate!(kregressor, resampling=cv, measure=rms, verbosity=1)
the evaluate! function evaluates the machine on different subsets of the rows of K. By the representer theorem, a kernel ridge regression has one coefficient per training sample, so the reduced matrix K[train_rows, train_rows] can be used for fitting instead of K[train_rows, :].
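To make that concrete, here is a minimal sketch of the underlying algebra (assuming K, y and a regularisation parameter mu as above, with train and test being row indices; not MLJ-specific):
using LinearAlgebra
train, test = 1:7, 8:10
mu = 1.0
alpha = (K[train, train] + mu*I) \ y[train] # one coefficient per training sample
yhat = K[test, train] * alpha               # prediction only needs the test-vs-train block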
To indicate that I'm using a precomputed kernel matrix, I set m.kernel = "". How do I make evaluate! select the columns as well as the rows, so that the smaller square matrix is formed when m.kernel == ""?
This is my first time using MLJ and I’d like to make as few modifications as possible.
I’m not too familiar with kernel ridge regression, but perhaps you can do something like this:
function MMI.fit(m::KRRModel, verbosity::Int, K, y)
    if m.kernel == ""
        K = # construct the smaller matrix
    else
        K = MLJBase.matrix(K)
    end
    fitresult = inv(K + m.mu*I)*y
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end
Thanks, this could work if I knew the indices of the chosen rows that are passed to fit!
The smaller matrix is built by choosing a subset of rows and columns and taking their intersection. That is, if K is 10x10 and train_idx = [1, 4, 6], then the smaller matrix is the 3x3 K[train_idx, train_idx].
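In code, just to illustrate (a toy snippet, not part of the model):
K = randn(10, 10)
train_idx = [1, 4, 6]
K_small = K[train_idx, train_idx] # 3x3 block used for training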
Hence, since I need the indices, I wonder whether I should define a new sampler, or override whatever function passes the data to fit!, in order to obtain the smaller square matrix to be used in training.
The intended use of evaluate! is to estimate the generalisation error associated with some supervised learning model by subsampling observations, as in cross-validation, which is a common use-case. I'm afraid there is no natural way for evaluate! to do feature subsampling.
That said, if all you need is to generate (train, test) pairs of row indices according to some resampling strategy, such as resampling=CV(nfolds=3), you can call MLJ.MLJBase.train_test_pairs, as in the example below.
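Something along these lines should work (a minimal sketch; N is the number of rows, and the exact folds depend on the shuffle/rng settings of the strategy):
using MLJ
cv = CV(nfolds=3)
pairs = MLJ.MLJBase.train_test_pairs(cv, 1:N) # vector of (train_rows, test_rows) tuples
for (train, test) in pairs
    K_train = K[train, train] # square kernel block used for fitting
    K_cross = K[test, train]  # cross block used for predicting on the test rows
end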
I see, thanks for the help! I guess I’ll do the train/test loop as usual.
I’d actually already checked the partial LS package, but it calculates the kernel matrix each time the train/predict functions are called, something I want to avoid by using a precomputed matrix.
In case anyone comes across this: for now I've achieved what I wanted by setting the complete kernel matrix as a model parameter, which is OK for me since I'm using fairly small matrices. The smaller kernel matrix is then built inside fit/predict, as @CameronBieganek suggested; the indices of the chosen rows/columns are passed in place of the feature matrix. Here's the code for the kernel regressor:
using MLJ, MLJModelInterface
using Random, LinearAlgebra
import MLJBase
const MMI = MLJModelInterface
MMI.@mlj_model mutable struct KRRModel <: MLJModelInterface.Deterministic
    mu::Float64 = 1::(_ > 0)
    kernel::String = ""
    kernel_matrix::Array{Float64,2} = zeros(1,1)
end
function MMI.fit(m::KRRModel, verbosity::Int, x, y)
    x = x[:]                           # the "features" are just row/column indices into the kernel matrix
    K = m.kernel_matrix[x,x]           # square kernel block for the training indices
    fitresult = (x, inv(K + m.mu*I)*y) # store the indices together with the coefficients
    cache = nothing
    report = nothing
    return (fitresult, cache, report)
end
function MMI.predict(m::KRRModel, fitresult, xnew) # `m` is needed to access the stored kernel matrix
    x = xnew[:]                        # indices of the new samples
    samples, coef = fitresult
    K = m.kernel_matrix[x,samples]     # cross-kernel block: new indices vs training indices
    return K*coef
end
function MMI.clean!(m::KRRModel)
    return 1
end
And the code to test it
N = 1000
svar = 0
K = randn(N,3)
K = K*K'
a = randn(N)
y = K*a + svar*randn(N)
cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)
m = KRRModel(;mu=0.1e-8, kernel="", kernel_matrix = K)
x = MLJBase.matrix(collect(1:N)[:,:])
kregressor = machine(m,x,y)
evaluate!(kregressor, resampling=cv, measure=rms, verbosity=1)
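And, in case it helps, the machine can also be used directly with manual train/test indices (a quick sketch continuing from the code above; the split is chosen arbitrarily here):
train, test = 1:800, 801:1000
fit!(kregressor, rows=train)           # fit uses only kernel_matrix[train, train] internally
yhat = predict(kregressor, x[test, :]) # predict uses kernel_matrix[test, train]
rms(yhat, y[test])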
Nice. FYI, clean! (if it is overloaded, which is optional) should return a string: a message explaining what has been mutated to enforce invariants for the struct. The fallback returns "".
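For what it's worth, a minimal clean! along those lines might look like this (just a sketch; the check on mu is illustrative):
function MMI.clean!(m::KRRModel)
    warning = ""
    if m.mu <= 0
        warning *= "Parameter `mu` must be positive; resetting to 1.0. "
        m.mu = 1.0
    end
    return warning
end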