People might use the term `model` to refer to either the ‘abstract model’ or the ‘trained model’. For example, according to Microsoft documentation, `model` refers to the learned model, but Lux uses `model` to refer to the abstract model plus hyperparameters.
I don’t mind the Lux approach: using `Model` to refer to the abstract model (which includes the hyperparameters) and `ModelArtifact` or `LearnedModel` to refer to the artifact needed to generate predictions (which contains the learned parameters, weights, coefficients, etc.).
MLOps people usually use `model` for the ‘abstract model that one can instantiate’ and `model artifact` for the file or files someone in production needs to productionize a model (which can be a folder containing the hyperparameters, the learned parameters, and any metadata or other information needed to generate predictions).
I have heard `Model` used to refer to both the abstract model and the learned model, but I have never seen `model artifact` or `learned model` used to refer to a model that has not been trained.
This naturally leads to pairs of concepts that define each other (a rough sketch follows below):

- `Model` vs `ModelArtifact`:
  - `Model` is the abstract model defined by its hyperparams.
  - `ModelArtifact` refers to the model + learned params.
- `ModelHyperparams` vs `Model`:
  - `ModelHyperparams` refers to the model hyperparams.
  - `Model` refers to the model + learned params.
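For concreteness, here is a minimal sketch of what the first pair could look like for ridge regression; the struct names are only illustrative and not part of any existing API:

```julia
struct RidgeRegressorModel            # "Model": the abstract model, i.e. the hyperparameters
    lambda::Float64
end

struct RidgeRegressorModelArtifact    # "ModelArtifact": the abstract model plus its learned parameters
    model::RidgeRegressorModel
    coefs::Vector{Float64}
end
```

The second pair (`ModelHyperparams` vs `Model`) is essentially what the ridge implementation further down uses, with `RidgeRegressorHyperparams` playing the role of the hyperparameter struct.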
Could the previous naming-convention pairs confuse anyone if they were used together in the same context?
I like the design that @CameronBieganek proposed, but I would not use `Options`: there is a word that everyone in ML understands, which is `HyperParameters` or `HyperParams`.
Following this nomenclature, I would implement ridge regression as follows:
```julia
using LearnAPI
using LinearAlgebra   # for I
using Tables

struct RidgeRegressorHyperparams
    lambda::Float64
end

RidgeRegressorHyperparams(; lambda=0.1) = RidgeRegressorHyperparams(lambda)

struct RidgeRegressor
    hyperparams::RidgeRegressorHyperparams
    coefs::Vector{Float64}
    importances::Vector{Pair{Symbol, Float64}}
end

function LearnAPI.fit(hyperparams::RidgeRegressorHyperparams, X, y; verbosity=0)
    x = Tables.matrix(X)
    s = Tables.schema(X)
    features = s.names

    # ridge solution: (XᵀX + λI) coefs = Xᵀy
    coefs = (x'x + hyperparams.lambda*I) \ (x'y)

    # rank features by the absolute value of their coefficient
    importances = [features[j] => abs(coefs[j]) for j in eachindex(features)]
    sort!(importances, by=last, rev=true)

    verbosity > 0 && @info "Features in order of importance: $(first.(importances))"

    RidgeRegressor(hyperparams, coefs, importances)
end

LearnAPI.predict(model::RidgeRegressor, Xnew) = Tables.matrix(Xnew) * model.coefs

LearnAPI.feature_importances(model::RidgeRegressor) = model.importances
```
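And a hypothetical usage example under this nomenclature (the `DataFrame` and the data are made up for illustration; any Tables.jl-compatible source would do):

```julia
using DataFrames

# synthetic data for illustration only
X = DataFrame(x1 = randn(100), x2 = randn(100))
y = X.x1 .+ 0.5 .* X.x2 .+ 0.1 .* randn(100)

hyperparams = RidgeRegressorHyperparams(lambda = 0.5)
model = LearnAPI.fit(hyperparams, X, y; verbosity = 1)

ŷ = LearnAPI.predict(model, X)
LearnAPI.feature_importances(model)
```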