I tried to read through the Adding Models for General Use · MLJ article, but I can’t find any information on how I can expose the hyper-parameters in my model to MLJ.jl so that I can use MLJ’s infrastructure to do CV. Can someone give some info here so it’s easily searchable for future reference?
Hello, to interface with MLJ you basically need to tick the following boxes:
- import MLJBase in your package
- write a constructor for your model that meets the requirements for MLJ (basically a mutable struct which contains the hyperparameters of your model) and that is a subtype of either `Probabilistic` or `Deterministic` (in your case, a binary classifier without scores goes under `Deterministic`),
- write a `clean!` method which checks that the hyperparameters passed meet the constraints of your model,
- write a `fit`, a `fitted_params` and a `predict` method,
- add metadata using the `metadata_pkg` and `metadata_model` functions.
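To make the checklist concrete, here is a minimal sketch of such an interface. Everything below is hypothetical for illustration: `MyBinaryClassifier`, its `threshold` and `max_iter` hyperparameters, and the trivial "majority class" learner are all made up, and the `metadata_pkg`/`metadata_model` keyword names can vary slightly across MLJBase versions, so check the docstrings of your installed version:

```julia
import MLJBase

# Hypothetical model: holds only hyperparameters (no learned parameters).
mutable struct MyBinaryClassifier <: MLJBase.Deterministic
    threshold::Float64
    max_iter::Int
end

# Keyword constructor that validates the hyperparameters via clean!.
function MyBinaryClassifier(; threshold=0.5, max_iter=100)
    model = MyBinaryClassifier(threshold, max_iter)
    message = MLJBase.clean!(model)
    isempty(message) || @warn message
    return model
end

# clean! resets invalid hyperparameters and returns a warning string.
function MLJBase.clean!(model::MyBinaryClassifier)
    warning = ""
    if !(0 <= model.threshold <= 1)
        warning *= "threshold must lie in [0, 1]; resetting to 0.5. "
        model.threshold = 0.5
    end
    if model.max_iter < 1
        warning *= "max_iter must be positive; resetting to 100. "
        model.max_iter = 100
    end
    return warning
end

# fit returns (fitresult, cache, report); the fitresult holds the learned
# parameters. Here "training" is just picking the majority class of y.
function MLJBase.fit(model::MyBinaryClassifier, verbosity::Int, X, y)
    labels = unique(y)
    fitresult = labels[argmax([count(==(c), y) for c in labels])]
    return fitresult, nothing, NamedTuple()
end

MLJBase.fitted_params(::MyBinaryClassifier, fitresult) = (majority_class = fitresult,)

# predict consumes the fitresult produced by fit.
MLJBase.predict(::MyBinaryClassifier, fitresult, Xnew) =
    fill(fitresult, MLJBase.nrows(Xnew))

# Metadata (keyword names shown here follow older MLJBase releases; newer
# versions may use e.g. input_scitype/target_scitype instead of input/target):
MLJBase.metadata_pkg(MyBinaryClassifier,
    name = "MyPackage", uuid = "", url = "",
    julia = true, license = "MIT", is_wrapper = false)
MLJBase.metadata_model(MyBinaryClassifier,
    input = MLJBase.Table(MLJBase.Continuous),
    target = AbstractVector{<:MLJBase.Finite},
    descr = "majority-class baseline (illustration only)")
```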
To help you with this I would recommend considering:
- any of the examples in MLJModels.jl (for instance the XGBoost interface)
- the MLJLinearModels package which shows maybe more explicitly how you can write an interface from an external package
You’ll see that these interfaces all follow essentially the same pattern, so it should be reasonably easy to adapt to your case.
Note: with respect to writing a constructor + `clean!` method, you can also use the `@mlj_model` macro, which does some of the work for you; again, see the examples in MLJModels, for instance the interface for NearestNeighbors.jl.
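For example, the macro can generate the keyword constructor and the `clean!` logic from default values and constraints declared on the fields (`MyClassifier` and its fields are made up here; `_` stands for the field value in a constraint):

```julia
using MLJBase: @mlj_model, Deterministic

# Hypothetical model: defaults and constraints are declared inline, and
# @mlj_model writes the keyword constructor and validation for you.
@mlj_model mutable struct MyClassifier <: Deterministic
    threshold::Float64 = 0.5::(0 ≤ _ ≤ 1)
    max_iter::Int = 100::(_ > 0)
end

MyClassifier()                  # uses the defaults
MyClassifier(threshold = 2.0)   # constraint violated: warns and falls back to 0.5
```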
My question is more about CV and hyperparameter tuning. Each model type's hyperparameters are different, so how do I tell MLJ which hyperparameters to tune? That's the part I'm not sure about at the moment.
Once you have a working interface, the HP tuning via CV is done through a `TunedModel`; either see the docs or look at this tutorial using XGBoost for an example.
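A short sketch of what that looks like, assuming MLJ and DecisionTree.jl are installed (the `@load` semantics differ slightly between MLJ versions, so adjust to yours):

```julia
using MLJ

X, y = @load_iris                         # built-in demo dataset
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
tree = Tree()

# tune max_depth over 1:10 with 5-fold CV
r = range(tree, :max_depth, lower=1, upper=10)
tuned = TunedModel(model=tree,
                   tuning=Grid(resolution=10),
                   resampling=CV(nfolds=5),
                   range=r,
                   measure=cross_entropy)

mach = machine(tuned, X, y) |> fit!
fitted_params(mach).best_model            # the winning hyperparameter setting
```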
Edit:
- "how do I tell which are the hp to tune": MLJ considers all fields of the model struct to be hyperparameters that can be tuned
- "each type are a bit different": that's handled automatically; there are two scenarios: either the HP is numeric, in which case you specify a `lower=` and an `upper=` and MLJ works out an appropriate sampling that matches the type of the HP, or it's not numeric, in which case you specify a `values=[...]` (e.g. if the HP is a symbol, a string or a metric)
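The two scenarios side by side, using a made-up model with a numeric `lambda` and a non-numeric `solver` field (substitute your own model's fields):

```julia
using MLJ
import MLJBase

# Dummy model for illustration only.
mutable struct Dummy <: MLJBase.Deterministic
    lambda::Float64
    solver::Symbol
end
model = Dummy(1.0, :lbfgs)

# Numeric HP: give lower/upper (optionally a scale); MLJ infers an
# appropriate sampling from the field type (here Float64).
r1 = range(model, :lambda, lower=1e-3, upper=10.0, scale=:log10)

# Non-numeric HP: give the explicit values to try.
r2 = range(model, :solver, values=[:lbfgs, :newton])
```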
Yes, to emphasise the point already made by @tlienart, in MLJ a “model” struct only contains hyperparameters and not learned parameters. The learned parameters are part of the output of the model `fit` method you must implement (and labeled `fitresult` in the docs), and part of the input of the `predict` (or `transform`, etc.) method.
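That separation can be seen in a toy `Deterministic` model (the name and hyperparameter below are made up): the struct holds only the hyperparameter, while the learned quantity travels from `fit` to `predict` inside the `fitresult`:

```julia
import MLJBase

# Hypothetical model: `shrinkage` is the hyperparameter; the shrunken
# mean of y is the learned parameter, living in the fitresult.
mutable struct MeanScaler <: MLJBase.Deterministic
    shrinkage::Float64
end

function MLJBase.fit(model::MeanScaler, verbosity::Int, X, y)
    fitresult = model.shrinkage * sum(y) / length(y)  # learned parameter
    return fitresult, nothing, NamedTuple()           # (fitresult, cache, report)
end

MLJBase.fitted_params(::MeanScaler, fitresult) = (mean = fitresult,)

MLJBase.predict(::MeanScaler, fitresult, Xnew) =
    fill(fitresult, MLJBase.nrows(Xnew))
```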
Yeah. Implemented a first cut here https://github.com/xiaodaigh/JLBoost.jl#mljjl-integrations