[ANN] LearnAPI.jl - Proposal for a basement-level machine learning API

I’ve had a quick read of the proposal, and a bit of the current interface code.

Overall, it looks brilliant @ablaom, so glad you’ve taken the time to continue this effort. I’ve only got a few niggles that I’d like to raise:

  • Cameron’s minimise function I think could be better named something like trim, shrink, or even downsize or deflate. I think that’s more effective at conveying the intent. Otherwise I’m prone to thinking of model error minimisation, and minimising objectives :stuck_out_tongue:
  • In src/types.jl, I see abstract Finite, Iterable, and FiniteIterable types. These seem like interface-kind types and could perhaps be better served by boolean trait functions (e.g. Tables.istable) or Holy-style traits?
  • I do still think that as a package named LearnAPI for Machine Learning models, calling the objects that “learn” from data “learners” rather than “algorithms” is more apt (regardless of how well they generalise), but it seems like this mostly affects the docs now, and I don’t mean to re-commence the bikeshedding.
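
To make the trait suggestion concrete, here is a minimal sketch of both options. All names (`isfinite_data`, `Finiteness`, `finiteness`, `describe`) are hypothetical and chosen for illustration; none of them come from LearnAPI.jl, and the sketch makes no claim about what the abstract types in src/types.jl are actually used for.

```julia
# Hypothetical names throughout; none of this is LearnAPI.jl's actual code.

# Option 1: a boolean trait function, in the spirit of Tables.istable.
isfinite_data(::Any) = false             # conservative fallback
isfinite_data(::AbstractVector) = true   # vectors opt in

# Option 2: a Holy-style trait, dispatching on a trait type.
abstract type Finiteness end
struct IsFinite <: Finiteness end
struct NotFinite <: Finiteness end

finiteness(::Type) = NotFinite()                    # fallback
finiteness(::Type{<:AbstractVector}) = IsFinite()   # vectors opt in

# Behaviour is selected by the trait, not by the supertype of `x`.
describe(x) = describe(finiteness(typeof(x)), x)
describe(::IsFinite, x) = "finite collection of $(length(x)) items"
describe(::NotFinite, x) = "no finiteness guarantee"
```

With either option, a third-party type can opt in by adding one method, without having to subtype anything.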

Poll: the name for the function that maps learning outcomes (the output of fit) to something suitable for serialisation. At the time of writing, this is minimise in the docs:

  • trim
  • minimize
  • deflate
  • downsize

I voted for trim, but I might actually prefer shrink, which was one of the options @tecosaur mentioned.

I think strip() was used for R models in the past.

Thanks for the votes. Sorry, my bad about missing options. Let’s try again:

  • trim
  • shrink
  • strip
  • deflate
  • minimize
  • downsize

(I would always recommend polls like this to be multiple choice)

I would not use shrink in this setting as shrinkage is an established term for penalization in regression models.

Okay, here’s one more. In LearnAPI I currently have inverse_transform, “broadly” understood. I’ve currently said this is any right inverse or approximate right inverse for transform (maybe any one-sided inverse should be allowed??). This is the same name used in scikit-learn and MLJ, and so it has a lot of inertia for me personally. No promises to change it, but I’d like to know what people think of the name.

I think StatsAPI uses reconstruct and TableTransforms uses revert (transform is apply). Any other name suggestions or comments before I poll this?
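
For anyone unsure what “right inverse” means here, a minimal sketch: `Clamper` and its two methods are hypothetical, made up for this illustration, and are not LearnAPI.jl code. Clamping discards information, so no true inverse exists, but the identity map still satisfies the right-inverse property on the outputs of transform.

```julia
# Hypothetical transformer, not part of LearnAPI.jl: clamp values to [lo, hi].
struct Clamper
    lo::Float64
    hi::Float64
end

transform(c::Clamper, x) = clamp.(x, c.lo, c.hi)

# Clamping is lossy, so the best available is a one-sided inverse.
# The identity works as a right inverse on the outputs of `transform`.
inverse_transform(c::Clamper, w) = w

c = Clamper(0.0, 1.0)
x = [-0.5, 0.25, 2.0]
w = transform(c, x)

# Right-inverse property: transforming the "inverse" gives back w.
@assert transform(c, inverse_transform(c, w)) == w
# It is not a left inverse: the original x is not recovered.
@assert inverse_transform(c, transform(c, x)) != x
```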

I preferred the way it was done with TransformedTargetModel where it would be a regular algorithm with fit() and predict() and the parameters to that algorithm would include the inner regression algorithm along with the transformer function and inverse functions. In which case they are just names of algorithm parameters, and maybe not needed in the api.
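
A rough sketch of what I mean, assuming a toy inner algorithm. Every name below (`MeanRegressor`, `TransformedTarget`, `FittedMean`, `FittedTransformedTarget`) is hypothetical; this is not MLJ’s TransformedTargetModel implementation, only the general shape of the idea.

```julia
# Hypothetical sketch; not MLJ's TransformedTargetModel nor LearnAPI.jl code.
struct MeanRegressor end   # toy inner algorithm: predicts the training mean

struct TransformedTarget{A,F,G}
    inner::A         # inner regression algorithm
    transformer::F   # applied to the target before fitting
    inverse::G       # applied to predictions afterwards
end

struct FittedMean
    mean::Float64
end

struct FittedTransformedTarget{M,G}
    inner_model::M
    inverse::G
end

fit(::MeanRegressor, X, y) = FittedMean(sum(y) / length(y))
predict(m::FittedMean, X) = fill(m.mean, size(X, 1))

# The wrapper is an ordinary algorithm: the transformer and its inverse
# are just fields, so no separate inverse_transform method is involved.
fit(a::TransformedTarget, X, y) =
    FittedTransformedTarget(fit(a.inner, X, a.transformer.(y)), a.inverse)
predict(m::FittedTransformedTarget, X) = m.inverse.(predict(m.inner_model, X))
```

Usage would look like `fit(TransformedTarget(MeanRegressor(), log, exp), X, y)` followed by `predict(model, Xnew)`.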

Also, on the proposed minimize() function, some of the suggested names suggest the trained model is mutated. If the model is an immutable struct, instead of minimize(), maybe it could just be another accessor function to get the struct for the trained algorithm, which represents the information required by predict().
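
As a sketch of the accessor idea (all type and function names here, `RidgeFit`, `RidgePredictor`, and `predictor`, are hypothetical and not part of LearnAPI.jl):

```julia
# Hypothetical types and names; a sketch only, not LearnAPI.jl's interface.
struct RidgeFit                            # immutable output of a hypothetical fit
    coefficients::Vector{Float64}          # needed by predict
    training_residuals::Vector{Float64}    # diagnostic only, can be large
end

struct RidgePredictor                      # just what predict needs
    coefficients::Vector{Float64}
end

# Accessor: builds a new, smaller object; `model` itself is left untouched.
predictor(model::RidgeFit) = RidgePredictor(model.coefficients)

predict(p::RidgePredictor, X::AbstractMatrix) = X * p.coefficients
predict(model::RidgeFit, X::AbstractMatrix) = predict(predictor(model), X)
```

The object returned by `predictor` is what one would serialise, and nothing about the name implies the fitted model was changed.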

invert? (“broadly” understood)

Preferred name for the method currently called inverse_transform:

  • inverse_transform
  • invert
  • revert
  • reconstruct

The trait LearnAPI.functions returns a list of functions that can be applied either to the algorithm struct (e.g., fit) or to the output of fit (e.g., predict). See here. Should this be called methods instead?

  • functions
  • methods