[ANN] LearnAPI.jl - Proposal for a basement-level machine learning API

tecosaur · October 3, 2024, 3:11am

I’ve had a quick read of the proposal, and a bit of the current interface code.

Overall, it looks brilliant @ablaom, so glad you’ve taken the time to continue this effort. I’ve only got a few niggles that I’d like to raise:

Cameron’s minimize function could, I think, be better named something like trim, shrink, or even downsize or deflate. I think those convey the intent more effectively; otherwise I’m prone to thinking of model error minimisation and minimising objectives.
In src/types.jl, I see abstract Finite, Iterable, and FiniteIterable types. These look like interface-kind types, and could perhaps be better served by boolean trait functions (e.g. Tables.istable) or Holy-style trait functions?
I do still think that, for a package named LearnAPI for machine learning models, calling the objects that “learn” from data “learners” rather than “algorithms” is more apt (regardless of how well they generalise), but it seems like this mostly affects the docs now, and I don’t mean to re-commence the bikeshedding.
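
To make the trait suggestion concrete, here is a minimal sketch of the Holy-style pattern, modelled on how Base handles IteratorSize. Every name in it (IterationStyle, IsFinite, RidgeLike, OnlineMean, describe, is_iterable) is hypothetical, and what the existing Finite/Iterable types actually mean in src/types.jl is a guess; this is not LearnAPI.jl code.

```julia
# Hypothetical names throughout; none of this is LearnAPI.jl's actual code.

# Trait values describing how a learner consumes data.
abstract type IterationStyle end
struct IsFinite <: IterationStyle end
struct IsIterable <: IterationStyle end
struct IsFiniteIterable <: IterationStyle end

# Trait function: every type defaults to IsFinite; a learner opts in to
# something else by adding one method, with no supertype required.
IterationStyle(::Type) = IsFinite()
IterationStyle(x) = IterationStyle(typeof(x))

# Two unrelated learner types that share no abstract supertype.
struct RidgeLike
    lambda::Float64
end
struct OnlineMean end

IterationStyle(::Type{OnlineMean}) = IsIterable()

# Behaviour is selected by dispatching on the trait value.
describe(learner) = describe(IterationStyle(learner), learner)
describe(::IsFinite, learner) = "finite"
describe(::IsIterable, learner) = "iterable"

# A boolean trait function (in the spirit of Tables.istable) falls out for free.
is_iterable(learner) = IterationStyle(learner) isa Union{IsIterable, IsFiniteIterable}
```

The appeal is that a learner keeps its single supertype slot free, and a package can declare the trait for a type it does not own.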

ablaom · October 4, 2024, 5:03am

The name for the function that maps learning outcomes (the output of fit) to something suitable for serialisation. At the time of writing, this is minimize in the docs:

trim
minimize
deflate
downsize

CameronBieganek · October 4, 2024, 1:17pm

I voted for trim, but I might actually prefer shrink, which was one of the options @tecosaur mentioned.

rdavis120 · October 4, 2024, 2:01pm

I think strip() was used in R models in the past.

ablaom · October 4, 2024, 10:10pm

Thanks for the votes. Sorry, my bad about missing options. Let’s try again:

trim
shrink
strip
deflate
minimize
downsize

adienes · October 4, 2024, 10:17pm

(I would always recommend polls like this to be multiple choice)

DoktorMike · October 5, 2024, 7:51am

I would not use shrink in this setting as shrinkage is an established term for penalization in regression models.

ablaom · October 6, 2024, 12:35am

Okay, here’s one more. In LearnAPI I currently have inverse_transform, “broadly” understood. I’ve currently said this is any right inverse or approximate right inverse for transform (maybe any one-sided inverse should be allowed??). This is the same name used in scikit-learn and MLJ, and so it has a lot of inertia for me personally. No promises to change it, but I’d like to know what people think of the name.

I think StatsAPI uses reconstruct and TableTransforms uses revert (transform is apply). Any other name suggestions or comments before I poll this?
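
For anyone unsure what “right inverse” means here, a toy sketch may help. It assumes a made-up dimension-reducing transformer, KeepFirst, that keeps the first k coordinates; the type and both functions are hypothetical stand-ins defined locally, not the LearnAPI.jl methods.

```julia
# Hypothetical toy transformer: keep only the first `k` of `n` coordinates.
struct KeepFirst
    k::Int   # number of coordinates kept
    n::Int   # dimension of the original vectors
end

# Stand-ins defined locally for illustration; not LearnAPI.jl methods.
transform(m::KeepFirst, x::AbstractVector) = x[1:m.k]
inverse_transform(m::KeepFirst, z::AbstractVector) =
    vcat(z, zeros(eltype(z), m.n - m.k))   # pad the dropped coordinates with zeros

m = KeepFirst(2, 3)
x = [1.0, 2.0, 3.0]
z = transform(m, x)

# Right inverse: transforming a reconstruction gives back the reduced vector.
right_ok = transform(m, inverse_transform(m, z)) == z

# Not a left inverse: the dropped coordinate cannot be recovered.
left_ok = inverse_transform(m, transform(m, x)) == x
```

Here right_ok is true and left_ok is false, which is the situation for any lossy reduction: inverse_transform can undo transform on one side only.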

era127 · October 6, 2024, 4:47am

I preferred the way it was done with TransformedTargetModel, where it would be a regular algorithm with fit() and predict(), and the parameters to that algorithm would include the inner regression algorithm along with the transformer function and inverse function. In that case they are just names of algorithm parameters, and maybe not needed in the API.

Also, on the proposed minimize() function: some of the suggested names suggest the trained model is mutated. If the model is an immutable struct, then instead of minimize() maybe it could just be another accessor function to get the struct for the trained algorithm, which represents the information required by predict().
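
A rough sketch of that accessor idea, assuming a toy linear model. FittedLinear, LinearPredictor, predictor and the local predict are all hypothetical names made up for illustration, not part of LearnAPI.jl.

```julia
# Hypothetical names; not LearnAPI.jl's API.

# Full output of training: what predict needs, plus bulky extras.
struct FittedLinear
    coefficients::Vector{Float64}
    training_residuals::Vector{Float64}   # large, only useful for reports
end

# The minimal immutable struct that predict actually requires.
struct LinearPredictor
    coefficients::Vector{Float64}
end

# Accessor: builds a new, smaller object and leaves `model` untouched.
# Note that it shares (does not copy) the coefficient vector.
predictor(model::FittedLinear) = LinearPredictor(model.coefficients)

predict(p::LinearPredictor, X::AbstractMatrix) = X * p.coefficients
predict(model::FittedLinear, X::AbstractMatrix) = predict(predictor(model), X)

model = FittedLinear([2.0, -1.0], randn(10_000))
small = predictor(model)          # this is the object one would serialise
X = [1.0 0.0; 0.0 1.0; 1.0 1.0]
same = predict(small, X) == predict(model, X)
```

With this shape nothing is mutated, so a noun-like accessor name reads more naturally than a verb such as trim or strip.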

CameronBieganek · October 7, 2024, 2:12am

invert? (“broadly” understood)

ablaom · October 7, 2024, 11:38pm

Preferred name for the method currently called inverse_transform:

inverse_transform
invert
revert
reconstruct

ablaom · October 8, 2024, 12:49am

The trait LearnAPI.functions returns a list of functions that can be applied either to the algorithm struct (e.g. fit) or to the output of fit (e.g. predict). See here. Should this be called methods instead?

functions
methods
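
For context on the question, a minimal sketch of what such a trait could look like. It uses locally defined stand-ins (MeanLearner, FittedMean, fit, predict, functions, supports) that are assumptions for illustration; the real LearnAPI.functions may return something different.

```julia
# Hypothetical stand-ins; LearnAPI.jl's real trait may differ.
struct MeanLearner end

struct FittedMean
    mean::Float64
end

fit(::MeanLearner, y) = FittedMean(sum(y) / length(y))
predict(model::FittedMean, n::Integer) = fill(model.mean, n)

# The trait: which functions apply to this learner or to the output of fit?
functions(::MeanLearner) = (fit, predict)

# Example query built on the trait.
supports(learner, f) = f in functions(learner)

has_predict = supports(MeanLearner(), predict)
has_sum = supports(MeanLearner(), sum)
```

In Julia terminology the returned items are generic function objects, whereas a method is one implementation of a function for a particular signature (what Base's methods(f) lists).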

ablaom · February 19, 2025, 9:08pm

Please see this announcement for the first stable release of LearnAPI.jl.
