Yes, that could be interesting, but I should be a bit clearer: our syntax for pipelines and learning networks already allows all of this, if I’m not mistaken, and is not much more complicated. Additionally (and very importantly), it gives names to operations, which is necessary for hyperparameter tuning.

When I initially looked at your package I had something a bit different in mind: a syntax for defining new features based on an initial table. This is a very common workflow, and while it is already possible with MLJ, I feel there could be a way (possibly a macro, but not necessarily) to do it more easily for a range of simple cases.

The process as I imagine it would be to go from `X -> X'`, where `X'` has some or all of `X`’s columns as well as “derived” features. A trivial example: `X` has three columns, `x, y, z`, and we want to get `X'` with six columns, `x, y, z, x^2, y^2, z^2`. Of course you may want to do this with more complicated functions / combinations, but this seems to me like a very common workflow. Having a way to neatly define how to get these derived features and feed the lot to a learning network or pipeline would be great.
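To make the example concrete, here is a minimal sketch in plain Julia (no MLJ, and the `add_squares` helper name is my own invention) of deriving the squared columns from a column table:

```julia
# X as a column table: a NamedTuple of vectors (compatible with Tables.jl).
X = (x = [1.0, 2.0], y = [3.0, 4.0], z = [5.0, 6.0])

# Build X′ by appending, for each column, a new column holding its square.
function add_squares(X)
    squared = NamedTuple(Symbol(k, "_sq") => v .^ 2 for (k, v) in pairs(X))
    return merge(X, squared)
end

X′ = add_squares(X)
keys(X′)  # (:x, :y, :z, :x_sq, :y_sq, :z_sq)
```

The hypothetical syntax would just be sugar over transforms like this one, but with each derived feature named so it can be targeted by hyperparameter tuning.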

In a way, something similar to StatsModels.jl’s `@formula`, though probably more geared to explicitly extending the feature matrix.