FeatureSelector in MLJ Learning Networks

I am trying to build a simple learning network using MLJ, but FeatureSelector() is not recognized as a valid model when it is applied to a node. Is there an alternative?
More generally, is there a way to perform column-wise operations with nodes (e.g. removing or selecting columns), just as you can with rows?

Hello @petrost, could you list the versions of the various MLJ* packages you’re using?

Unless I misunderstand your question, with MLJ 0.4 and MLJModels 0.4, this should work fine:

using MLJ, DataFrames
X = DataFrame(a=[1,2,3],b=[3,4,5],c=["a","a","b"])
fs = FeatureSelector(features=[:a,:b])
m = machine(fs, X)
Xb = transform(m, X)

Thanks for the reply @tlienart! I have the latest versions of the MLJ* packages, and FeatureSelector works fine when used with DataFrames, as in your example.

However, it doesn’t work if the input is of type ‘Node’. For example:

using MLJ, DataFrames
X = DataFrame(a=[1,2,3],b=[3,4,5],c=["a","a","b"])
Xs = source(X)

stand_model = Standardizer()
stand = machine(stand_model, Xs)
W = transform(stand, Xs)

fs = FeatureSelector(features=[:a])
m = machine(fs, W)
W1 = transform(fs, W)

In the above example, W is already of type ‘Node’, and when I run the code I get

ERROR: LoadError: MethodError: no method matching transform(::FeatureSelector, ::Node{NodalMachine{Standardizer}})

Thanks for taking the time to detail the question. I’ve opened an issue (https://github.com/alan-turing-institute/MLJ.jl/issues/254), as this should work.

See the issue just cited for what I am confident is the resolution to the problem: a usage error, not a bug. Feel free to re-open the issue if this does not resolve it.
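
For readers who do not follow the link, the error message itself points at the likely fix: the last line of the failing example passes the model fs to transform, whereas the working DataFrame example earlier in the thread passes the machine m. Below is a minimal sketch of the corrected network. It assumes a recent MLJ release in which fit! can be called on a node and a node can be called like a function to obtain its value; details may differ in MLJ 0.4. The name Xout is introduced here only for illustration.

```julia
using MLJ, DataFrames

X = DataFrame(a=[1,2,3], b=[3,4,5], c=["a","a","b"])
Xs = source(X)                      # wrap the data in a source node

stand_model = Standardizer()
stand = machine(stand_model, Xs)    # machine bound to the source node
W = transform(stand, Xs)            # W is a Node, not a table

fs = FeatureSelector(features=[:a])
m = machine(fs, W)                  # machine bound to the node W
W1 = transform(m, W)                # pass the machine `m`, not the model `fs`

fit!(W1)                            # train the machines upstream of W1
Xout = W1()                         # call the node to get the selected columns
```

Because a machine can be bound to a node, FeatureSelector also covers the column-wise case asked about in the original question: the selection becomes one more node in the network.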

Thanks for posting!
