Defining MLJ pipelines within a function

ablaom · May 4, 2021, 11:26pm

Question asked on slack:

Hello, I’m new to Julia and MLJ. Is there a recommended way to compose models within functions? I tried placing the code from the “Lightning Tour” within a function of my own package. However, when compiling that package I get “iterated_booster not defined”, even though I can see it is defined (or at least I think it is…).

Here’s the relevant code posted by the commenter:

module Slotter
 
using MLJ
using MLJIteration
using EvoTrees
 
function main()
    Booster = @load EvoTreeRegressor # loads code defining a model type
    booster = Booster(max_depth = 2)   # specify hyper-parameter at construction
    booster.nrounds = 50               # or mutate post facto
    
    iterated_booster = IteratedModel(
        model = booster,
        resampling = Holdout(fraction_train = 0.8),
        controls = [Step(2), NumberSinceBest(3), NumberLimit(300)],
        measure = l1,
        retrain = true,
    )
 
    pipe = @pipeline ContinuousEncoder iterated_booster
 
    max_depth_range =
        range(pipe, :(deterministic_iterated_model.model.max_depth), lower = 1, upper = 10)
 
    self_tuning_pipe = TunedModel(
        model = pipe,
        tuning = RandomSearch(),
        ranges = max_depth_range,
        resampling = CV(nfolds = 3, rng = 456),
        measure = l1,
        acceleration = CPUThreads(),
        n = 50,
    )
 
    X, y = @load_reduced_ames
    mach = machine(self_tuning_pipe, X, y)
    evaluate!(
        mach,
        measures = [l1, l2],
        resampling = CV(nfolds = 5, rng = 123),
        acceleration = CPUThreads(),
        verbosity = 2,
    )
end

end # module Slotter

ablaom · May 4, 2021, 11:29pm

The problem is that the @pipeline macro evaluates its arguments in global scope. In your code, @pipeline ContinuousEncoder iterated_booster throws an error because iterated_booster is defined inside the function and not in global scope. There is an open issue (which I couldn’t find just now) to re-implement pipelines without macros. In the meantime, there are various workarounds. The most robust would be to use the more general learning network syntax described in this manual section, “exporting” your learning network using Method II (no macro). There is admittedly a wee bit of a learning curve here. Simpler workarounds might exist, but they will depend on what exactly you want to do.
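To make the scoping problem concrete, here is a minimal sketch in plain Julia with no MLJ involved. The macro `@eval_globally` and both function names are hypothetical and exist only for this illustration. This is not MLJ's actual implementation of @pipeline; it only mimics a macro that evaluates its argument in the global scope of the calling module.

```julia
# Hypothetical macro, for illustration only: it evaluates its argument
# in the global scope of the module where the macro is called.
macro eval_globally(ex)
    return :(Core.eval($(__module__), $(QuoteNode(ex))))
end

function lookup_local()
    local_model = "defined only inside the function"
    try
        # `local_model` is looked up as a global here, so the lookup fails.
        @eval_globally local_model
    catch err
        return err
    end
end

global_model = "defined at top level"

# The same macro succeeds when the name exists in global scope.
lookup_global() = @eval_globally global_model

err = lookup_local()
println(typeof(err))       # the error raised for the function-local name
println(lookup_global())   # the value of the global
```

The local variable is invisible to the macro for the same reason iterated_booster is in the code above: the name exists only inside the function body, while the lookup happens at module level.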

CameronBieganek · May 5, 2021, 2:57am

Here’s the relevant GitHub issue, for anyone interested:

https://github.com/alan-turing-institute/MLJ.jl/issues/594
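For later readers, here is a hedged sketch of building the same pipeline inside a function without a macro. It assumes a recent MLJ release in which models can be composed with `|>`; that is an assumption about your installed version, and it was not an option when this thread was written. The function name `build_pipe` is hypothetical. Using `EvoTrees.EvoTreeRegressor` directly instead of `@load` is a substitution made for this sketch, not something recommended in the thread.

```julia
using MLJ
import EvoTrees

# Hypothetical helper: builds the pipeline from the question without @pipeline.
function build_pipe()
    # Construct the model from its package directly rather than via @load.
    booster = EvoTrees.EvoTreeRegressor(max_depth = 2, nrounds = 50)

    iterated_booster = IteratedModel(
        model = booster,
        resampling = Holdout(fraction_train = 0.8),
        controls = [Step(2), NumberSinceBest(3), NumberLimit(300)],
        measure = l1,
        retrain = true,
    )

    # Assumes an MLJ version where `|>` composes model instances into a pipeline.
    return ContinuousEncoder() |> iterated_booster
end

pipe = build_pipe()
```

Because everything here is an ordinary function call on local variables, nothing depends on global scope. Field names used for tuning ranges may differ from those in the macro version, so inspect `pipe` before writing a `range(...)` expression.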

mwlp · May 5, 2021, 5:06pm

Awesome, thanks to everyone
