RMSprop and Early stopping in Flux.jl

charlesll · February 22, 2019, 3:42am

Hi,

I’m playing with Flux.jl and results are really nice.

At the moment, I would like to use the RMSprop solver with early stopping. Does anyone have experience with implementing those in Flux?

Thanks!

charlesll · February 24, 2019, 3:31am

OK, I’m going to reply to the second part of my question myself: just use a for loop.

I was not sure that for loops were the way to go, as nothing is mentioned about them in the Flux docs, but they seem to work well.

For those interested, you can grab the loss for the train and valid sets using the callbacks, and then set a condition for early stopping as a function of the losses. A dummy example of the code (using 5-fold CV with MLDataUtils) looks something like:

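fold_select = 1
early_stop = 0
record_loss_n_train = Float64[] # training loss history
record_loss_n_valid = Float64[] # validation loss history
for epoch_idx in 1:nb_epochs
    # needed if this loop runs at the top level of a script; drop it inside a function
    global fold_select, early_stop

    train, valid = folds[fold_select] # selection of the datasets

    evalcb = () -> (push!(record_loss_n_train, loss(train).data),
                    push!(record_loss_n_valid, loss(valid).data))

    Flux.train!(loss, params(m), train, opt, cb = throttle(evalcb, 1))

    fold_select += 1 # for selecting the K-fold between 1 and 5
    if fold_select >= 6
        fold_select = 1
    end

    # for early stop: compare the two most recent recorded validation losses
    # (the throttled callback may record more than one entry per epoch,
    # and there is no previous entry to compare with at the very start)
    if length(record_loss_n_valid) >= 2 && record_loss_n_valid[end] > record_loss_n_valid[end-1]
        early_stop += 1 # cumulative count of increases, never reset
    end
    if early_stop > 100
        break
    end
end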
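
The counter above is cumulative: it adds up every increase of the validation loss and never resets. A common variant is a "patience" check on the recorded history, which asks whether anything in the last few entries beat the best loss seen before them. Below is a minimal sketch of that variant in plain Julia; the helper name no_recent_improvement and the toy history are made up for illustration and are not part of Flux.

```julia
# Hypothetical helper: true when the best validation loss among the last
# `patience` entries is no better than the best loss recorded before them.
function no_recent_improvement(valid_losses::AbstractVector{<:Real}, patience::Integer)
    n = length(valid_losses)
    n <= patience && return false                    # not enough history yet
    best_before = minimum(valid_losses[1:n-patience])
    best_recent = minimum(valid_losses[n-patience+1:end])
    return best_recent >= best_before
end

# Toy loss history (made-up numbers): the minimum is reached at the third entry.
history = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75]
println(no_recent_improvement(history, 3))
println(no_recent_improvement(history, 5))
```

Inside the epoch loop, a call such as no_recent_improvement(record_loss_n_valid, 10) && break could stand in for the counter.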

dellison · March 7, 2019, 7:56pm

I think the “Flux-iest” way to implement this would be to have your training callback function check for the early stopping condition, and then have it call Flux.stop() when the condition is met and you want to break out of the loop.

I’m surprised that this feature isn’t in the documentation! No wonder it wasn’t obvious to you. I plan to put together a PR this weekend to update the docs, unless somebody else beats me to it.
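
A minimal sketch of that idea, assuming a Flux release where train! still accepts the cb keyword and Flux.stop() exists (roughly 0.10 to 0.13; newer releases deprecate or drop them, so check your version). The toy model, the random data, and the names best, stale, and patience are made up for illustration.

```julia
using Flux  # assumption: a release that still has `cb` and Flux.stop()

# Hypothetical toy model and random data, only for illustration
m = Dense(2, 1)
loss(x, y) = Flux.mse(m(x), y)
X_train, y_train = rand(Float32, 2, 64), rand(Float32, 1, 64)
X_valid, y_valid = rand(Float32, 2, 16), rand(Float32, 1, 16)
data = Iterators.repeated((X_train, y_train), 1000)  # at most 1000 training steps
opt = RMSProp()

best = Ref(Inf)   # best validation loss seen so far
stale = Ref(0)    # callback calls since the last improvement
patience = 5

function early_stop_cb()
    l = loss(X_valid, y_valid)
    if l < best[]
        best[] = l
        stale[] = 0
    else
        stale[] += 1
    end
    stale[] > patience && Flux.stop()  # ends the current train! call
end

Flux.train!(loss, Flux.params(m), data, opt, cb = early_stop_cb)
```

The state lives in Ref containers so the callback mutates them rather than assigning to global variables. Wrapping the callback with Flux.throttle would check the condition less often.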

charlesll · March 8, 2019, 5:23am

Indeed, this is very good. This allows early stopping without a loop. However, I must say that using a loop works well and gives a lot of low-level control…

And RMSprop and tons of other solvers are available. Again, they are in the source code but not indicated in the docs.

We should also open pull requests for adding RMSprop and the other solvers in the docs.
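
For reference, the optimiser is spelled RMSProp in Flux (capital P) and is passed to Flux.train! like any other optimiser, for example opt = RMSProp(0.001, 0.9). The block below is a hand-written sketch of the update rule in plain Julia, not Flux's source code; the function name rmsprop_step! and the toy problem are assumptions made for illustration.

```julia
# Hand-written sketch of one RMSProp step (not Flux's implementation).
# η: learning rate, ρ: decay of the squared-gradient average, ϵ: small constant.
function rmsprop_step!(x, acc, grad; η = 0.001, ρ = 0.9, ϵ = 1e-8)
    @. acc = ρ * acc + (1 - ρ) * grad^2   # running average of squared gradients
    @. x -= η * grad / (sqrt(acc) + ϵ)    # per-coordinate scaled step
    return x
end

# Toy problem (made up): minimise f(x) = sum(x.^2), whose gradient is 2x.
x = [1.0, -2.0]
acc = zeros(2)
for _ in 1:2000
    rmsprop_step!(x, acc, 2 .* x; η = 0.01)
end
println(x)  # both coordinates should end up close to zero
```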

dellison · March 8, 2019, 6:15pm

However, I must say that using a loop works well and gives a lot of low-level control…

Yeah, definitely! I like that about Flux: it gives you a huge amount of flexibility.

Volker · April 30, 2020, 2:14pm

Hi,

I would like to create a callback in Flux that does something similar to Keras’s EarlyStopping callback with restored weights (saving the best model according to the test data and stopping the optimization if there has been no improvement for a defined number of checks), but the loss_test_tmp variable isn’t known by the callback. Actually, I don’t understand why the callback function knows my test data X_test and y_test but not loss_test_tmp. I define all of them in the same scope in a script.

function evalcb()
    @show loss_test = loss(X_test, y_test)
    if loss_test < loss_test_tmp
        loss_test_tmp = loss_test
        ct = 0
        BSON.@save joinpath(@__DIR__, "model-checkpoint.bson") m
    else
        ct += 1
    end
    if ct > patience
        println("Optimization will be stopped!")
        Flux.stop()
    end
end

I have done it with a custom training function, but it is much slower.

CarloLucibello · April 30, 2020, 3:03pm

I suggest you define your own training loop; it won’t be any slower than using Flux.train! with callbacks, and it will probably be much clearer. Here are some examples from the model-zoo:

https://github.com/FluxML/model-zoo/blob/master/vision/lenet_mnist/lenet_mnist.jl

https://github.com/FluxML/model-zoo/blob/master/vision/dcgan_mnist/dcgan_mnist.jl
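
As a rough illustration of what such a loop can look like for the "keep the best model, stop after no improvement" case, here is a model-agnostic sketch in plain Julia. It is not taken from the model-zoo; the function train_with_patience and its three closure arguments are hypothetical names, and the one-weight toy "model" only stands in for a real network.

```julia
# Generic early stopping with "restore best" (all names are hypothetical):
#   train_epoch!() runs one pass over the training data,
#   val_loss()     returns the current validation loss,
#   snapshot()     returns a copy of the current model state.
function train_with_patience(train_epoch!, val_loss, snapshot; max_epochs = 100, patience = 5)
    best_loss, best_state, stale = Inf, snapshot(), 0
    for epoch in 1:max_epochs
        train_epoch!()
        l = val_loss()
        if l < best_loss
            best_loss, best_state, stale = l, snapshot(), 0  # new best: checkpoint it
        else
            stale += 1
            stale > patience && break                        # patience exhausted
        end
    end
    return best_state, best_loss
end

# Toy stand-in for a model: one weight trained towards 3 while validation prefers 2,
# so the validation loss first falls and then rises again.
w = Ref(0.0)
best_w, best_l = train_with_patience(
    () -> (w[] -= 0.2 * (w[] - 3)),  # gradient step on (w - 3)^2 with learning rate 0.1
    () -> (w[] - 2)^2,               # validation loss
    () -> w[];                       # snapshot of the state
    patience = 3)
println((best_w, best_l))
```

With Flux, train_epoch! could wrap a call to Flux.train! (or a hand-written gradient and update! loop), and snapshot could be something like () -> deepcopy(m).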


Volker · May 2, 2020, 8:47am

Thanks. I will have a look.

sdwfrost · June 6, 2020, 2:57pm

It would still be nice to know why the callback approach isn’t working; any insights here?
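
This looks like Julia's scoping rule for assignments inside functions: a function can read a global variable freely, but as soon as it assigns to a name, that name becomes a new local for the whole function body unless it is declared global. In the callback above, X_test and y_test are only read, while loss_test_tmp and ct are assigned, so they are treated as locals and are undefined when first read. A minimal sketch in plain Julia, with made-up names (best, data) standing in for the variables in that post:

```julia
best = Inf          # global that the callback wants to update (like loss_test_tmp)
data = [3.0, 1.0]   # global that is only read (like X_test, y_test)

function broken_cb()
    l = sum(data)   # reading a global works
    if l < best     # `best` is assigned below, so it is local to this function
        best = l    # and the read above fails with an UndefVarError
    end
    return l
end

function fixed_cb()
    global best     # assign to the global instead of creating a local
    l = sum(data)
    if l < best
        best = l
    end
    return l
end

err = try
    broken_cb()
    nothing
catch e
    e
end
println(err isa UndefVarError)

fixed_cb()
println(best)
```

Keeping the state in a Ref, or building the callback inside a function so that it closes over local variables, should avoid the global declaration altogether.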
