Implementing learning rate scheduling in the NeuralPDE Julia package

Hi everyone, I have a question about implementing learning rate scheduling in the NeuralPDE framework during training. I want to progressively reduce the learning rate of the ADAM optimizer using a decay factor. How do I implement this in NeuralPDE? I did not find an effective way to do it, and it isn't mentioned anywhere in the documentation. Can anyone please help?

Just take the resulting OptimizationProblem and use it in a loop with different Adam calls, as you would in any other machine learning example. What did you try?
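For concreteness, here is a minimal sketch of such a loop (not from the NeuralPDE docs). It assumes `Adam` from OptimizationOptimisers (older setups used `ADAM` from OptimizationFlux), and the names `lr0`, `decay`, and `nstages` are illustrative:

```julia
using Optimization, OptimizationOptimisers

# Multi-stage training: each stage runs Adam at a smaller learning rate,
# warm-started from the previous stage's result via remake.
function train_with_decay(prob; lr0 = 1e-2, decay = 0.5, nstages = 4, iters = 500)
    lr = lr0
    res = nothing
    for _ in 1:nstages
        res = Optimization.solve(prob, Adam(lr); maxiters = iters)
        prob = remake(prob, u0 = res.u)  # warm-start the next stage from the trained parameters
        lr *= decay                      # reduce the learning rate for the next stage
    end
    return res
end
```

Here `prob` would be the problem returned by `NeuralPDE.discretize`; a callback can be passed to `Optimization.solve` as in the usual examples.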

But every time I have to remake the problem, right? I think this is computationally intensive, and the remake function wasn't working for me; it was taking a lot of time to remake the problem. I'm using Flux instead of Lux because it's easier to save the network parameters and continue training from them.

Currently, I do multi-stage training: say 500 iterations with one learning rate, and when that optimization completes I start the next round with the trained parameters and a new learning rate. ADAM is fast, but in the last round, when I switch to LBFGS to fine-tune the parameters, the optimization becomes extremely slow, maybe due to stiffness in the loss landscape; I can't figure out why this is happening in my case. Any help is highly appreciated.
Thanks in advance!
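For reference, a sketch of the fine-tuning round I mean (assuming `LBFGS` from OptimizationOptimJL; `prob` and `res` come from the earlier Adam rounds):

```julia
using Optimization, OptimizationOptimJL

# Final round: warm-start LBFGS from the Adam-trained parameters.
prob_ft = remake(prob, u0 = res.u)  # res.u holds the parameters from the Adam stages
res_ft = Optimization.solve(prob_ft, LBFGS(); maxiters = 200)
```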

remake is pretty quick as it’s just a pointer change. It should be around 10ns. Can you show what you’re doing?

Is it that remake just works for Lux networks and not Flux? Here is exactly what I'm doing:

Create the symbolic problem from the pdeSystem and the discretization strategy to be utilized:

```julia
prob = NeuralPDE.discretize(pdeSystem, discretization)
res = Optimization.solve(prob, opt, callback = callback, maxiters = numIters - 1)
prob = remake(prob, u0 = res.u)
res = Optimization.solve(prob, opt, callback = callback, maxiters = numIters - 1)
```

But here, the network parameters (res.u) are just a flattened vector, since I'm using Flux…
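For context, that flat vector relates to the Flux network through the usual `Flux.destructure` pattern (a sketch; `chain` stands for the Flux network passed to the discretization):

```julia
using Flux

# Flux.destructure splits a model into a flat parameter vector and a
# reconstructor function; calling the reconstructor on a vector of the
# same length rebuilds the network with those parameters.
θ0, re = Flux.destructure(chain)  # chain: the Flux network used in the discretization
trained_net = re(res.u)           # rebuild the trained network from the flat vector
```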

That should work fine with Flux as well. Do you have a quick MWE to look at?