How to update the learning rate during Flux training in a better manner?

Think of opa in your first code example as a constructor for an ADAM object. If you invoke opa repeatedly, you construct a new ADAM object every time, which is slow and also discards the optimiser state (Adam's per-parameter moment estimates) accumulated so far. Instead of doing that, in your training loop do opt.eta = new_learning_rate (eta is the learning-rate field of Flux's ADAM; check the Flux source if your version differs), which does not reconstruct the entire ADAM object but only changes a single field value inside the existing one.
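
For concreteness, here is a minimal sketch of the difference (the 0.001 starting rate and the 1/epoch schedule are placeholders, not taken from your code):

```julia
using Flux

# Anti-pattern: calling the constructor every iteration. Each call
# allocates a fresh ADAM and resets its internal moment estimates.
for epoch in 1:10
    opt = ADAM(0.001 / epoch)   # new object every time
    # ... Flux.train!(loss, ps, data, opt) ...
end

# Better: construct once, then mutate the learning-rate field in place.
opt = ADAM(0.001)
for epoch in 1:10
    opt.eta = 0.001 / epoch     # updates one field of the existing object
    # ... Flux.train!(loss, ps, data, opt) ...
end
```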

Note that this assumes ADAM as defined in Flux is a mutable struct; if it were immutable, you could not assign to its fields like this. No worries though: I checked the Flux source code and it is mutable.
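
To illustrate the distinction in plain Julia (MutableOpt and ImmutableOpt are made-up names for this sketch, not Flux types):

```julia
# Made-up types for illustration; these are not Flux types.
mutable struct MutableOpt
    eta::Float64
end

struct ImmutableOpt
    eta::Float64
end

m = MutableOpt(0.001)
m.eta = 0.01        # fine: fields of a mutable struct can be reassigned in place

i = ImmutableOpt(0.001)
# i.eta = 0.01      # errors: fields of an immutable struct cannot be changed
```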

For your second question: compute the new learning rate in whichever way you want, then assign it to the learning-rate field of your opt object as described above.
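
For example, here is a sketch with a hypothetical exponential decay schedule (the model, data, and decay factor are all placeholders, not from your code):

```julia
using Flux

# Placeholder model and data, just to make the sketch runnable.
model = Dense(10, 1)
data  = [(rand(Float32, 10), rand(Float32, 1)) for _ in 1:100]
loss(x, y) = Flux.Losses.mse(model(x), y)
ps = Flux.params(model)

opt        = ADAM(0.001)   # constructed once, outside the loop
initial_lr = 0.001
decay      = 0.9           # hypothetical per-epoch decay factor

for epoch in 1:20
    opt.eta = initial_lr * decay^(epoch - 1)   # any schedule you like
    Flux.train!(loss, ps, data, opt)
end
```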
