Currently, cpu(optimiser) won't move it: the optimiser state still consists of arrays on the GPU.

Getting what you want here might require a bit of extra effort. Flux's current optimizers use IdDicts to map weights to optimizer state, and when you move parameters to and from the GPU you create new copies. The result is that the IdDict will not recognize them as the same weights and instead you have …
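To make the identity problem concrete, here is a minimal sketch using only Base Julia. Plain CPU arrays stand in for the weights, and `copy` stands in for the new array that moving between devices produces; both are assumptions made for illustration, not Flux's actual internals.

```julia
# Plain arrays stand in for model weights; copy() stands in for the
# fresh array created when moving between devices (illustrative assumption).
w = rand(Float32, 3)

state = IdDict{Any,Any}()
state[w] = zeros(Float32, 3)            # per-weight optimiser state, e.g. momentum

w_moved = copy(w)                       # same values, but a different object

found_original = haskey(state, w)       # lookup by identity succeeds
found_copy = haskey(state, w_moved)     # lookup fails: IdDict compares keys with ===
same_values = w_moved == w              # the values themselves are still equal
```

So the state is still in the dictionary, but it is keyed by the old array, which the moved model no longer holds.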

Adding Flux.@functor ADAM (or whatever optimizer you are using) to your code should work. You should open an issue in Flux.jl describing your use case to see if it is worth adding this feature.

Flux.@functor ADAM doesn't work. I thought this was quite basic functionality, since we need to restart training for large-scale learning, unless we stick to small problems that can easily finish within a couple of hours. And without loading the previously saved optimizer, the training cannot be re…

As Dhairya answered on Slack, have you tried Optimisers.jl? Check out the tests for usage examples.

I tried, but couldn't figure out how to use it with Flux models. The examples in the tests are not for neural network models.

@DrChainsaw Wonderful insights. Indeed, manually re-mapping the weights and optimizer state is just too much work. The time spent would be enough for me to re-implement everything in PyTorch :slight_smile: To get around the issue, I guess the Flux optimizer has to store the optimizer state in a string key …

DrChainsaw: I haven't followed the development of the new optimizers very carefully, but I suppose both new and current optimizers would require you to manually compare weight values (hoping that there are no duplicates) or use some other way to identify the weights and then remap weight…
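As a rough idea of what such a remap could look like, here is a sketch in Base Julia only. `remap_state` is a hypothetical helper, not part of Flux, and it assumes the old and new parameters can be listed in the same order (one possible "other way to identify the weights"), so no value comparison is needed. Plain arrays and `copy` again stand in for real weights and device transfers.

```julia
# Hypothetical helper (not a Flux function): rebuild an identity-keyed state
# dictionary for a new set of parameter arrays, assuming old_params and
# new_params list the same parameters in the same order.
function remap_state(state::IdDict, old_params, new_params)
    length(old_params) == length(new_params) ||
        throw(ArgumentError("parameter lists differ in length"))
    new_state = IdDict{Any,Any}()
    for (p_old, p_new) in zip(old_params, new_params)
        if haskey(state, p_old)
            new_state[p_new] = state[p_old]   # reuse the existing state object
        end
    end
    return new_state
end

# Stand-ins for weights before and after a device move.
old_params = [rand(Float32, 2, 2), rand(Float32, 2)]
state = IdDict{Any,Any}()
state[old_params[1]] = zeros(Float32, 2, 2)
state[old_params[2]] = zeros(Float32, 2)

new_params = map(copy, old_params)
new_state = remap_state(state, old_params, new_params)
```

In a real setting the state arrays themselves would also need to be moved to the target device; the sketch only fixes the keys.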

How do I work with Flux models, such as Dense, in Optimisers.jl? Any example code would be appreciated.

Currently we need a bit more internal plumbing (Optimisers.jl is still experimental) to get most Flux layers working out of the box; https://github.com/FluxML/Optimisers.jl/issues/26 has a good summary. In the meantime, you can try something like these (warning: untested!) functions: # change opt type …

How to move optimiser from gpu to cpu?


ToucheSir December 12, 2021, 6:29pm 11

So it turns out there is an easier way to go about this, see Deepcopy Flux Model - #9 by ToucheSir.
