How do you deserialize stateful optimizers in Flux?

findmyway · May 3, 2020, 3:48am

Some stateful optimizers, like ADAM, have an IdDict field that stores the momentum of each parameter. When the optimizer is deserialized with BSON, the object identity of the keys is lost in the newly created IdDict, so the loaded parameters no longer match the keys. This makes it hard to truly recover from a checkpoint.

(I think in TF, each variable has a name so things may be easier?)

CarloLucibello · May 3, 2020, 6:20am

This should be fixed in the latest BSON release.

findmyway · May 3, 2020, 7:24am

Ah, thanks for the reminder!

I just found that the trick here is to save the model and the optimizer into a single BSON file.

CarloLucibello · May 3, 2020, 7:36am

Why? Won’t saving them separately work the same?

findmyway · May 3, 2020, 7:47am

Yeah, I’m afraid not.

julia> using BSON

julia> a = [1]
1-element Array{Int64,1}:
 1

julia> b = IdDict()
IdDict{Any,Any} with 0 entries

julia> b[a] = 2
2

julia> b
IdDict{Any,Any} with 1 entry:
  [1] => 2

julia> BSON.@save "b.bson" b

julia> BSON.@save "a.bson" a
 
julia> BSON.@load "a.bson" a

julia> BSON.@load "b.bson" b

julia> a
1-element Array{Int64,1}:
 1

julia> b
IdDict{Any,Any} with 1 entry:
  [1] => 2

julia> b[a]
ERROR: KeyError: key [1] not found
...
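
For contrast, here is a minimal sketch of the single-file approach mentioned earlier in the thread, using the same `a` and `b`. It assumes a BSON release that includes the IdDict fix; the file name `ab.bson` and the names `a2` and `b2` are only for illustration. The idea is that when both objects go into one file, BSON can record that the key of `b` is the same array as `a`, so the lookup should succeed after loading.

```julia
using BSON

a = [1]
b = IdDict()
b[a] = 2

# Save both objects into one file (illustrative path in a temp directory).
path = joinpath(mktempdir(), "ab.bson")
BSON.@save path a b

# BSON.load returns a Dict keyed by the saved variable names.
loaded = BSON.load(path)
a2 = loaded[:a]
b2 = loaded[:b]

# Prints whether the loaded array is still a key of the loaded IdDict.
println(haskey(b2, a2))
```

For a Flux checkpoint the same pattern would apply with the model and the optimizer in place of `a` and `b`.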
