How to load a BSON file of a model built with Flux@0.12.10 for use with Flux@0.13? Flux.Diagonal deprecation problem

I have a Transformer model that was built using Transformers.jl and Flux@0.12.10. After training the model, I saved it using BSON.

Since Flux@0.13, Flux.Diagonal is deprecated, and when I try to load the model using BSON and the latest version of Flux, I get an error:
"ERROR: LoadError: InitError: TypeError: in Type{...} expression, expected UnionAll, got a value of type typeof(Flux.Diagonal)".

The loading problem can be resolved by downgrading Flux back to 0.12.10. However, I want to use the model with the latest Flux version. Is there any way to do this without retraining?

It is possible, but it’ll take some work.

  1. Load your model in Flux 0.12, and replace each Diagonal with Functors.children(diagonal_layer). This will replace each Diagonal layer with a NamedTuple of its parameters, but you can also manually unpack into a different type if you’d like.
  2. Save the modified model back out in the format of your choice, and load it with Flux 0.13.
  3. Replace the placeholders made in Step 1 with Scale layers. Diagonal was renamed to Scale in 0.13, so the layers should work as before.
  4. Bonus: consider saving Functors.fmapstructure(model) instead of the model struct directly. This will lose layer types, but it should be far more stable over time because it replaces them with plain old (Named)Tuples. You can load these weights back into a model as described on the "Saving & Loading" page of the Flux docs.
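
Steps 1 and 3 can be sketched with Functors alone. This is only a sketch under assumptions, not the actual migration: OldDiagonal, NewScale and Block are hypothetical stand-ins for Flux.Diagonal (fields α and β, as far as I know), Flux.Scale (fields scale and bias) and whatever layer wraps the Diagonal. It also assumes the wrapping struct has a type-parameterised field, so that it can temporarily hold a NamedTuple.

```julia
using Functors

# Hypothetical stand-in for Flux 0.12's Diagonal layer
struct OldDiagonal{T}
    α::T
    β::T
end
@functor OldDiagonal

# Hypothetical stand-in for Flux 0.13's Scale layer
struct NewScale{T}
    scale::T
    bias::T
end
@functor NewScale

# Hypothetical parent layer; the type parameter lets `norm` hold a placeholder
struct Block{L}
    norm::L
end
@functor Block

# Step 1: replace every OldDiagonal with a NamedTuple of its parameters
strip_diagonal(m) = fmap(Functors.children, m; exclude = x -> x isa OldDiagonal)

# Step 3: recognise the placeholders and rebuild them as the new layer type
is_placeholder(x) = x isa NamedTuple && keys(x) == (:α, :β)
restore_scale(m) = fmap(nt -> NewScale(nt.α, nt.β), m; exclude = is_placeholder)

model = Block(OldDiagonal([1.0, 2.0], [0.0, 0.0]))
stripped = strip_diagonal(model)     # Block holding a NamedTuple placeholder
restored = restore_scale(stripped)   # Block holding a NewScale
```

In the real case, strip_diagonal would run under Flux 0.12 with Flux.Diagonal in the exclude test, the stripped model would be saved and reloaded (Step 2), and restore_scale would run under Flux 0.13, presumably building Flux.Scale(nt.α, nt.β) instead of NewScale.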

@ToucheSir

When I try to save and load Functors.fmapstructure(model) using BSON, I get an error:

ERROR: MethodError: Cannot `convert` an object of type Float64 to an object of type Vector{Any}
Closest candidates are:
  convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at D:\Programs\Julia-1.8.2\share\julia\stdlib\v1.8\LinearAlgebra\src\factorization.jl:58
  convert(::Type{T}, ::Union{OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where {K, N, var"N+1"}, Union{Base.LogicalIndex{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, Base.ReinterpretArray{Bool, N, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s14"}, var"#s14"}} where var"#s14"<:(OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), Base.ReshapedArray{Bool, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}}, SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}} where var"#s15"<:(OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), SubArray{Bool, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, Base.ReshapedArray{<:Any, <:Any, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, var"#s16"}} where var"#s16"<:(OneHotArray{K, N, var"N+1", 
<:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), LinearAlgebra.Adjoint{Bool, <:OneHotArray{K, N, var"N+1", 
<:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Diagonal{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.LowerTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Symmetric{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Transpose{Bool, <:OneHotArray{K, 
N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Tridiagonal{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UnitLowerTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UnitUpperTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UpperTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, PermutedDimsArray{Bool, N, <:Any, <:Any, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}} where {K, N}}) where T<:Array at D:\.julia\packages\PrimitiveOneHot\M7M4C\src\gpu.jl:15
  convert(::Type{T}, ::Union{Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, 
<:CUDA.CuArray{T}}} where T, Union{Base.LogicalIndex{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, Base.ReinterpretArray{T, N, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s14"}, var"#s14"}} where var"#s14"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, Base.ReshapedArray{T, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}}, SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}} where var"#s15"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, SubArray{T, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, Base.ReshapedArray{<:Any, <:Any, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, var"#s16"}} where var"#s16"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, LinearAlgebra.Adjoint{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Diagonal{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.LowerTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Symmetric{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Transpose{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Tridiagonal{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.UnitLowerTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, 
LinearAlgebra.UnitUpperTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.UpperTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, PermutedDimsArray{T, N, <:Any, <:Any, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}} where {T, N}}) where T<:Array at D:\.julia\packages\NNlibCUDA\kCpTE\src\batchedadjtrans.jl:15
  ...
Stacktrace:
  [1] newstruct!(::IdDict{Any, Any}, ::Float64, ::Function, ::Bool)
    @ BSON D:\.julia\packages\BSON\73cTU\src\extensions.jl:107

I converted my old model to the CPU using cpu(model) and also applied fmapstructure.
I’ve also tried Serialization.jl, but it doesn’t work either.

Ah yes, my code wasn’t meant to be run literally. If you look at the page it links to, you’ll see the correct function signature. If you don’t need to transform your model at all before structural mapping, you can just pass identity as the callback to fmapstructure.
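
To illustrate the two-argument form, here is a small sketch using only Functors. Affine is a hypothetical toy layer standing in for a real Flux layer; with a Flux model the call would look the same.

```julia
using Functors

# Hypothetical toy layer standing in for a real Flux layer
struct Affine{W,B}
    weight::W
    bias::B
end
@functor Affine

model = (Affine([1.0 2.0; 3.0 4.0], [0.5, 0.5]), Affine([1.0 1.0], [0.0]))

# The callback comes first, the model second; identity leaves the arrays untouched
weights = Functors.fmapstructure(identity, model)
```

The result is a Tuple of NamedTuples holding the same arrays, with no Affine types left in it, which is what makes it less sensitive to layer renames.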


@ToucheSir

Maybe I didn’t understand you correctly, but I used the function signature as it is described in the docs. Before saving with BSON, I performed the following operation:

mf = Functors.fmapstructure(x->x, model)

After that, I save mf using BSON:

bson(save_path, model=mf)

Then I removed the old version of Flux, added the latest version, and tried to load the file using BSON. I changed x->x to identity and can now load mf. Thank you @ToucheSir. Why didn’t my original approach work?
However, I now get a rather strange error saying that loadmodel! is not defined:

ERROR: UndefVarError: loadmodel! not defined
Stacktrace:
 [1] functor2model(func_file::String, newmodel_file::String, model_type::Symbol)
   @ Main d:\Projects\My_project\modelsU\src\modelsU.jl:43
 [2] top-level scope
   @ d:\Projects\My_project\modelsU\src\modelsU.jl:53

This happens even though I am using the latest version, Flux@0.13.9. Do I need a specific version of Flux?

No, all that looks right. I’m not sure why you’re getting that UndefVarError. Can you double-check that you’re using Flux 0.13 and that you’re calling Flux.loadmodel! rather than just loadmodel!? It’s not exported, so you need to either use it qualified or explicitly import it first.
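
Both options look roughly like this. It is a minimal sketch assuming Flux 0.13 is installed; the Dense layer and the hand-written weights NamedTuple are made up for illustration and stand in for the real model and the loaded fmapstructure output.

```julia
using Flux
using Flux: loadmodel!   # option 2: explicit import, so the bare name works

# Freshly constructed model with the same structure as the saved weights
model = Dense(2 => 1)

# Illustrative weights; the keys must match the layer's fields
weights = (weight = Float32[1 2], bias = Float32[0.5], σ = identity)

# Option 1: qualified call, which works whether or not the name is exported
Flux.loadmodel!(model, weights)
```

After the explicit import, loadmodel!(model, weights) without the Flux. prefix does the same thing.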

I tried the approach of saving a NamedTuple. The problem is that I can save the NamedTuple, but after updating Flux, I can’t load modelfunctor at all. This approach works only if I don’t change the Flux version.

MethodError: Cannot `convert` an object of type Float64 to an object of type Vector{Any}
Closest candidates are:
  convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at D:\Programs\Julia-1.8.2\share\julia\stdlib\v1.8\LinearAlgebra\src\factorization.jl:58
  convert(::Type{T}, ::Union{OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where {K, N, var"N+1"}, Union{Base.LogicalIndex{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, Base.ReinterpretArray{Bool, N, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s14"}, var"#s14"}} where var"#s14"<:(OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), Base.ReshapedArray{Bool, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}}, SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}} where var"#s15"<:(OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), SubArray{Bool, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, Base.ReshapedArray{<:Any, <:Any, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, var"#s16"}} where var"#s16"<:(OneHotArray{K, N, var"N+1", 
<:CUDA.CuArray{OneHot{K}, N}} where var"N+1"), LinearAlgebra.Adjoint{Bool, <:OneHotArray{K, N, var"N+1", 
<:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Diagonal{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.LowerTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Symmetric{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Transpose{Bool, <:OneHotArray{K, 
N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.Tridiagonal{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UnitLowerTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UnitUpperTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, LinearAlgebra.UpperTriangular{Bool, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}, PermutedDimsArray{Bool, N, <:Any, <:Any, <:OneHotArray{K, N, var"N+1", <:CUDA.CuArray{OneHot{K}, N}} where var"N+1"}} where {K, N}}) where T<:Array at D:\.julia\packages\PrimitiveOneHot\M7M4C\src\gpu.jl:15
  convert(::Type{T}, ::Union{Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, 
<:CUDA.CuArray{T}}} where T, Union{Base.LogicalIndex{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, Base.ReinterpretArray{T, N, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s14"}, var"#s14"}} where var"#s14"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, Base.ReshapedArray{T, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}}, SubArray{<:Any, <:Any, var"#s15"}, var"#s15"}} where var"#s15"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, SubArray{T, N, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, Base.ReshapedArray{<:Any, <:Any, <:Union{Base.ReinterpretArray{<:Any, <:Any, <:Any, <:Union{SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, SubArray{<:Any, <:Any, var"#s16"}, var"#s16"}}, var"#s16"}} where var"#s16"<:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}, LinearAlgebra.Adjoint{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Diagonal{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.LowerTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Symmetric{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Transpose{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.Tridiagonal{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.UnitLowerTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, 
LinearAlgebra.UnitUpperTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, LinearAlgebra.UpperTriangular{T, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}, PermutedDimsArray{T, N, <:Any, <:Any, <:Union{NNlib.BatchedAdjoint{T, <:CUDA.CuArray{T}}, NNlib.BatchedTranspose{T, <:CUDA.CuArray{T}}}}} where {T, N}}) where T<:Array at D:\.julia\packages\NNlibCUDA\kCpTE\src\batchedadjtrans.jl:15

Do you have a full stacktrace and MWE?