Hello All,
I am having a problem while trying to load a model that was trained from scratch with Transformers.jl. The model itself is a seq2seq model very similar to the one in the documentation; the only difference is that I use multiple layers.
During training, I saved the model using Flux.state and BSON's @save macro.
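Roughly, the saving code looked like this (a sketch; the file name and variable names are just examples):

```julia
using Flux, BSON
using BSON: @save

# Flux.state extracts a plain, struct-free copy of the model's parameters
# and state as nested NamedTuples / arrays, which is what gets written out.
model_state = Flux.state(model)
@save "seq2seq_checkpoint.bson" model_state
```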
Now I try to load the trained model with Flux.loadmodel!, and what I get back is a NamedTuple{(:layers,)}. When I call propertynames(model[:layers]) it returns an ntuple of layer numbers. Then propertynames(model[:layers][1]) returns (:embeddings,), and propertynames(model[:layers][2]) returns (:blocks, :f), which all seems fine to me.
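For reference, the loading and inspection currently looks roughly like this (again a sketch; file and variable names are examples, and the comments show what I see in the REPL):

```julia
using Flux, BSON
using BSON: @load

@load "seq2seq_checkpoint.bson" model_state   # read the saved Flux.state back in
model = model_state                           # a nested NamedTuple, not a Flux/Transformers model

propertynames(model)              # (:layers,)
propertynames(model[:layers])     # layer numbers, e.g. (1, 2, ...)
propertynames(model[:layers][1])  # (:embeddings,)
propertynames(model[:layers][2])  # (:blocks, :f)
```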
The problem arises when I try to load the TransformerBlock(s). When I call TransformerBlock(model[:layers][2][:blocks], nothing), it throws the following error (a minimal reproduction is included below, after the stack trace):
TransformerBlock(
Tuple(
NamedTuple(
NamedTuple(
NamedTuple(
Error showing value of type TransformerBlock{Tuple{NamedTuple{(:attention, :feedforward), Tuple{NamedTuple{(:layer, :norm), Tuple{NamedTuple{(:layer,), Tuple{NamedTuple{(:attention_op, :qkv_proj, :o_proj), Tuple{Tuple{}, NamedTuple{(:layer,), Tuple{NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}, NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}}}, NamedTuple{(:α, :β), Tuple{Vector{Float32}, Vector{Float32}}}}}, NamedTuple{(:layer, :norm), Tuple{NamedTuple{(:layer,), Tuple{NamedTuple{(:layers,), Tuple{Tuple{NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}, NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}}}}, NamedTuple{(:α, :β), Tuple{Vector{Float32}, Vector{Float32}}}}}}}, NamedTuple{(:attention, :feedforward), Tuple{NamedTuple{(:layer, :norm), Tuple{NamedTuple{(:layer,), Tuple{NamedTuple{(:attention_op, :qkv_proj, :o_proj), Tuple{Tuple{}, NamedTuple{(:layer,), Tuple{NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}, NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}}}, NamedTuple{(:α, :β), Tuple{Vector{Float32}, Vector{Float32}}}}}, NamedTuple{(:layer, :norm), Tuple{NamedTuple{(:layer,), Tuple{NamedTuple{(:layers,), Tuple{Tuple{NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}, NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}}}}, NamedTuple{(:α, :β), Tuple{Vector{Float32}, Vector{Float32}}}}}}}}, Nothing}:
ERROR: MethodError: _show_leaflike(::Tuple{}) is ambiguous. Candidates:
_show_leaflike(::Tuple{Vararg{Number}}) in Flux at /home/phd/.julia/packages/Flux/n3cOc/src/layers/show.jl:50
_show_leaflike(::Tuple{Vararg{AbstractArray}}) in Flux at /home/phd/.julia/packages/Flux/n3cOc/src/layers/show.jl:51
Possible fix, define
_show_leaflike(::Tuple{})
Stacktrace:
[1] _all(f::typeof(Flux._show_leaflike), itr::NamedTuple{(:attention_op, :qkv_proj, :o_proj), Tuple{Tuple{}, NamedTuple{(:layer,), Tuple{NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}, NamedTuple{(:W, :b), Tuple{Matrix{Float32}, Vector{Float32}}}}}, #unused#::Colon)
@ Base ./reduce.jl:1251
[2] all(f::Function, itr::NamedTuple{(:attention_op, qkv_proj .....
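In case it helps, this is the minimal way I can reproduce it (the Transformers.Layers import path is just how I reference TransformerBlock in my setup and may differ):

```julia
using Flux, BSON
using BSON: @load
using Transformers

@load "seq2seq_checkpoint.bson" model_state
model = model_state

# The constructor itself seems to run, wrapping the raw NamedTuples; the
# error above is thrown when the REPL tries to display the result.
block = Transformers.Layers.TransformerBlock(model[:layers][2][:blocks], nothing)
```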
If I remember correctly, Flux saves model state in a tree-like structure, and somehow this pre-trained state cannot be read back into a TransformerBlock. What should I do? I do not know the internals here; maybe I am just using the wrong constructor to load the transformer block?
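For completeness, this is the workflow I thought Flux.loadmodel! was meant for (a sketch; build_seq2seq() is a placeholder for my own code that rebuilds the untrained architecture). Is this the intended way, rather than calling the TransformerBlock constructor on the saved NamedTuples directly?

```julia
using Flux, BSON
using BSON: @load

model = build_seq2seq()                 # placeholder: reconstruct the same architecture as at training time
@load "seq2seq_checkpoint.bson" model_state
Flux.loadmodel!(model, model_state)     # copy the saved arrays into the freshly built model, in place
```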
Since training this model took nearly two days, I really need to be able to read this model file back. Could someone guide me through it?
B.R.