Questions about a trained model: (1) testmode! vs. excluding Dropout layers; (2) size of the BSON file

(1) I trained the NN model and want to use it further, i.e., load it back from disk and run inference. Which is the better way to predict: (i) keep the Dropout layers and call Flux.testmode!, or (ii) rebuild the model without Dropout layers and skip Flux.testmode! altogether?
(2) My model is based on Transformers.BERT with additional layers, and the saved BSON file is huge. Is that because the file contains more than just the model structure? Would it be better to save only the model weights?

  1. Flux already does not run dropout or update normalization layers like BatchNorm during inference. You'd only need to call testmode! explicitly if you'd previously forced trainmode! during inference (see the first sketch after this list).

  2. BERT is a pretty big model :slight_smile: . The BSON file does indeed contain the entire model and not just the structure, IIRC, but the non-structural components should add negligible overhead compared to the weights. You could try compressing the generated file separately to see if that helps; the second sketch below shows how to save only the weights instead.
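
For point 1, a minimal sketch of that behaviour; the toy Chain, layer sizes, and input here are made up purely for illustration:

```julia
using Flux

# Toy model with a Dropout layer; the architecture and sizes are arbitrary.
model = Chain(Dense(10, 32, relu), Dropout(0.5), Dense(32, 2))
x = rand(Float32, 10, 4)

# In a plain forward pass (no gradient being taken), dropout is already
# inactive, so repeated calls give identical outputs.
y1 = model(x)
y2 = model(x)
@assert y1 == y2

# Explicit mode switches are only needed if a mode was forced earlier:
Flux.trainmode!(model)   # forces dropout on even outside of training
Flux.testmode!(model)    # switches it back off again for inference
```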
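
For point 2, a sketch of saving only the weights rather than the whole struct, using Flux.params / Flux.loadparams! (the API current in the Flux versions of this era; newer releases point to Flux.state / Flux.loadmodel! instead). The file names and the build_model constructor are just placeholders, and this won't shrink the file much when the weights dominate, but it does decouple the saved file from the model's type definitions:

```julia
using Flux, BSON
using BSON: @save, @load

# Toy architecture standing in for "BERT + extra layers"; sizes are arbitrary.
build_model() = Chain(Dense(10, 32, relu), Dropout(0.5), Dense(32, 2))

model = build_model()

# Save only the parameter arrays, not the whole model struct.
weights = collect(Flux.params(model))
@save "weights.bson" weights

# Later / elsewhere: rebuild the architecture from code,
# then copy the saved arrays back into it.
model2 = build_model()
@load "weights.bson" weights
Flux.loadparams!(model2, weights)
```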

> Flux already does not run dropout or update normalization layers like BatchNorm during inference. […]

Thanks!

> The BSON file does indeed contain the entire model and not just the structure, but the non-structural components should add negligible overhead compared to the weights. […]
Do weights also load with the model? If I save the model, load it back, and call params on it, I get an empty array for the weights:

```julia
using Flux, BSON
using BSON: @save

@save "model.bson" model
model2 = BSON.load("model.bson", @__MODULE__)
@show params(model2)
```

They should. I would check to see if the actual arrays on the returned model struct match the values you expect instead of pulling them all into an opaque bag with params.
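
For example, a minimal check along those lines, assuming the file was written with `@save "model.bson" model` as in the snippet above (note that BSON.load returns a Dict keyed by the saved variable names, so the model struct has to be pulled out of it before inspecting its fields):

```julia
using Flux, BSON

# Assumes "model.bson" was written earlier with `@save "model.bson" model`
# and that `model` (the original) is still in memory for comparison.
dict   = BSON.load("model.bson", @__MODULE__)   # BSON.load returns a Dict
model2 = dict[:model]                           # pull the struct out by its saved name

# Spot-check concrete arrays instead of the opaque params() bag.
# The layer/field access below assumes a Chain of Dense layers (Flux ≥ 0.12
# field names); adapt the indexing to your actual BERT-based architecture.
@show size(model2.layers[1].weight)
@show model2.layers[1].weight ≈ model.layers[1].weight
```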