I'm looking for clarification on the best way to serialize models or model weights, particularly with respect to necessary parameters that are not differentiated against (such as those in BatchNorm). I find that directly serializing the model object creates deserialization issues whenever versions change, not only of Flux but also of the packages and model infrastructure I develop around the model. It becomes increasingly difficult to load candidate models for my package without completely decoupling my model-construction code from the rest of the code base (custom layers, training loop, featurization, etc.). params() is not a great way to capture the state held in BatchNorm and other important non-differentiated parameters in my model. What solutions have worked for those who have dealt with this issue?
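
To make the problem concrete, here is a minimal sketch of what I mean. It assumes a Flux release in which `Flux.params` is still available and the `Dense(in => out)` constructor syntax is supported. The two-layer model and the `in_params` helper are hypothetical, made up only for this illustration.

```julia
using Flux

# Hypothetical toy model, only for illustration.
model = Chain(Dense(4 => 3), BatchNorm(3))
bn = model[2]

# Collect the trainable arrays the way a params()-based save would.
ps = Flux.params(model)

# Hypothetical helper: is this exact array among the collected parameters?
in_params(x) = any(p -> p === x, ps)

in_params(bn.γ)   # true: the learned scale is a trainable parameter
in_params(bn.μ)   # false: the running mean is not collected
in_params(bn.σ²)  # false: the running variance is not collected

# A training-mode forward pass updates the running statistics,
# so they carry state that a params()-only checkpoint would lose.
Flux.trainmode!(model)
model(rand(Float32, 4, 8))
```

So saving only what params() returns and loading it into a freshly constructed model would leave the running mean and variance at their initial values.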