What to do about models not loading in Flux? Critical breakage

I use Flux to build ML models as part of my graduate studies. Some time ago I read that you couldn’t trust models saved in BSON format, so I built a solution that recreated the model by dynamic loading (facing enormous “world age” problems that I eventually worked around with some quite dirty hacks). As it turned out, I could load my models, but the non-trainable parameters were lost.

I then learned that the recommendation was the opposite: save the whole model as BSON. I took heed and did that, after first working out the code I needed to repair my old models by traversing them and recreating the parameters that had not been saved (I couldn’t get them back exactly, but inference was not impaired). Now that I have upgraded to Julia 1.8, it seems that I can’t load my new models with BSON. I read a few posts about this, and everybody seems to agree that this is not a problem, because all you have to do is not have saved your models in BSON format.

I have saved mine in BSON format, though, and I would like to know what I should do about this. It worked fine for a while (a year?), and in particular it accommodated the closures I needed to embed in my models (which was, of course, never a problem when I loaded the models as Julia code and then imported the parameters). With Julia 1.8 it doesn’t seem to be able to load the models, even though I trained and saved them with 1.8. I would now like to know how to proceed with my saved and trained models.

I am hoping to spend a lot of time writing code to repair them (like I did the last time), since the alternative is that I have lots of trained models that can’t be accessed.

To make this actionable, can you start by posting a full stacktrace you see when attempting to load one of these models on 1.8? If it turns out that it would be quick to fix/monkeypatch BSON.jl, then that would get you unstuck.

Otherwise, one thing you can try is loading models on 1.7 and re-saving using Julia’s Serialization stdlib. This should be forward compatible up to 1.8.
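
As a rough sketch of the re-saving step, the snippet below round-trips an object through the `Serialization` standard library. `ToyModel` and the file name are hypothetical stand-ins so that the example runs without Flux or BSON.jl; in practice the object would be whatever your BSON load call returns on 1.7.

```julia
using Serialization  # standard library, no extra packages required

# Hypothetical stand-in for a trained model (an assumption for illustration).
# In practice `model` would be the object loaded from the BSON file on 1.7.
struct ToyModel
    weights::Matrix{Float64}
    activation::Function
end

model = ToyModel([1.0 2.0; 3.0 4.0], tanh)

# Write the object to disk in Julia's native serialization format.
path = joinpath(mktempdir(), "model.jls")
serialize(path, model)

# Read it back. The session doing this must already have the same type
# definitions loaded (here `ToyModel`; for a real model, Flux and any
# custom layer types).
restored = deserialize(path)
```

Note that the type definitions have to match between the saving and the loading session, so pinning package versions alongside the file is probably a good idea.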

One last question. You mentioned embedding closures in models. Do those closures close over any parameters (especially arrays) that you wanted to be saved?
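
To illustrate why this matters: a Julia closure stores the variables it captures as fields of its own anonymous type, so a captured array is part of the closure object itself. The names below (`scale`, `layer`) are made up for the example and are not taken from your models.

```julia
# Hypothetical array that a custom layer closes over.
scale = [1.0, 2.0, 3.0]

# The `let` block gives the closure its own binding `s` to capture.
layer = let s = scale
    x -> s .* x
end

# The captured variable shows up as a field of the closure's type...
captured = fieldnames(typeof(layer))

# ...and the field refers to the very same array object.
same_array = layer.s === scale

# Calling the closure uses the captured array.
output = layer([1.0, 1.0, 1.0])
```

Whether such a captured array survives a save/load round trip depends on how the saving method treats closure fields, which is why the answer to this question affects which format is safe for you.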

I may have spoken too soon, because it seems that I can load models with BSON in Julia 1.8.5 (though I have to return to top level after loading before they can be used). I evidently had other things that made the program crash while loading the model, or in close conjunction with it.
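
Having to return to top level looks like the same “world age” behaviour I ran into before, though I have not verified that this is what happens inside BSON. A minimal sketch of the general mechanism, with function names invented for the example:

```julia
# A function created with `eval` while another function is running is
# "too new" to be called directly from that still-running function.
function call_directly()
    f = eval(:(x -> x + 1))
    return f(1)  # throws a MethodError because of the world age
end

# `Base.invokelatest` calls the function in the latest world instead,
# so no return to top level is needed.
function call_latest()
    f = eval(:(x -> x + 1))
    return Base.invokelatest(f, 1)
end

result = call_latest()  # returns 2
```

`Base.invokelatest` comes at the cost of a dynamic dispatch, so it is best kept at the boundary where the freshly loaded model is first called.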

I can’t spend too much time analysing this, unfortunately.

What has happened to BSON, though? What’s the problem and what’s the rationale for the change that makes types look goofy, anyway? Can’t the right type be set on creation of objects?

Nothing? There have been basically no code changes made to it over the past year, and even before then most were quite minimal. So whatever is causing this should be due to external changes and not the library itself (though they obviously seem to be affecting it).

Can you provide a concrete example of types suddenly “look[ing] goofy” and incorrect types being “set on creation of objects”? I’m not sure what either of those means and I don’t think I’ve encountered them before, so tangible examples are dearly needed. For reference, most of the BSON bugs I remember are some variant of “library crashes when loading this old saved file”.