Hi all. I’m excited to be able to dip my toes back into the Julia pond. I’ve missed the community these last few years.
My question is about how to best leverage existing PyTorch resources, such as the extensive pretrained NLP models in 🤗 Transformers, while doing research using Flux and other Julia packages.
I understand I can always call it directly through PyCall as a backup, but I’m curious about the best Julia-only approach to exploiting the massive amount of resources going into PyTorch packages.
Thanks in advance, and great work with the language and package ecosystem!
Hi, @tbreloff
The current plan for the GSoC project is to read the serialized state_dict and rebuild the model according to the weight names. PyTorch has several ways to save a model; one is to extract the model's state_dict and save it in a pickle-like format. I can release some draft code for loading the state_dict next week if you need it.
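
To give a rough idea of what "rebuild the model according to the weight names" could look like, here is a minimal sketch in plain Python. It is not the GSoC code: the helper `nest_by_name` and the parameter names are hypothetical, and tuples of shapes stand in for the tensors a real state_dict would hold. The only thing it relies on is that a state_dict is a flat mapping from dotted parameter names to values, so the names can be split to recover the layer hierarchy.

```python
from collections import OrderedDict


def nest_by_name(state_dict):
    """Group a flat {dotted_name: value} mapping into a nested dict.

    Hypothetical helper for illustration only; it assumes no name is
    both a parameter and a prefix of another parameter.
    """
    tree = {}
    for name, value in state_dict.items():
        *parents, leaf = name.split(".")
        node = tree
        for part in parents:
            # Walk down, creating one level per name component.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Stand-in for a loaded state_dict: shapes instead of real tensors (assumption).
flat = OrderedDict([
    ("embed.weight", (100, 16)),
    ("layers.0.weight", (16, 16)),
    ("layers.0.bias", (16,)),
    ("layers.1.weight", (4, 16)),
    ("layers.1.bias", (4,)),
])

tree = nest_by_name(flat)
```

Once the names are grouped like this, each subtree (for example `tree["layers"]["0"]`) holds the weight and bias for one layer, which is the information needed to construct the matching layer on the Julia side.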