Using pretrained PyTorch models in Flux

Hi all. I’m excited to be able to dip my toes back into the Julia pond. I’ve missed the community these last few years.

My question is about how to best leverage existing PyTorch resources, such as the extensive pretrained NLP models in 🤗 Transformers, while doing research using Flux and other Julia packages.

I understand I can always call PyTorch directly through PyCall as a backup, but I’m curious about the best Julia-only approach to exploiting the massive amount of resources going into PyTorch packages.
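For concreteness, the PyCall fallback I have in mind looks roughly like the sketch below (assuming `torch` and `transformers` are installed in whatever Python environment PyCall points at; the model name is just an example):

```julia
using PyCall

transformers = pyimport("transformers")

tokenizer = transformers.AutoTokenizer.from_pretrained("bert-base-uncased")
model     = transformers.AutoModel.from_pretrained("bert-base-uncased")

inputs  = tokenizer("Hello from Julia!", return_tensors="pt")
outputs = model(inputs["input_ids"])                     # forward pass runs on the Python side
# recent transformers versions return a dict-like ModelOutput
hidden  = outputs["last_hidden_state"].detach().numpy()  # comes back as a Julia Array
```

That works, but everything interesting still runs on the Python side, which is what I’d like to avoid.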

Thanks in advance, and great work with the language and package ecosystem!


Not sure there is a very simple or clear answer yet, but I’ll point to Torch.jl and Peter’s ongoing GSoC project.


Thanks! The GSoC link looks very relevant :grinning:. Do you have more details on the planned scope and timing of that project?

Another way to do this is to export the PyTorch net to ONNX (torch.onnx — PyTorch 1.12 documentation) and read it in with Flux (GitHub - FluxML/ONNX.jl: Read ONNX graphs in Julia).
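Roughly like this (an untested sketch; the export is driven from Julia via PyCall just to keep it in one snippet, and ONNX.jl’s entry point has changed between versions, so check its README):

```julia
# PyTorch side: trace the model and write an .onnx file
using PyCall
torch       = pyimport("torch")
torchvision = pyimport("torchvision")

model = torchvision.models.resnet18(pretrained=true)
model.eval()
dummy = torch.randn(1, 3, 224, 224)                # example input used for tracing
torch.onnx.export(model, dummy, "resnet18.onnx")

# Julia side: read the graph with ONNX.jl (in the version I used, this wrote
# out model.jl and weights.bson next to the file — check the current README)
using ONNX
ONNX.load_model("resnet18.onnx")
```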


Hi @tbreloff,
The currently planned approach for the GSoC project is to read the serialized state_dict and rebuild the model according to the weight names. PyTorch has several ways to save a model; one of them is to extract the model’s state_dict and save it in a pickle-like format. I can release some draft code for loading the state_dict next week if you need it.
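To give a feel for the idea, here is a hand-written sketch (not the GSoC code itself): load the state_dict with Pickle.jl and copy tensors into a Flux layer by name. The file name and layer name are made up for illustration.

```julia
using Flux, Pickle

# assumes the file was written in Python with `torch.save(model.state_dict(), "model.pt")`
state = Pickle.Torch.THload("model.pt")    # Dict of weight name => array

# e.g. an `nn.Linear(768, 2)` registered as "classifier" on the PyTorch side
W = state["classifier.weight"]             # PyTorch stores Linear weights as (out, in);
b = state["classifier.bias"]               # depending on memory layout a permutedims may be needed

dense = Dense(size(W, 2), size(W, 1))
Flux.loadparams!(dense, [W, b])            # copy the loaded arrays into the Flux layer
```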


Hi there, is there an update on this idea/project? How should one go about doing this right now?

https://github.com/chengchingwen/Transformers.jl reads pre-trained weights from HuggingFace and more using GitHub - chengchingwen/Pickle.jl: An experimental package for loading and saving objects in Python Pickle format. You should be able to build off of the latter for other models (e.g. torch.hub).
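For example, with Transformers.jl’s HuggingFace support it looks roughly like this (a sketch based on the `hgf"..."` string macro from its README; double-check the current docs for the exact form):

```julia
using Transformers
using Transformers.HuggingFace

# downloads and caches the checkpoint on first use, returns a Julia model
model = hgf"bert-base-uncased:model"
```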