Will Julia be more efficient than PyTorch?

Hello All,

I am very interested in deep learning, but I realize that I cannot buy the expensive hardware that industry uses to train things like GPT-NeoX. If a neural network is built and trained in PyTorch, and the same network is implemented in Flux.jl, will the Julia implementation require less GPU/compute?

Why am I asking this?

First, I don’t know much about deep learning.

Second, I want to challenge Tabnine with a free-software code-completion assistant, which I have named GNU Ghost.

What I know is this:

  1. A network of volunteer computers can be more powerful than any system a corporation puts together.
  2. If I can find a way to distribute neural-network training asynchronously across networked computers, then we can build better free (as in freedom) machine-learning models.

With the approach in point 2, training will be slow, but that’s okay. I want to attempt such a project (a rough sketch of the idea is below). Do you think I am insane, or not?
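To make point 2 a bit more concrete, here is a minimal sketch of what distributed data-parallel training could look like, using only the Distributed and Statistics standard libraries. The toy linear model, the fake data shards, and the `train`/`local_gradient` helpers are all illustrative; a real system would train a Flux model, and an asynchronous variant would apply each worker's update as it arrives instead of waiting for all of them.

```julia
# Sketch: synchronous data-parallel gradient averaging across Julia worker
# processes. Everything here is illustrative, not a real project API.
using Distributed, Statistics

addprocs(4)  # pretend these are volunteer machines on the network

@everywhere function local_gradient(w, b, x, y)
    # Gradient of mean squared error for a toy linear model y ≈ w*x + b.
    n = length(x)
    err = w .* x .+ b .- y
    return 2 * sum(err .* x) / n, 2 * sum(err) / n
end

# Fake data shards, one per worker (in reality: local code corpora).
shards = map(workers()) do _
    x = rand(100)
    (x, 2 .* x .+ 1)          # targets follow y = 2x + 1
end

function train(shards; epochs = 100, lr = 0.1)
    w, b = 0.0, 0.0
    for _ in 1:epochs
        # Each worker computes a gradient on its own shard, in parallel.
        grads = pmap(shards) do (x, y)
            local_gradient(w, b, x, y)
        end
        # The coordinator averages the gradients and takes one step.
        w -= lr * mean(first.(grads))
        b -= lr * mean(last.(grads))
    end
    return w, b
end

w, b = train(shards)  # should approach w ≈ 2, b ≈ 1
```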


To start with: Flux will not be faster than PyTorch in most cases right now, simply because PyTorch has more money and developer resources. I also think you’ll have a tough time beating Tabnine (and, really, GPT-3), both of which have a ton of developer resources and experience behind them.

That said, I think this is a laudable goal, and something that I would love to see happen in Julia. I think that our ML stack is rapidly getting to the tipping point where we can beat out PyTorch for certain problems, especially as projects like https://github.com/DhairyaLGandhi/ResNetImageNet.jl and https://github.com/DhairyaLGandhi/DaggerFlux.jl gain steam.

I think your first step for this project should be to get a much simpler (than GPT-3) auto-completion model set up and running in Julia, find open datasets to use for training, and set up an end-to-end demo of how to use these together to do basic text auto-completion on common hardware. This demo should be able to run without any non-Julia dependencies, and should take less than 1 week to train on a CPU (with maybe 8-16 threads) and less than 1 day to train on a common laptop or desktop GPU. The auto-completion results don’t have to be great; they just need to show some amount of sensibility. We can work on improving the model later.
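To give a feel for the scale of starting point I have in mind, here is a minimal sketch of a toy next-character completion model in Flux. It assumes a recent Flux with the explicit-gradient API (`Flux.setup`/`Flux.update!`); the character vocabulary, context length, repeated corpus string, and `complete` helper are all toy placeholders, not anything resembling the real training setup.

```julia
# Toy next-character completion model: predict character i+ctx from the
# previous ctx characters, each one-hot encoded and concatenated.
using Flux
using Flux: onehotbatch, onecold, logitcrossentropy

vocab  = collect("abcdefghijklmnopqrstuvwxyz_ (){},=+.\n")  # illustrative character set
nvocab = length(vocab)
ctx    = 8   # context window length

model = Chain(
    Dense(ctx * nvocab, 128, relu),   # concatenated one-hot context
    Dense(128, nvocab),               # logits over the vocabulary
)

# Build (input, target) pairs from a corpus string.
function make_data(text)
    chars = [c in vocab ? c : ' ' for c in lowercase(text)]
    xs, ys = Vector{Float32}[], Char[]
    for i in 1:length(chars)-ctx
        push!(xs, vec(Float32.(onehotbatch(chars[i:i+ctx-1], vocab))))
        push!(ys, chars[i+ctx])
    end
    return reduce(hcat, xs), onehotbatch(ys, vocab)
end

# Placeholder corpus; a real demo would use an open code dataset.
X, Y = make_data("function add(x, y)\n    return x + y\nend\n" ^ 50)

opt_state = Flux.setup(Adam(), model)
for _ in 1:20
    grads = Flux.gradient(m -> logitcrossentropy(m(X), Y), model)
    Flux.update!(opt_state, model, grads[1])
end

# Predict the character that follows a prompt (prompt must be ≥ ctx chars).
complete(prompt) =
    onecold(model(vec(Float32.(onehotbatch(collect(prompt)[end-ctx+1:end], vocab)))), vocab)

complete("function")   # e.g. suggests ' '
```

Something of this size trains in minutes on a laptop; the point is only to prove the end-to-end pipeline (data, model, training, completion) before worrying about model quality or distribution.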

If you can get this working on your hardware and achieve the above goals, I’d be willing to help port this to AMD GPUs, and also help port it to a multi-server setup. We can consider how to do distributed multi-user training securely and anonymously later (since many users will want to try out your code locally first to determine whether it’s even worth investing time and compute resources in this approach). If we can get both CUDA and AMD GPUs working, I will gladly dedicate 1 full-time (AMD) GPU to training/development for this project.
