It will depend on how many contributors there are; I won’t implement things I don’t need at the moment (like multi-GPU support for convolutions, or RNN units). PyTorch is actually quite similar to Chainer, but PyTorch has a more active community.
I’m a physicist working on machine learning, which means the machine learning community sometimes doesn’t care about what we need, and we have to implement it ourselves, e.g. complex number support. It can take a very long time (years, even) to merge new features into the main tree of a large project like PyTorch. That is painful and not really necessary for researchers; check the issue and progress here:
I’m still working on this because of legacy dependencies in our lab. However, my lab-mates, collaborators, and I have personally switched to Julia entirely, and I have built several packages that I need for research:
And more in private.
Some of them (e.g. QuHamiltonian.jl) would not really be possible to implement in Python (or it would be quite hard, even with the ast module). Most of the Python packages we write at the moment are just for the public and for non-experts who are not interested in coding at all. Compared to Julia, binding C++ to Python is a nightmare, even with the existing binding tools.
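For contrast, calling a compiled C/C++ routine from Julia needs no wrapper generator at all. A minimal sketch, where the library name and symbol are hypothetical, just for illustration:

```julia
# Minimal sketch: calling a C function from Julia with zero glue code.
# "libmylib" and :c_trace are hypothetical names, not a real library.
function c_trace(A::Matrix{Float64})
    return ccall((:c_trace, "libmylib"), Float64,
                 (Ptr{Float64}, Cint), A, size(A, 1))
end
```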
Furthermore, Julia has the best tensor network support of any language. Python only has an ITensor wrapper, while Julia has TensorOperations.jl and another upcoming package from Jutho, and the author of ITensor is also writing a Julia version of it.
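To give a flavor, TensorOperations.jl lets you write contractions directly in index notation:

```julia
using TensorOperations

A = randn(4, 4, 4)
B = randn(4, 4, 4)

# Contract the shared indices k and l of two rank-3 tensors; the
# @tensor macro rewrites this index notation into efficient
# contraction code before anything runs.
@tensor C[i, j] := A[i, k, l] * B[j, l, k]
```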
I’m actually writing this AD package because of a practical problem: a recent model implemented in PyTorch was too slow, and I could not use a batched trace in PyTorch because it does not have one (I don’t want to write a C++ extension, and even if I wrote one, it could still be slower because of the Python wrapper). I can’t just use a for loop in Python either, because that is slow as well, and the lattice libraries in Python are slow too. In just a few days I made my own model about 10x faster (on CPU) than PyTorch, with almost the same syntax.
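In Julia a plain loop compiles to fast native code, so a batched trace is a few lines. A minimal sketch (the function name is mine, not from any package):

```julia
# Batched trace: tr(A[:, :, k]) for each matrix in a 3-array batch.
# A plain loop like this compiles to fast native code in Julia.
function batched_trace(A::AbstractArray{T,3}) where {T}
    n, _, b = size(A)
    out = zeros(T, b)
    @inbounds for k in 1:b, i in 1:n
        out[k] += A[i, i, k]
    end
    return out
end
```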
And I cannot just port Jutho’s TensorOperations.jl entirely to Python: the metaprogramming that TensorOperations.jl relies on does not look possible to implement in Python (or you would end up creating your own DSL beneath Python, like many other Python packages do).
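The reason is that a Julia macro receives the unevaluated expression tree, so index notation can be rewritten into loops or BLAS calls before execution. A toy sketch of the mechanism:

```julia
# A macro sees the raw Expr tree at parse time, before evaluation.
macro index_ast(ex)
    dump(ex)    # inspect the tree; @tensor would emit loop code here
    return nothing
end

@index_ast C[i, j] := A[i, k] * B[k, j]
```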
If you are really a “practitioner”, in this situation I believe you would choose Julia (if you don’t want to write your own AD, adding a custom operator in Zygote.jl is faster than PyTorch on CPU, and you can use mine in the future) rather than write your own PyTorch C++ extension against its C++ interface.
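For example, a custom reverse rule in Zygote for the batched trace sketched above takes only a few lines (again, batched_trace is my own sketch, not a library export):

```julia
using Zygote

# Custom adjoint: the gradient of tr(A_k) w.r.t. A_k is the identity,
# scaled by the incoming gradient of each batch element.
Zygote.@adjoint function batched_trace(A)
    y = batched_trace(A)
    function back(ȳ)
        Ā = zero(A)
        @inbounds for k in axes(A, 3), i in axes(A, 1)
            Ā[i, i, k] = ȳ[k]
        end
        return (Ā,)
    end
    return y, back
end
```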
Being a practitioner is not a reason to be lazy: if there is a package that is good enough, use it; if there is not, write one.
I don’t suggest “learning” ANY machine learning package, because what you should learn is the algorithms and the theory. Most machine learning packages are designed to be intuitive enough that as long as you are familiar with the theory, you will know how to use them. If you don’t know how to use one, it is either because you don’t actually know the theory/how the algorithm works, or because the package author should change their interface.
But well, if someone says “I don’t want to learn any theory, I just want to call a function and run a new deep learning algorithm with it”, then you will probably need a time machine and a black-hole computer.
I can use Flux.jl/Knet.jl/PyTorch/TensorFlow, or just write things from scratch, depending on whichever approach is fastest. I don’t actually see much difference between those packages; people are converging on similar interfaces with different implementations now.
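To illustrate how similar the interfaces have become, a small Flux.jl model reads almost one-to-one like PyTorch’s nn.Sequential:

```julia
using Flux

# Reads almost 1:1 like nn.Sequential(nn.Linear(784, 128), nn.ReLU(), ...)
model = Chain(
    Dense(784, 128, relu),
    Dense(128, 10),
    softmax,
)

ŷ = model(rand(Float32, 784))   # forward pass on a dummy input
```

At that point the choice comes down to speed and extensibility, not the API.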