Why is Python, not Julia, still used for most state-of-the-art AI research?

Yesterday I downloaded some scripts, released by the French tax ministry, showing how taxes are computed (it is so complicated in France that you can’t compute or check your tax by yourself…)

It turned out to be a bunch of COBOL scripts (2020)…


I think the answer is very simple:

Python is a very simple language, with a lot of consolidated libraries in the area (scikit-learn, NumPy, PyTorch, Keras, the OpenCV API, …). The majority of pre-processing algorithms are also available in Python (as sci…). The Julia equivalents are, obviously, not so complete yet. Also, many researchers in the AI area are reluctant to learn another programming language (Python is the easier one).

Also, performance is not important for many people (when training/evaluation takes most of the run time, the performance of the rest of the system matters little).
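To put a rough number on that point, here is a minimal sketch using Amdahl's law. The 95%/5% split and the 10x figure are made-up assumptions for illustration, not measurements, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(other_fraction: float, other_speedup: float) -> float:
    """Amdahl's law: whole-run speedup when only the non-training part gets faster."""
    training_fraction = 1.0 - other_fraction
    return 1.0 / (training_fraction + other_fraction / other_speedup)


# Hypothetical run: 95% of wall time in training kernels, 5% in everything else.
# Making "everything else" 10x faster barely changes the total.
print(round(overall_speedup(0.05, 10.0), 3))  # 1.047
```

Under these assumed numbers, even an infinitely fast glue language would cap the gain at about 5%, which is why many people do not care.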

Another important factor is how widespread the language is. In research you want to collaborate and make your algorithm/technique available. Python is much better known than Julia, and you will have more influence by making your proposal available in Python.


FYI: I see multi-GPU was done (though I guess not distributed, then) in 2018, with Flux:

Also interesting:

On the Nvidia blog in 2017:

On average, the CUDAnative.jl ports perform identical to statically compiled CUDA C++ (the difference is ~2% in favor of CUDAnative.jl, excluding nn).

Incidentally, one of those countries is also a birthplace of Julia…


Nice finds!

Yup, I think this is possible on current CUDA.jl as well. However, multi-GPU training of the same model (whether on the same machine or across machines) requires functionality that isn’t implemented anywhere in the Flux ecosystem yet.

Seems like all of these benchmarks predate Flux's transition from Tracker to Zygote (I see lots of Tracker). Knet likely performs even better now (and should probably receive more love), but Tracker → Zygote was a noticeable performance regression for certain workflows. Ref. https://fluxml.ai/2020/06/29/acclerating-flux-torch.html, Flux vs pytorch cpu performance - #15 by d1cker, https://github.com/FluxML/Flux.jl/issues/886.

For more holistic comparisons, see also Is it a good time for a PyTorch developer to move to Julia? If so, Flux? Knet? - #18 by dfdx and https://discourse.julialang.org/t/where-does-julia-provide-the-biggest-benefits-over-other-ml-frameworks-for-research (the latter was started by a PyTorch contributor).


From your link, some research using Julia (I also edited my top post to add state-of-the-art research from one of the main Knet/Julia guys):

Will Julia be as fast as optimized cpp + python in machine learning? Most machine learning core parts in cpp are written by experts so maybe they do not have the two-language problem?

If it is well-written Julia, then yes. In some cases it can be faster (if Python-C interop creates bottlenecks). Also, just using TensorFlow or most other Python deep learning libraries involves a two-language problem even if you only ever write Python. Python + TensorFlow is 1000% uglier than pure Python.
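As a rough illustration of how interop can become the bottleneck, here is a toy cost model. It is only a sketch: the 1 microsecond per-call overhead and the 1 second of numeric work are assumed values, not measurements of any real library, and `interop_share` is a hypothetical helper.

```python
def interop_share(n_calls: int, overhead_s: float, work_s: float) -> float:
    """Fraction of total time spent crossing the Python/C boundary."""
    total_overhead = n_calls * overhead_s
    return total_overhead / (total_overhead + work_s)


# Assumed numbers: 1 s of real numeric work, 1 microsecond of overhead per call.
few_big_calls = interop_share(1_000, 1e-6, 1.0)          # work batched into large ops
many_small_calls = interop_share(10_000_000, 1e-6, 1.0)  # same work, tiny ops in a loop
print(f"{few_big_calls:.1%} vs {many_small_calls:.1%}")  # 0.1% vs 90.9%
```

In this model the same amount of compiled work goes from negligible overhead to overhead-dominated purely because of how often the boundary is crossed, which is the situation where a single-language implementation could come out ahead.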

PyTorch has had more development and resources. Flux.jl is still quite early days.

Any opinions on Knet vs Pytorch?


Are there any companies using Flux and Knet?

I wonder too. There must be some, but I am not sure at what scale.

I kind of feel like the two-language problem is getting less important over time. Lower-level languages are getting easier to use. Rust and C++ with modules are very appealing.


As far as I know Invenia uses Flux in production.


This is the elephant in the room. PyTorch is really good for many (not all) AI research tasks because it is mature, stable, fast, flexible, has lots of online help, and has cool features like mixed-precision training and GPU parallelism.

I am backing the Julia horse but it’s a bet I expect to pay off over a few years once the Julia ML ecosystem matures. The answer to the question in the title is that the Python ML ecosystem is better right now for most use cases, plus some degree of inertia as others have mentioned. It doesn’t hurt our chances, or dismiss the amazing contributions thus far, to admit that Python has its charms.


I don’t know, maybe you can get all the important (speed) benefits of PyTorch with (without giving up any Julia benefits?):

EDIT: There’s already (available since at least April):

The short answer to the question below was “Yes and Flux”, I also link directly to a more detailed answer:

I’m not sure how good GitHub - boathit/JuliaTorch: Using PyTorch in Julia Language is. It’s a wrapper, but I tried to install it, and I see now it’s not yet a proper package (so it isn’t registered either), so you have to git clone or download it.

I guess it is/would be nice to have it (with easy installation), while I’m not so sure you would use Julia to its full potential (nor sure you could mix with Julia’s frameworks), so I think a migration to a Julia-only solution (e.g. maybe with other registered package above?) should be on people’s radar.


I think this is the easy answer yet not necessarily the right one or at least not the whole story.
PyTorch started in September 2016. I remember using version 0.3.x, which was released less than 18 months after the start.
Flux started around May 2016 according to the first commit. Do you find it as usable as versions 0.3 or 0.4 of PyTorch?
My point isn’t the exact dates but that the resources spent on PyTorch in 18 months are probably something achievable in 4 years of Flux. So I’d expect the maturity and capabilities to be similar. Especially if, as claimed, Julia makes developers more effective due to its features. Flux is also developed by very capable Julians, so one would expect they can take full advantage of its capabilities.

I think it has to do with the goals set for Flux. PyTorch seems to me to take a pragmatic approach, getting better with each step. Flux seems to try to make many steps forward at once. I might be wrong in this analysis, but looking from the side it looked just like the Marshmallow Challenge.
I found the early Knet to be much more intuitive and PyTorch-like. Unfortunately it doesn’t get enough of the spotlight. Moreover, it later tried to be more Flux-like. I wish it was just a PyTorch clone. Simple and straight to the point.

In my subjective opinion, I would also say that Python, as someone mentioned here, is a much easier language than Julia. Coming from MATLAB, looking at Julia packages seems to me like an act of wizardry. It has to do with the Julia community being made up of highly capable programmers. Sometimes, to me, things seem too elegant yet not simple.

I have never encountered MATLAB code I couldn’t understand (MATLAB is, to me, the easiest language out there). I can handle some Python code despite having fewer Python hours than Julia hours. Yet most code in Julia’s packages seems like black magic to me. It might be the same for many other engineers who are very good at what they do yet only as good at programming as they need to be. They find Python to be a more welcoming tool, not only a more popular one (it is popular due to its simplicity).


PyTorch is largely a port of the existing Lua Torch package to Python, as I understand it. Torch development started in 2002, 10 years before even Julia was around, much less Flux.

I’d be interested to see some examples of code that you find confusing in Julia packages. Coming from programming in python (and matlab before that), I found in general that my mental model of what a given piece of code is doing is simpler in julia than python which is itself more straightforward than matlab.


Yes, this is more so for package developers.

There is a distinction between users and developers.

For users, Julia should be the easiest language, easier than python or matlab, because hopefully the API works like magic.

For contributors, it’s a different situation because the contributor needs to have more advanced knowledge: the underlying API might use metaprogramming.

So in Julia there is a wider spectrum of difficulty. It is easy to get started as a user, but as a contributor it is necessary to have deeper, more advanced knowledge.


I totally agree. You can write Julia like MATLAB and it will probably be just as readable.
The problem is when you want people to engage. The bar is high. It is not a shortcoming of Julia. It is a property. The dynamic range of Julia's complexity is very high. Package developers who use the upper end of the language's capabilities are targeting a smaller part of the Julia user base.
The whole question was why Python gets more engagement. So it could be a factor.


Don’t forget Chainer! PyTorch has a rich heritage and very much started off as an evolution of those two libraries.

I would go so far as to claim (controversially) that Flux’s continued emphasis on experimentation and “big ideas” to the detriment of stability and polish negatively impacted adoption for the entire Julia DL ecosystem after the initial hype subsided. Thankfully, Flux has found more of a middle ground now and Knet is gradually gaining name recognition, so I’m hopeful that some of the stereotypical criticisms around stability, completeness, docs, etc. can be assuaged.

If anything, this is a testament to how subjective “easy” is. As someone who has no “hard science”/engineering background outside of CS, I found MATLAB bizarre, unintuitive and extremely frustrating to use. For example, array indexing, slicing and function/class definitions are completely different from what you’d see in other programming languages, whereas I’d be hard pressed to find some part of Julia syntax that doesn’t exist in some mainstream programming language. Again, past experience matters a lot here.