How do Julia and Python compare in 2024?

I heard that Python gets a JIT in version 3.13, which adds some 5–9% of speed in the benchmarks. Can someone tell me how Julia and Python stack up against each other in terms of speed and their roles in scientific computing today?

I used SciPy and NumPy in the past to compute a few numerical integrals, by calling good old Fortran routines through SciPy and similar things. I did it that way because it was just a marginal aspect of my paper and I knew that almost no one would question the validity of a result obtained with those numerical methods. I still hesitate to use Julia, because I’m not a numerics guy and I don’t know what I would answer if someone asked me “Is that code sound?”. Maybe I’m too timid; I’d like to hear your thoughts on that.

If you are not a numerics guy, is speed then the only thing you need?

For me 5–10% does not sound like much, especially since in my experience Python is usually much slower than that compared to Julia — well, of course that often depends on the effort you put in.
It of course also depends a lot on the area you work in; as a shameless self-plug, we did a comparison for manifolds in https://arxiv.org/pdf/2106.08777, where every Python library, with their respective backends, was a few orders of magnitude slower.

But aside from speed, I personally just do not like the style one has to program in if you want to use Python, nor its irreproducibility; at least for the last few papers where I checked the Python code, I was not able to set it up locally and get it to work. I know there are ways to get reproducibility in Python, but it seems easier to me in Julia.

Concerning the question “Is that code sound?” – I think both languages do not differ much in this respect, though it may depend a bit on which package you use and how well-tested it is. It may also depend on how comfortable you feel with a language (or package), in the sense that you are sure you did not use a function wrongly. This might be easy for quadratures, but for more involved systems it can get a bit complicated.

Overall I still feel one is either comparing apples and oranges when comparing Julia and Python – or talking about personal preferences, since there are so many aspects, especially social ones, involved in such a comparison. For example: has your whole lab been using Python for the last 20 years of its code base? Then it’s probably Python for you as well.


If you wanted to check the validity of your results produced using Julia, you might consider checking a subset of them in Python.

Allow me to say something about Python’s JIT. These are my own opinions. I should just say, I have no opinion as to whether one language is “better” than another. I use both, and I like both.

First let us ask: what is wrong with Python? Or perhaps better: is there something wrong with Python?

Put it this way, they are not introducing the JIT for fun. The intention is to improve performance.

A little aside: it’s early days for the JIT and at the moment it is very much an experimental thing. A 5% performance improvement obviously isn’t worth much on its own, but the team’s intention is to achieve much bigger improvements over time.

That said, what the JIT team want to achieve is obviously difficult. I am personally somewhat skeptical about whether they will be able to do it, or at least how long it will take them to reach similar performance to Julia. They may never get there, because of how long these things take and the requirement for people to work on it (for free). They probably don’t have the people, or the time.

But to say the performance of Python is not good somewhat misses the point. There are some tasks it can do quickly. Here are some examples:

  • Matrix algebra
  • Solve ODEs
  • Solve PDEs
  • Calculate summary statistics of NumPy arrays or DataFrames

What all of these things have in common (which you may have already known) is that Python goes away and calls some C level routine to calculate the results. That’s why these things are typically quite fast. (Sometimes maybe the code was originally written in Fortran, but the point is the same.)
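As a minimal sketch of that (the specific matrix and ODE here are made up for illustration): NumPy’s linear solver dispatches to compiled LAPACK routines, and `odeint` wraps LSODA from Fortran ODEPACK, so the heavy loops never run in the interpreter.

```python
import numpy as np
from scipy.integrate import odeint

# Matrix algebra: np.linalg.solve dispatches to compiled LAPACK routines,
# so the actual elimination never runs in the Python interpreter.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)  # solved in C/Fortran

# ODEs: odeint wraps the Fortran LSODA integrator; the stepping loop is
# compiled, though the right-hand side here is still a Python callback.
t = np.array([0.0, 1.0, 2.0])
sol = odeint(lambda y, t: -0.5 * y, 2.0, t)  # y' = -y/2, y(0) = 2
```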

You only really see a performance penalty in Python if a calculation continually swaps between the Python and C layers. It isn’t always obvious when this is happening, because the Python type system is quite complex.

If you call some Python function, and that function depends on another function which is written in Python, then it becomes a bit more obvious. One example would be scipy.optimize where your function to be optimized is written in Python. It may or may not call some vectorized C code, such as operations on NumPy arrays.
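To make that concrete, here is a minimal sketch (the quadratic objective is a made-up example): the compiled optimizer driver has to call back into the interpreter for every evaluation.

```python
import numpy as np
from scipy.optimize import minimize

# The objective is plain Python, so the optimizer crosses the
# C/Python boundary on every single function evaluation.
def objective(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

# The default BFGS method also estimates the gradient by finite
# differences, which triggers even more Python callbacks per iteration.
res = minimize(objective, x0=np.array([0.0, 0.0]))
```

Each callback is cheap interpreter work on its own, but there can be thousands of them, and that is where the boundary-crossing cost shows up.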

Ok so now we have some idea about where the JIT might provide performance improvements.


Allow me to share some personal experience.

I have never had an issue with Python runtime performance when doing “standard” data science calculations. This is because these commonly used algorithms have been written into C level libraries. Calling <numpyarray>.mean() is an obvious example.
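As a sketch of the contrast (array size chosen arbitrarily): one vectorized call versus the same reduction written as a Python-level loop.

```python
import numpy as np

a = np.random.default_rng(0).normal(size=100_000)

# One call into compiled code: the loop over the elements runs in C.
fast = a.mean()

# The same reduction at the Python level: every iteration crosses the
# interpreter/C boundary, which makes it orders of magnitude slower.
total = 0.0
for value in a:
    total += value
slow = total / len(a)
```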

Where I have run into problems is when trying to write some algorithm which does not have a Python package with a vectorized implementation. (Something written in C/Fortran which benefits from static typing to improve performance.)

If you need to write some very general-purpose algorithm which can’t easily be built on an existing framework, then use Julia, because you will benefit from the speed its JIT provides. It’s difficult to think of examples because they are inherently niche. One example I ran into recently was processing “buy” and “sell” signals in financial data. These signals might be quite complex functions with hysteresis, for which there is no easy way to get an implementation out of existing packages.

All of this is to say, use Julia if

  • you need to write some algorithms which do not have standard implementations as packages that call C code, and you do not have the time to learn two languages (Python + C/Cython)
  • you need the absolute maximum performance (very rare use cases)
  • you like the fact that it is a functional language or want to use the features it offers as a language which Python does not have

Also, please do ask yourself whether you need more performance. It’s easy to get wrapped up in thoughts like “I wish my simulation would complete faster”, but do you actually need this?

  • Could you scale horizontally (use more computers) to get higher throughput?
  • Is there a hard latency requirement (like responding to a client) which your system currently fails to meet?
  • How much compute time can you save by switching language? Unless it’s at least 50%, I would suggest your time is better spent elsewhere. You can always scale your simulations down and run the long ones overnight.

By the way, someone who knows more can perhaps let us know

  • does the JIT compile Python bytecode into machine code?
  • or does the JIT compile the Python code to bytecode?

I would assume the former, but I realized I am not 100% sure on this.

Yes, that’s the point of the JIT, but it doesn’t mean everything is magically fast, or well as fast as possible (then you would get much larger speedup, maybe it’s very good in some cases, I assume only works for functions, wouldn’t e.g. inline across functions like Julia does, by default, or even guided by annotations). Before, Python was compiled to bytecode (still is), and it only interpreted.

The JIT for JavaScript is rather good, and better than what Python’s can achieve, since Python is even more dynamic; Julia is less dynamic than either, so it can inherently do better.

Or sometimes the code is in Julia as with (Julia wrapper for state-of-the-art Julia code):

So even calling C from Python is sometimes inferior, in practice.

Python is slow because many aspects of its design can’t be optimized much. For example, a lot of work has to happen at runtime to support instances that can add attributes of arbitrary types, and even with __slots__ fixing the attribute names, each slot must still store arbitrary types. Quite a few approaches to optimizing Python, or rather elaborate overlapping subsets of it, introduce structs. CPython’s compilation to bytecode does very little optimization, and while we can expect better than +5–9% in future updates, v3.13’s experimental (read: don’t depend on this yet) JIT compilation to machine code can never optimize as much as performant compiled languages can. Slow machine code isn’t paradoxical; it happens all the time in Julia with type instability.
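A minimal sketch of the __slots__ point (assuming CPython semantics): slots fix which attribute names exist, but each slot still has to hold objects of arbitrary type.

```python
class Point:
    __slots__ = ("x", "y")  # fixes the attribute *names*, not their types

p = Point()
p.x = 1.0        # a float now...
p.x = "hello"    # ...a str later: the slot must still store arbitrary objects

try:
    p.z = 3      # adding a brand-new attribute is what __slots__ forbids
    added_new_attribute = True
except AttributeError:
    added_new_attribute = False
```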

At the same time, Python is fast because it can wrap compiled code from such languages. In spite of the strengths of optimizable compiled languages and the annoyance of juggling multiple languages, there is still a significant demand for interactive glue languages to make compiled units work together without manually AOT-compiling every custom combination. This is not only adequate, it made Python one of the most popular languages, even dominating contexts that demand performance like scientific computing. While some Julia projects like SciML have significant roles, Python projects still have the lead (more users, more money), and I wouldn’t worry about losing performance in many cases.

In many other cases, you lose a lot of performance because gluing several units of compiled code together is too late to optimize them together. NumPy ufuncs, functions that operate on a NumPy array elementwise, are a surprisingly self-contained example. You can turn a Python function wrapping optimal C code into a ufunc with numpy.vectorize, and it’ll be suboptimal because you’re rapidly switching among your C code, the Python glue, and a C loop over NumPy arrays. To optimize a ufunc, you’re instructed to write out your own loop over your C code for each numeric type you need, as well as the boilerplate for incorporating NumPy features, all on the C side. While that is a neat bit of composability, you resort to writing the performant code in one language, and you still can’t optimize units of AOT-compiled code together. As a consequence, performance-demanding Python packages trend toward large monoliths cramming in as many features as possible, and they are either incompatible with each other or they make agreements on sharing dependencies (everybody loves NumPy). Less demanding Python packages get to be smaller and more scattered, but people start merging codebases and rewriting in C when performance is demanded.
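To make the numpy.vectorize point concrete, a minimal sketch (the scalar clip function is a made-up stand-in for “a Python function wrapping your C code”):

```python
import numpy as np

def clip01(x):
    # A scalar Python function; np.vectorize adds only a convenience
    # loop around it, it does not compile anything.
    return 0.0 if x < 0.0 else (1.0 if x > 1.0 else x)

vclip = np.vectorize(clip01)

a = np.linspace(-1.0, 2.0, 7)
slow = vclip(a)               # calls clip01 once per element, via Python
fast = np.clip(a, 0.0, 1.0)   # a true compiled elementwise loop
```

Both produce the same array, but the vectorize version pays an interpreter round-trip per element — exactly the rapid switching described above.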

This results in a demand for an interactive, optimizable language, and Julia is one of the approaches. The parallel to NumPy’s ufuncs is broadcasting, and the lowering of chained dot syntax means we don’t even have to write out a full method definition to compile elementwise execution of a kernel over many types of arrays. More importantly, we see many more small Julia packages with solo developers or small teams get picked up by the wider community. It’s worth mentioning that when you decide some custom compilation is worth caching instead of redoing every session, you will have to do some manual precompilation work akin to AOT compilation, in a package or something that repurposes one; no language design can magically omit realistic constraints.

Some fast code doesn’t need to involve higher-order functions or custom types, and it is already designed to occupy most of the runtime with little interruption. That’s something the demand for faster Python tends to miss; base Python is not similar enough to any of the wrapped performant languages to be compiled and optimized together with them, and it evidently does not help much to separately optimize the small fraction of the runtime spent interpreting CPython bytecode. For example, if you spend 5% of the runtime on the Python side and 95% in the wrapped C code, magically optimizing Python to take zero time would result in a mere 5.26% speedup, so to reiterate, those performant monoliths can be very adequate. Note that this applies just as much to Julia, because it can also wrap AOT-compiled code from other languages and is just as incapable of optimizing across those; writing unoptimized Julia is likewise common practice when it grants dynamic behavior and takes up little of the runtime.
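The 5%/95% figure above is just Amdahl’s law; a quick check of the arithmetic:

```python
# If 5% of the runtime is Python glue and 95% is wrapped C code,
# eliminating the Python share entirely caps the overall speedup.
python_fraction = 0.05
speedup = 1.0 / (1.0 - python_fraction)
percent_faster = (speedup - 1.0) * 100  # only about 5.26%
```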
