If you wanted to check the validity of results produced using Julia, you might consider checking a subset of them in Python.
Allow me to say something about Python’s JIT. These are my own opinions. I should just say, I have no opinion as to whether one language is “better” than another. I use both, and I like both.
First, let us ask: what is wrong with Python? Or perhaps better: is there something wrong with Python?
Put it this way, they are not introducing the JIT for fun. The intention is to improve performance.
A little aside: it’s early days for the JIT, and at the moment it is very much an experimental thing. A 5% performance improvement obviously isn’t worth having on its own, but the team’s intention is to get much better improvements over time.
That said, what the JIT team want to achieve is obviously difficult. I am personally somewhat skeptical about whether they will be able to do it, or at least how long it will take them to reach similar performance to Julia. They may never get there, because of how long these things take and the requirement for people to work on it (for free). They probably don’t have the people, or the time.
But to say the performance of Python is not good somewhat misses the point. There are some tasks it can do quickly. Here are some examples:
- Matrix algebra
- Solve ODEs
- Solve PDEs
- Calculate summary statistics of numpy arrays or DataFrames
What all of these things have in common (which you may have already known) is that Python goes away and calls some C level routine to calculate the results. That’s why these things are typically quite fast. (Sometimes maybe the code was originally written in Fortran, but the point is the same.)
You only really see a performance penalty in Python if calculations continually swap between the Python and C layers. It isn’t always obvious when this is happening, because the Python type system is quite complex.
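To make the Python/C boundary concrete, here is a minimal sketch comparing an explicit Python loop over a numpy array with a single call into compiled code. The function names are my own, purely for illustration, and I am assuming numpy is installed. Both compute the same sum of squares; the difference is where the loop runs.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Loop in the interpreter: every element is pulled out of the
    array and handled as a Python-level object, one at a time."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(values):
    """One call into compiled code: the loop over elements
    happens inside numpy, not in the interpreter."""
    return float(np.dot(values, values))


values = np.arange(1000, dtype=np.float64)
loop_result = sum_of_squares_loop(values)
vec_result = sum_of_squares_vectorized(values)
print(loop_result, vec_result)
```

If you time the two (for example with the standard library's timeit module) on a large array, the loop version is typically much slower, but how much slower depends on your machine and array size.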
If you call some Python function, and that function depends on another function which is written in Python, then it does become a bit more obvious. One example would be scipy.optimize, where your function to be optimized is written in Python. It may or may not call some vectorized C code, such as operations on numpy arrays.
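A small sketch of what this looks like in practice, assuming scipy and numpy are installed. The objective here is a made-up quadratic bowl (not from any real problem), wrapped in a hypothetical helper class that counts how many times the optimizer calls back into Python.

```python
import numpy as np
from scipy.optimize import minimize


class CountingObjective:
    """Hypothetical objective written in plain Python.

    A quadratic bowl with its minimum at (1, -2). The counter records
    how often the optimizer calls back into Python code.
    """

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


objective = CountingObjective()
result = minimize(objective, x0=np.array([0.0, 0.0]), method="Nelder-Mead")

print(f"minimum near {result.x}, objective called {objective.calls} times")
```

Every one of those calls runs in the interpreter, so the cost of the objective function is paid once per evaluation. For a cheap objective like this one it doesn't matter, but for an expensive one written as a Python loop it can dominate the runtime.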
Ok so now we have some idea about where the JIT might provide performance improvements.
Allow me to share some personal experience.
I have never had an issue with Python runtime performance when doing “standard” data science calculations. This is because these commonly used algorithms have been written into C level libraries. Calling <numpyarray>.mean() is an obvious example.
Where I have run into problems is when trying to write some algorithm which does not have a Python package with a vectorized implementation. (Something written in C/Fortran which benefits from static typing to improve performance.)
If you need to write some very general purpose algorithm which can’t easily be written with an existing framework, then use Julia, because you will benefit from the speed the JIT provides. It’s difficult to think of examples because they will always be inherently niche. One example I ran into recently was processing “buy” and “sell” signals with financial data. These signals might be quite complex functions with hysteresis, and there is no easy way to implement them on top of existing packages.
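To give a flavour of why hysteresis is awkward, here is a deliberately simplified sketch. It is not the actual signal logic I used. The function name, the two thresholds and the price series are all invented for illustration. The point is that each output depends on state carried over from earlier steps, so the natural implementation is an element-by-element loop.

```python
def hysteresis_signals(prices, buy_above, sell_below):
    """Return a list of (index, "buy" | "sell") events.

    A buy fires only when not holding and the price rises above
    buy_above; a sell fires only when holding and the price falls
    below sell_below. Prices between the two thresholds change nothing.
    """
    holding = False
    events = []
    for i, price in enumerate(prices):
        if not holding and price > buy_above:
            holding = True
            events.append((i, "buy"))
        elif holding and price < sell_below:
            holding = False
            events.append((i, "sell"))
    return events


# Invented price series, purely for illustration.
prices = [100, 104, 106, 103, 101, 98, 102, 107, 96]
events = hysteresis_signals(prices, buy_above=105, sell_below=99)
print(events)
```

A loop like this runs entirely in the interpreter in Python, whereas the equivalent loop in Julia is compiled, which is where the difference shows up on long series.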
All of this is to say, use Julia if
- you need to write algorithms which do not have standard implementations in packages which call C code, and you do not have the time to learn two languages (Python + C/CPython)
- you need the absolute maximum performance (very rare use cases)
- you like the fact that it is a functional language or want to use the features it offers as a language which Python does not have
Also, please do ask yourself whether you need more performance. It’s easy to get wrapped up in these things (“I wish my simulation would complete faster”, etc.), but do you actually need this?
- Could you scale horizontally (use more computers) to get higher throughput?
- Is there a hard latency requirement (like responding to a client) which your system currently fails to meet?
- How much compute time can you save by switching language? Unless it’s at least 50%, I would suggest your time is better used elsewhere. You can always scale your simulations down and run the long ones overnight.