I have a Python script where the major speed hurdle is a single function in the statsmodels package. However, it seems that if I use PyCall to use the same Python function in Julia, the speed hurdle persists to the same degree. Is it true that I need to rewrite the function (i.e., no pyimport) in native Julia for me to expect a speed boost? I was maybe under the mistaken assumption that simply porting over my Python code to Julia would make everything faster…
Here are two identical scripts in Python and Julia, with the corresponding speeds to complete:
It is also very likely that the function is implemented in C or Fortran under the hood in the Python implementation, so moving a single call of that function from Python to Julia, or even rewriting the function in Julia, won’t bring any significant speedup. The advantage of Julia is that if you rewrite that function in pure Julia, you may get speeds similar to the C or Fortran code Python is calling, without having to deal with two different languages and their interfaces.
If calling Python code as-is from Julia could somehow make it magically faster without changing its behavior, then Python itself would just do that and already be faster. The reason that Julia is faster than Python is that it works differently.
Maybe something like PyPy, which doesn’t involve the C interface, would be magically faster. But that is a faster Python interpreter thanks to JIT compilation, and Julia is not a Python interpreter. I wonder how prevalent this misconception is?
This is the first time I have come across this particular misconception.
Yeah, I guess my misconception does come off as stupid. In my mind, the syntax of Julia, Matlab, and Python is very similar, so I thought that for most Python packages there was a way to automatically translate between the languages on the back end, such that PyCall does the work of converting Python functions into native Julia. I was watching YouTube videos about how Julia is so much faster than Python and can even already handle all my Python packages. That sounds less amazing now that I know my Python packages don’t get the Julia benefits the speaker was talking about when I use them.
Regardless, thanks for the replies! I’ll have to spend more time with this language to better understand things.
This isn’t uncommon at all — and you shouldn’t feel stupid for asking it. It can be tricky to simultaneously talk about how great Julia itself is and how great Julia’s interoperability is without some muddy waters.
The beautiful thing about the interoperability is that you don’t need to replace your entire codebase — you can incrementally update the parts that are taking the biggest amounts of time. The tricky part is that Python isn’t written in Python (largely), so if your Python usage is largely calling into highly-optimized libraries (written in C/C++/Fortran) you may not see a significant speedup even after converting to Julia. Where you’ll see the biggest speed gains are when you’re writing your own for loops and algorithms — or possibly calling some not-so-optimal libraries.
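To make the "Python isn’t written in Python" point concrete, here is a minimal sketch using only the standard library. It assumes CPython, where the built-in `sum` is implemented in C; `python_loop_sum` is a hypothetical helper written just for this comparison. The exact timings will vary by machine, so none are quoted here.

```python
import timeit


def python_loop_sum(values):
    """Sum with an explicit loop, executed step by step by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(1_000_000))

# Both approaches compute the same result.
assert python_loop_sum(data) == sum(data)

# Time five repetitions of each approach.
loop_time = timeit.timeit(lambda: python_loop_sum(data), number=5)
builtin_time = timeit.timeit(lambda: sum(data), number=5)

print(f"explicit loop: {loop_time:.3f} s")
print(f"built-in sum:  {builtin_time:.3f} s")
```

The explicit loop is the kind of code that tends to benefit from a rewrite in Julia. The built-in call is already running compiled code, so calling it from another language is unlikely to change much.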
Effectively, if the speed hurdle is in the functions you call from your Python scripts, and those functions are implemented in low-level languages, you won’t get much benefit from using Julia instead. If, on the other hand, the speed hurdle is in your Python script itself, or in functions written in Python, then to remove it you will probably have to reimplement those functions in a low-level language or, now, in Julia.
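One way to find out which of the two cases applies is to profile the script. The sketch below uses the standard-library `cProfile` and `pstats` modules. The functions `my_own_loop`, `library_call`, and `main` are hypothetical stand-ins for your own code and for a call into a compiled routine (here the built-in `sorted`), not anything from statsmodels.

```python
import cProfile
import io
import pstats


def my_own_loop(n):
    """Stand-in for code written in the script itself."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def library_call(values):
    """Stand-in for a call into a compiled library routine."""
    return sorted(values)


def main():
    my_own_loop(200_000)
    library_call(list(range(200_000, 0, -1)))


# Collect timing data while main() runs.
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Write the report into a string, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If most of the cumulative time sits in your own Python functions, a rewrite is more likely to pay off. If it sits inside a compiled library call, changing the calling language probably won’t help.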
I have run into multiple people in real life in my industry (scientific research/engineering) who think that Julia is a new Python interpreter implementation.
I will say that Julia offers you a lot of great tools to write good code. If you decide to implement some functionality in Julia, you may find that it is faster because you end up implementing a more sophisticated algorithm for the problem, since Julia makes working with more sophisticated ideas simpler.
For example, while Python has dictionaries, C has no built-in dictionary type. So if Python is calling C to do the “fast stuff” but the C programmers are using an algorithm that doesn’t take advantage of a dictionary, then reimplementing it in Julia with a dictionary-based algorithm could give you a massive improvement.
Similarly for many things in DataStructures.jl like Deques or Priority Queues or Disjoint Sets or whatever.
Most of the time, the big improvements come from better algorithms. Julia makes writing sophisticated algorithms a LOT nicer than doing so in C or C++, IMHO.
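As a rough illustration of the dictionary point, here is a sketch of the same task solved with and without a hash table, written in Python for brevity. The task (find two entries that add up to a target) and both function names are hypothetical, chosen only to show how the data structure changes the algorithm.

```python
def find_pair_quadratic(values, target):
    """Check every pair of entries: O(n^2) comparisons, no extra data structure."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] + values[j] == target:
                return (i, j)
    return None


def find_pair_with_dict(values, target):
    """Single pass with a dictionary of values seen so far: O(n) on average."""
    seen = {}  # maps value -> index where it was first seen
    for j, v in enumerate(values):
        needed = target - v
        if needed in seen:
            return (seen[needed], j)
        # Keep the earliest index for each value.
        seen.setdefault(v, j)
    return None


values = [3, 9, 14, 20, 27]
print(find_pair_quadratic(values, 34))  # (2, 3), since 14 + 20 == 34
print(find_pair_with_dict(values, 34))  # (2, 3)
```

The second version does far less work on large inputs, whatever language it is written in. The point above is that a language with convenient built-in dictionaries makes that version the easy one to write.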
I completely agree. My point was that for someone only using Python to call internal libraries to do most of the work, the benefits of Julia might not be clear. This is absolutely not my case, for example. Being used to programming in Fortran, I always disliked Python, because I had to adapt my problems to the use of those libraries, when they existed, and if they didn’t exist I had to go back to Fortran. With Julia I can either use powerful libraries, when I want to and they are available, or write my own code. But I understand that many people are just using the language as a front end for black-box algorithms and, in that case, not much can be improved by changing the language.