Comparing Numba and Julia for a complex matrix computation

I should add that this is almost certainly the wrong way to go about getting performance improvements.

“Highly optimized” code in Python usually means code that is “vectorized” as much as possible — broken into a series of relatively simple array operations that can individually call fast library routines. (For example, calling atan on a bunch of numbers in your code above.) You can write the same sort of code in Julia, of course, but it won’t be magically faster than NumPy — there’s no secret ingredient in Julia that allows us to (for example) subtract two arrays faster.
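To make this concrete, here is a minimal sketch of that “vectorized” style written in Julia (the array names `x` and `y` are just placeholders, not from your code). Each line makes a full pass over the data and allocates a temporary array, exactly like the equivalent NumPy expressions would:

```julia
# "Vectorized" style: each line is one library-level array operation,
# allocating a temporary array — same structure (and similar speed) as NumPy.
x = rand(10^6)
y = rand(10^6)

angles = atan.(y, x)          # elementwise two-argument atan (like np.arctan2)
r      = hypot.(x, y)         # elementwise magnitude sqrt(x^2 + y^2)
result = r .* cos.(angles)    # yet another full pass over the data
```

There is nothing wrong with this code, but there is also nothing Julia-specific that would make it beat NumPy: both are bottlenecked by the same memory traffic over the temporaries.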

Instead, the key to unlocking performance benefits is to realize that in Julia you don’t have to limit yourself to “vectorized library” functions for performance. For example, you perhaps shouldn’t create your array of atan values at all, but instead try to combine the atan computation with subsequent or previous computations. (Loops are fast in Julia, and there are packages like LoopVectorization to make them even faster.)
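As a hedged sketch of what “combining the atan computation with subsequent computations” can look like (the function name and inputs here are hypothetical, not from your code): rather than materializing an array of angles and then reducing over it, a plain Julia loop does everything in one pass with zero intermediate allocations.

```julia
# Fused version: the atan is consumed immediately inside the loop, so no
# temporary array of angles (or magnitudes) is ever allocated.
function fused_sum(xs, ys)
    acc = 0.0
    @inbounds for i in eachindex(xs, ys)
        θ = atan(ys[i], xs[i])            # computed and used on the spot
        acc += sin(θ) * hypot(xs[i], ys[i])
    end
    return acc
end
```

(Julia’s dot syntax fuses too — `sum(@. sin(atan(ys, xs)) * hypot(xs, ys))` compiles to a similar single loop — so you can often keep the vectorized look while still avoiding the temporaries.)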

For the same reason, micro-benchmarks of a single elementary operation (like computing a bunch of atan values) are unlikely to reveal much. NumPy is probably already about as fast as that operation can go in isolation. You really need to look at a non-trivial sequence of calculations.

(If you have trouble with this, feel free to post optimized Python code that performs a dozen lines of “vectorized” calculations, along with sample inputs, and ask for help in producing an optimized Julia equivalent.)
