Python3 + @njit faster than Julia (rs)

Consider the following simple test:

julia> using Chairmarks

julia> function test(f, m)
           size = length(f)
           for i in 1:size
               for j in 1:size
                   dij = m[i,j]
                   f[i] = dij
                   f[j] = -dij 
               end
           end
           return f
       end
test (generic function with 1 method)

julia> f = zeros(1000); m = rand(1000,1000);

julia> @b test($f, $m)
797.525 μs

The same code runs faster in Python, using numba's @njit:

In [9]: import numpy as np

In [10]: from numba import njit, prange

In [11]: @njit(parallel=False, fastmath=False)
    ...: def test(f, m):
    ...:     size = len(f)
    ...:     for i in prange(size):
    ...:         for j in prange(size):
    ...:             dij = m[i,j]
    ...:             f[i] = dij
    ...:             f[j] = -dij
    ...:     return f
    ...: 

In [13]: f = np.zeros(1000)

In [14]: m = np.random.random((1000,1000))

In [15]: %timeit test(f,m)
455 µs ± 2.97 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

I know how to rewrite the code in Julia to use @turbo and make it faster in this case; that is not the point. Is there a good reason for the Julia compiler not to do the same job as the @njit compiler here? Or is there another reason for the difference?

(Adding @inbounds alone does not make a difference there.)
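For reference, this is a sketch of what "adding @inbounds" means here — the same loop order as the original, with bounds checking disabled on the array accesses. As noted above, this alone does not close the gap, since the bottleneck is the memory-access pattern, not the bounds checks:

```julia
# Hypothetical variant of the original test, with bounds checks removed.
# Only safe because i and j are guaranteed to be valid indices of f and m.
function test_inbounds(f, m)
    n = length(f)
    @inbounds for i in 1:n
        for j in 1:n
            dij = m[i, j]
            f[i] = dij
            f[j] = -dij
        end
    end
    return f
end
```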

Reversing the loops gave me a significant performance speed-up in Julia:

# your original function
julia> @btime test($f, $m);
  1.945 ms (0 allocations: 0 bytes)


julia> function test2(f, m)
                 size = length(f)
                 for j in 1:size
                     for i in 1:size
                         dij = m[i,j]
                         f[i] = dij
                         f[j] = -dij 
                     end
                 end
                 return f
             end
test2 (generic function with 1 method)

julia> @btime test2($f, $m);
  799.093 μs (0 allocations: 0 bytes)

Oh, yes, of course… dumb. Thanks.

I am wondering if Julia is now on par with numba?

Yes, it is. But my actual code is different from that, and I may update the post if I don't find the reason for the difference now.

Surprisingly, even with `julia -O3`, the loops in test(f, m) are apparently not reordered.

Loop re-ordering is a very hard compiler optimization, and LLVM currently does not handle loop transformations like this well.
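If you cannot (or do not want to) swap the loops by hand, one workaround is to do the reordering in the data instead of the code: transpose the matrix once up front, so the inner j loop walks down a column of the transposed matrix, which is contiguous in Julia's column-major layout. A minimal sketch (`test_transposed` is a hypothetical name; it reproduces the original function's writes, with `mt[j, i]` holding what `m[i, j]` held, at the cost of one O(n²) `permutedims` up front):

```julia
# Same semantics as the original test(f, m), but expects mt = permutedims(m),
# so the inner loop over j varies the *first* index of mt: contiguous access.
function test_transposed(f, mt)
    n = length(f)
    for i in 1:n
        for j in 1:n
            dij = mt[j, i]   # equals m[i, j] when mt = permutedims(m)
            f[i] = dij
            f[j] = -dij
        end
    end
    return f
end

# usage sketch:
# mt = permutedims(m)
# test_transposed(f, mt)
```

Whether this pays off depends on how often the loop runs relative to the one-time transpose cost.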

Isn’t this the performance tip Access arrays in memory order, along columns? You’re looping over m[i,j] in row-major order, which is better for languages like C and Python (NumPy's default layout), while Julia and Fortran use column-major order. Running the exact same code in Julia/Fortran and in C/Python is pretty much guaranteed to favour one set of languages over the other whenever this pattern appears.

Yes, of course. I’m simplifying other code where I have the performance issue, and I ended up with this MWE. But it does not capture the original problem, so I’ll have to dig further.

Isn’t this what LoopModels is supposed to do? That said, since Chris has moved to Modular I have a nagging suspicion that LoopModels probably won’t see much more development :pensive_face:.

Oh no! Damn the Python universe, it just drowns out everything.