Why is this simple function twice as slow as its Python version

I cannot reproduce that difference here either. But I noticed two things:

  1. Both are using multi-threading, but Julia is launching 8 threads while Python is launching 4. My laptop has 4 physical cores. Maybe that explains the difference on some systems?

  2. The measured times fluctuate by about 30% in both cases. Here I get anything between 0.9 and 1.2 seconds for either of them (using %timeit or @btime).
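
To test the thread-count idea from point 1, one option is to pin the BLAS thread count to the number of physical cores before NumPy is loaded. The sketch below is only a rough illustration: the helper name `limit_blas_threads` is made up here, and which of the three environment variables actually takes effect depends on the BLAS backend your NumPy build uses (an assumption you would need to check on your own system).

```python
import os

# Environment variables read by common BLAS/OpenMP backends when they load.
# Which one matters depends on the backend NumPy was built against.
BLAS_THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")

def limit_blas_threads(n_threads, env=os.environ):
    """Hypothetical helper: set the thread-count variables in `env`.

    This has to run before numpy is imported, because the backend
    reads these variables when it is first loaded.
    """
    for name in BLAS_THREAD_VARS:
        env[name] = str(n_threads)
    return {name: env[name] for name in BLAS_THREAD_VARS}

# os.cpu_count() reports logical CPUs, so it includes hyperthreads
# (this would be 8 on a laptop with 4 physical cores and 2 threads per core).
logical_cpus = os.cpu_count()

# 4 = the number of physical cores on the laptop described above.
settings = limit_blas_threads(4)
print(logical_cpus, settings)
```

On the Julia side the comparable knob is `BLAS.set_num_threads(4)` after `using LinearAlgebra`, so both versions can be run with the same thread count before comparing timings.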

(I’m using Julia 1.6).
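
On the fluctuation in point 2: note that %timeit reports the mean ± standard deviation, while @btime reports the minimum, so the two headline numbers are not quite the same statistic. A rough way to look at the minimum, the mean, and the spread together with only the standard library is sketched below. The names `timing_spread` and `workload` are made up for illustration, and `workload` is just a stand-in for the real `test(n, k, tmp2)` call.

```python
import statistics
import timeit

def timing_spread(func, repeat=7, number=1):
    """Time `func` `repeat` times and summarize the per-call timings."""
    raw = timeit.repeat(func, repeat=repeat, number=number)
    times = [t / number for t in raw]  # seconds per call
    fastest, slowest = min(times), max(times)
    return {
        "min": fastest,                   # comparable to what @btime prints
        "mean": statistics.mean(times),   # comparable to what %timeit prints
        "rel_spread": (slowest - fastest) / fastest,
    }

def workload():
    # Hypothetical stand-in; replace with the function being benchmarked.
    return sum(i * i for i in range(200_000))

stats = timing_spread(workload)
print(stats)
```

If `rel_spread` comes out around 0.3, as it does here for both languages, a difference of 10 to 20% between two single measurements does not say much.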

Code (Python first, then Julia):
import numpy
from numpy import zeros
from numpy.random import rand
def test(n,k,tmp2):
    for i in range(k):
        t = rand(n, n) 
        for j in range(k):
            tmp1 = rand(n, n)
            tmp2[i*n:(i+1)*n,j*n:(j+1)*n] = t@tmp1
n = 300
k = 30
tmp2=zeros((n*k,n*k))
%timeit test(n, k, tmp2)
# 1.18 s ± 78.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
julia> function test(n::Int64,k::Int64,tmp2::Array{Float64,2})
           for i in 1:k
               t = rand(n, n) 
               for j in 1:k
                   tmp1 = rand(n, n)
                   tmp2[(i-1)*n+1:i*n,(j-1)*n+1:j*n] = t*tmp1
               end
           end
       end;

julia> n = 300
300

julia> k = 30
30

julia> tmp2=zeros((n*k,n*k));

julia> using BenchmarkTools

julia> @btime test($n,$k,$tmp2)
  1.016 s (3660 allocations: 1.23 GiB)