Hello all,
I’m fairly new to Julia. I have a simple array function that I’m using to compare the speed of different languages,
in this case Julia and Cython (a superset of Python that compiles to C).
The function in Julia is:
function compute_array_normal(m, n)
    x = zeros(Int32, (m, n))
    for i = 0:m - 1
        for j = 0:n - 1
            x[i+1, j+1] = Int32(i*i + j*j)
        end
    end
    return x
end
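As a cross-check on what the function computes, here is a minimal pure-Python sketch of the same loop. The name `compute_array_reference` is hypothetical and not part of the benchmark; it is only meant for verifying values on small inputs, not for timing.

```python
def compute_array_reference(m, n):
    """Return an m-by-n nested list where entry [i][j] is i*i + j*j (0-based)."""
    return [[i * i + j * j for j in range(n)] for i in range(m)]


# The last element of a t-by-t array is 2 * (t - 1)**2.
small = compute_array_reference(4, 4)
print(small[3][3])  # 3*3 + 3*3 = 18
```

For t = 10000 the same formula gives 2 * 9999**2 = 199960002, which fits in an Int32.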
The multiprocess version in Julia is:
using Distributed
using SharedArrays
addprocs(3)
function compute_array_distributed(m, n)
    x = SharedArray{Int32}(m, n)
    @sync @distributed for i = 0:m - 1
        for j = 0:n - 1
            x[i+1, j+1] = Int32(i*i + j*j)
        end
    end
    return x
end
The Cython version is:
# cython: infer_types=True
# distutils: extra_compile_args = -fopenmp
# distutils: extra_link_args = -fopenmp
cimport cython
from cython import nogil
from cython.parallel import prange
from cython.view cimport array as cvarray
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef cvarray compute_array_cython(int m, int n):
    cdef cvarray x = cvarray(shape=(m, n), itemsize=sizeof(int), format="i")
    cdef int [:, ::1] x_view = x
    cdef int i, j
    for i in prange(m, nogil=True):
        for j in prange(n):
            x_view[i, j] = i*i + j*j
    return x
Running the Julia versions with --check-bounds=no:
# to compile the function the first time
@everywhere t0 = 100
compute_array_normal(t0, t0)
compute_array_distributed(t0, t0)
@everywhere t = 10000
@time println(compute_array_normal(t, t)[t, t])
@time println(compute_array_distributed(t, t)[t, t])
I get:
199960002
0.803119 seconds (180.73 k allocations: 392.197 MiB, 0.84% gc time, 9.57% compilation time)
199960002
0.415599 seconds (23.74 k allocations: 1.287 MiB, 2.79% compilation time)
Running the Cython version, after compiling:
# import the time() function itself so that time() is callable
from time import time
from compute_array import *
t = 10000
s = time()
print(compute_array_cython(t, t)[t-1, t-1])
print(time() - s)
I get:
199960002
0.0892179012298584 seconds
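Since a single timed run can be noisy, a best-of-several measurement may give a fairer number on the Python side. The sketch below uses `time.perf_counter` from the standard library. `best_of` and the stand-in workload are hypothetical names for illustration; in practice the callable would be something like `lambda: compute_array_cython(t, t)`.

```python
from time import perf_counter


def best_of(func, repeats=5):
    """Call func() `repeats` times and return (minimum elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = perf_counter()
        result = func()
        elapsed = perf_counter() - start
        best = min(best, elapsed)
    return best, result


# Stand-in workload so the sketch runs without the compiled extension.
elapsed, value = best_of(lambda: sum(i * i for i in range(1000)))
print(value, elapsed)
```

Taking the minimum rather than the mean is a design choice here: it reduces the influence of one-off delays such as other processes competing for the CPU.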
Cython being roughly 9x faster than the serial Julia version (and about 4.7x faster than the distributed one) seems off to me, especially given how fast Julia can be. I’ve gone over everything twice and can’t find the cause, so I suspect I’ve overlooked something, hence this post.
Any advice on how to speed this function up in Julia would be greatly appreciated.