Why Julia and the LinearAlgebra module are incredibly slow compared to C++ and Eigen

To be honest, before this post I didn't know about the JIT compilation overhead in Julia. I read all of your comments and fixed my code. First, I used BenchmarkTools to evaluate the execution speed and saw a significant improvement after the JIT warm-up. Second, I replaced all variables of type AbstractArray (dynamic memory allocation) with SVector and SMatrix where possible. Finally, I used the @view macro to slice with SubArrays instead of copying.
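As a minimal illustration of the measurement pattern (the kernel `f` below is a hypothetical toy, not part of my actual code):

```julia
using BenchmarkTools
using StaticArrays

# Hypothetical toy kernel, just to show the benchmarking pattern.
f(x) = x * x'

v = @SVector [1.0, 2.0, 3.0]

# The first call triggers JIT compilation; @btime runs the expression many
# times and reports the minimum, so compilation is excluded. Interpolating
# `v` with `$` avoids measuring the overhead of an untyped global variable.
@btime f($v)
```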

function compute_homograpy_DLT(M, P)
    A = zeros(8, 9)              # two constraint rows per point correspondence
    for i in 1:size(M, 1)
        A[2i - 1, 1:3] = @view M[i, :]
        A[2i - 1, 7:9] = -P[i, 1] * @view M[i, :]
        A[2i,     4:6] = @view M[i, :]
        A[2i,     7:9] = -P[i, 2] * @view M[i, :]
    end
    return nullspace(A)          # homography vector spans the null space of A
end
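As a quick sanity check (a sketch with hypothetical inputs, repeating the routine so the snippet runs standalone), the returned vector should be a unit column spanning the null space of `A`:

```julia
using LinearAlgebra

# Same routine as in the post, repeated so this snippet is self-contained.
function compute_homograpy_DLT(M, P)
    A = zeros(8, 9)
    for i in 1:size(M, 1)
        A[2i - 1, 1:3] = @view M[i, :]
        A[2i - 1, 7:9] = -P[i, 1] * @view M[i, :]
        A[2i,     4:6] = @view M[i, :]
        A[2i,     7:9] = -P[i, 2] * @view M[i, :]
    end
    return nullspace(A)
end

# Hypothetical inputs: four homogeneous points and generic image coordinates.
M = [0.0 0.0 1.0; 1.0 0.0 1.0; 1.0 1.0 1.0; 0.0 1.0 1.0]
P = [0.1 0.2; 0.9 0.1; 0.8 0.8; 0.2 0.9]

# For points in general position A has rank 8, so the null space is
# one-dimensional and nullspace returns a single orthonormal column.
h = compute_homograpy_DLT(M, P)
```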

and

K = @SMatrix [1169.19630     0.0       652.98743;
                 0.0      1169.61014  528.83429;
                 0.0         0.0        1.0]
T = @SMatrix [ 0.961255  -0.275448  0.0108487   112.79;
               0.171961   0.629936  0.75737    -217.627;
              -0.21545   -0.72616   0.652895   1385.13]

R = @view T[1:3, 1:3]
t = @view T[:,end]

tl = position_in_world(0.0, 0.0)
tr = position_in_world(1123.0, 0.0)
br = position_in_world(1123.0, 791.0)
bl = position_in_world(0.0, 791.0)

p1 = π_projection_function(K, R, t, tl)
p2 = π_projection_function(K, R, t, tr)
p3 = π_projection_function(K, R, t, br)
p4 = π_projection_function(K, R, t, bl)
P = [p1 p2 p3 p4]'

M = @SMatrix [   0.0    0.0  1.0;
              1123.0    0.0  1.0;
              1123.0  791.0  1.0;
                 0.0  791.0  1.0]

and re-execute the code:

151.783 μs (133 allocations: 13.27 KiB)

BUT, for a fair comparison between Julia and Eigen, I also modified my C++ code. I ran it for 100,000 iterations, and the average time is:

elapsed time (microseconds): 34.2004

So the C++ code is still about 5 times faster, and I think the remaining gap comes from the null-space computation.
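If the null space really is the hot spot, one way to probe it (a sketch on a random stand-in matrix, not a benchmark of my actual code) is to compare `nullspace` against extracting the last right singular vector from a full SVD directly, which is essentially what `nullspace` computes internally plus a rank estimate:

```julia
using LinearAlgebra

# Hypothetical 8×9 matrix standing in for the DLT matrix A built above;
# with probability 1 it has rank 8, so the null space is one-dimensional.
A = randn(8, 9)

# nullspace computes a full SVD and keeps the right singular vectors whose
# singular values fall below a tolerance.
h_ns = vec(nullspace(A))

# Equivalent for a rank-8 A: the last row of Vt from a full SVD.
h_svd = svd(A; full = true).Vt[end, :]

# Both unit vectors annihilate A (they agree up to sign).
maximum(abs, A * h_svd)
```

Both paths go through LAPACK's dense SVD and allocate intermediate arrays, which would explain why this step dominates the remaining gap against Eigen.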