I am working with large matrices (about 30k rows and ~100 columns). I am doing some matrix multiplication, and the process takes around 20 seconds. This is my code:

```
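@time begin
    # `size` and `split` are my own integer variables (they shadow Base.size and Base.split).
    # `data` and `Qg` are the two input matrices.
    result = -1       # largest sum seen so far
    best_entry = -1   # [1,1] entry of the product that produced `result`
    for i = 1:size
        first_matrix = @view data[i * split, :]
        for j = 1:size
            second_matrix = @view Qg[j * split, :]
            matrix_multiplication = first_matrix * second_matrix'
            current_sum = sum(matrix_multiplication)
            global result, best_entry
            if current_sum > result
                result = current_sum
                best_entry = matrix_multiplication[1, 1]
            end
        end
    end
end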
```
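For anyone who wants to reproduce the timing, here is a minimal self-contained sketch of the serial loop. The random inputs, the values of `nblocks` and `stride`, and the names `A`, `B`, and `best_block_sum` are placeholders made up for illustration; they are not my real data or parameters.

```julia
using LinearAlgebra  # vector * adjoint-vector products

# Hypothetical stand-ins for the real inputs: two 30_000 x 100 random matrices.
A = rand(30_000, 100)
B = rand(30_000, 100)

# Hypothetical parameters: visit every `stride`-th row, `nblocks` rows per matrix.
nblocks = 100
stride = 300

# Return the largest sum over all products of a selected row of A
# with the transpose of a selected row of B.
function best_block_sum(A, B, nblocks, stride)
    best = -Inf
    for i in 1:nblocks
        a = @view A[i * stride, :]
        for j in 1:nblocks
            b = @view B[j * stride, :]
            s = sum(a * b')   # a is a vector and b' a row, so a * b' is an outer product
            best = max(best, s)
        end
    end
    return best
end

@time best_block_sum(A, B, nblocks, stride)
```

Wrapping the loop in a function avoids the global variables used above, so timings from this sketch may not match the numbers I quoted.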

To optimize this further, I tried multi-threading (`julia --threads 4`) to get better performance.

```
@time begin
    global result = -1       # largest sum seen so far
    global best_entry = -1   # [1,1] entry of the product that produced `result`
    lk = ReentrantLock()     # named `lk` so it does not shadow the `lock` function
    for i = 1:size
        first_matrix = @view data[i * split, :]
        Threads.@threads for j = 1:size
            second_matrix = @view Qg[j * split, :]
            matrix_multiplication = first_matrix * second_matrix'
            current_sum = sum(matrix_multiplication)
            global result, best_entry
            if current_sum > result
                lock(lk)
                result = current_sum
                best_entry = matrix_multiplication[1, 1]
                unlock(lk)
            end
        end
    end
end
```
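To rule out a launch problem, the thread count can be checked from inside the session. This is a small sketch using `Threads.nthreads()` and `Sys.CPU_THREADS`; with `--threads 4` I would expect the first line to report 4, but I have not pasted my own output here.

```julia
# Number of Julia threads available to Threads.@threads in this session.
println("Julia threads: ", Threads.nthreads())

# Number of logical CPU threads reported by the operating system.
println("CPU threads:   ", Sys.CPU_THREADS)
```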

By adding multi-threading I expected an increase in performance, but the performance got worse (~40 seconds). I removed the lock to see if that was the issue, but still got the same timing. I am running this on a dual-core Intel Core i5 (MacBook Pro). Does anyone know why my multi-threaded code doesn't speed things up?