# What is so different in each of these matrix operations?

Hello all,

I am fairly new to Julia and have just started learning the basics of `Matrix`. I would like a brief understanding of what is so different under the hood that causes the huge time differences between the following approaches, when they are technically the same operation yielding the same result.

The functions:

```julia
using LinearAlgebra

function train_func(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}
    return pinv(x_train)*y_train
end;

function train_func_2(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}
    # this is assuming n_rows > n_columns
    return inv(transpose(x_train)*x_train)*transpose(x_train)*y_train
end;

function train_func_3(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}
    return x_train\y_train
end;
```
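For context (my summary, not from the thread): the three functions solve the same least-squares problem via different factorizations. `pinv` computes a full SVD of `x_train`, the normal-equations version only ever solves a small `m×m` system, and `\` on a tall `Matrix{Float64}` uses, I believe, a column-pivoted QR factorization (expressible as `qr(x, ColumnNorm())` on Julia ≥ 1.7). A minimal sketch of the three routes:

```julia
using LinearAlgebra

x = [1.0 2.0; 3.0 4.0; 5.0 7.0]   # tall, full column rank
y = [1.0, 2.0, 3.0]

b_svd = pinv(x) * y               # SVD-based pseudoinverse
b_ne  = (x'x) \ (x'y)             # normal equations: small m×m solve
b_qr  = qr(x, ColumnNorm()) \ y   # pivoted QR, as used by x\y for rectangular matrices

b_svd ≈ b_ne ≈ b_qr               # all three agree on well-conditioned problems
```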

Testing with a toy example:

```julia
julia> x = [1. 2.; 3. 4.; 5. 6.]
3×2 Matrix{Float64}:
 1.0  2.0
 3.0  4.0
 5.0  6.0

julia> y = reshape([1. 2. 3.], 3, 1)
3×1 Matrix{Float64}:
 1.0
 2.0
 3.0

julia> train_func(x,y) ≈ train_func_2(x,y) ≈ train_func_3(x,y)
true
```

Now if we do the same with slightly larger matrices:

```julia
julia> n, m = 20000, 1000
(20000, 1000)

julia> x = rand(Int64).*rand(Float64, (n,m));

julia> y = rand(Float64, (n,1));

julia> @time train_func(x,y); @time train_func_2(x,y); @time train_func_3(x,y);
  3.946035 seconds (30 allocations: 648.657 MiB, 0.31% gc time)
  0.247796 seconds (9 allocations: 15.771 MiB)
  7.486294 seconds (6.03 k allocations: 168.779 MiB, 0.51% gc time)
```
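A back-of-the-envelope flop count (my own rough estimate, not from the thread) explains part of the ordering: `pinv` pays for a full SVD of the 20000×1000 matrix, `x\y` factorizes that same large matrix with QR, while the normal-equations route only forms the 1000×1000 Gram matrix and solves a small system:

```julia
n, m = 20_000, 1_000

# Householder QR of the full n×m matrix (roughly what x\y factorizes)
flops_qr = 2n*m^2 - 2m^3/3

# normal equations: one rank-n update to form x'x, then an m×m Cholesky-style solve
flops_ne = n*m^2 + m^3/3

flops_qr / flops_ne   # ≈ 2, before accounting for implementation overheads
```

The measured gap is larger than 2× because (as far as I know) `x\y` for rectangular matrices uses *column-pivoted* QR (LAPACK `geqp3`), which runs well below BLAS-3 speed, whereas `x'x` is a single highly optimized matrix multiply; and the SVD behind `pinv` has a much larger constant still.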

Runtime details:

```julia
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
```

Oops, that looks like a terrible counterexample to what we usually trust…

```julia
using LinearAlgebra
using BenchmarkTools

function train_func(x_train, y_train)
    return pinv(x_train)*y_train
end;

function train_func_2(x_train, y_train)
    # this is assuming n_rows > n_columns
    return inv(transpose(x_train)*x_train)*transpose(x_train)*y_train
end;

function train_func_3(x_train, y_train)
    return x_train\y_train
end;

n, m = 20000, 1000
x = rand(Int64).*rand(Float64, (n,m));
y = rand(Float64, n);
result1 = train_func(x,y);
result2 = train_func_2(x,y);
@assert result2 ≈ result1
result3 = train_func_3(x,y);
@assert result3 ≈ result2

@btime train_func($x,$y);
@btime train_func_2($x,$y);
@btime train_func_3($x,$y);
```

yielding

```
  2.713 s (30 allocations: 648.66 MiB)
  194.618 ms (9 allocations: 15.77 MiB)
  4.097 s (6031 allocations: 168.78 MiB)
```

Kudos for bringing this up!

Edit: but luckily this

```julia
function train_func_4(x_train, y_train)
    return (transpose(x_train)*x_train)\(transpose(x_train)*y_train)
end;
```

yielding

```
  125.174 ms (7 allocations: 15.28 MiB)
```

repairs my trust again…
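One caveat worth adding (my note, not benchmarked in the thread): `train_func_2` and `train_func_4` square the condition number of the problem, since cond(x'x) = cond(x)², so on ill-conditioned data the slower QR-based `x\y` keeps roughly twice as many accurate digits. A small sketch with a synthetic matrix of known condition number (the construction via a thin `Q` is my own illustration):

```julia
using LinearAlgebra

# build a 100×10 matrix with singular values 1 … 1e-6, i.e. cond(x) ≈ 1e6
Q = Matrix(qr(randn(100, 10)).Q)               # thin orthonormal factor
x = Q * Diagonal(exp10.(range(0, -6, length=10)))
y = x * ones(10)                               # exact solution: all ones

err_qr = norm(x \ y .- 1)                      # QR sees cond(x) ≈ 1e6
err_ne = norm((x'x) \ (x'y) .- 1)              # normal equations see cond(x)^2 ≈ 1e12
```

With these numbers `err_qr` should typically come out several orders of magnitude smaller than `err_ne`, which is the usual argument for `\` despite its higher cost.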


Thanks a lot @zdenek_hurak for sharing the better implementation and the link to the right thread.
