What is so different in each of these matrix operations?

Hello all,

I am fairly new to Julia and have just started learning the basics of working with matrices. I would like to get a brief understanding of what is different under the hood that causes the huge time differences between the following approaches, when technically they are the same operation and yield the same result.

The functions:

using LinearAlgebra


function train_func(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}

    return pinv(x_train)*y_train

end;

function train_func_2(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}

    # this is assuming n_rows > n_columns
    return inv(transpose(x_train)*x_train)*transpose(x_train)*y_train
    
end;

function train_func_3(x_train::Matrix{Float64}, y_train::Matrix{Float64})::Matrix{Float64}

    return x_train\y_train

end;
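
My understanding of why these are "the same operation": for a full-column-rank x_train, pinv(x_train) equals inv(transpose(x_train)*x_train)*transpose(x_train), so all three return the ordinary least-squares solution of x_train*β ≈ y_train. A quick numeric check of that identity on a small matrix (just a sanity check, not part of the benchmark):

# pinv(x) should match the normal-equations formula when x has full column rank
x = [1. 2.; 3. 4.; 5. 6.]
pinv(x) ≈ inv(transpose(x)*x)*transpose(x)   # expected to evaluate to true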

Testing with a toy example

julia> x = [1. 2.; 3. 4.; 5. 6.]
3×2 Matrix{Float64}:
 1.0  2.0
 3.0  4.0
 5.0  6.0

julia> y = reshape([1. 2. 3.], 3, 1)
3×1 Matrix{Float64}:
 1.0
 2.0
 3.0

julia> train_func(x,y) ≈ train_func_2(x,y) ≈ train_func_3(x,y)
true

Now if we do the same with slightly larger matrices

julia> n, m = 20000, 1000
(20000, 1000)

julia> x = rand(Int64).*rand(Float64, (n,m));

julia> y = rand(Float64, (n,1));

julia> @time train_func(x,y); @time train_func_2(x,y); @time train_func_3(x,y);
  3.946035 seconds (30 allocations: 648.657 MiB, 0.31% gc time)
  0.247796 seconds (9 allocations: 15.771 MiB)
  7.486294 seconds (6.03 k allocations: 168.779 MiB, 0.51% gc time)

Runtime details:

julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

Oops, that looks like a terrible counterexample to what we usually trust…

using LinearAlgebra
using BenchmarkTools

function train_func(x_train, y_train)
    return pinv(x_train)*y_train
end;

function train_func_2(x_train, y_train)
    # this is assuming n_rows > n_columns
    return inv(transpose(x_train)*x_train)*transpose(x_train)*y_train
end;

function train_func_3(x_train, y_train)
    return x_train\y_train
end;

n, m = 20000, 1000
x = rand(Int64).*rand(Float64, (n,m));
y = rand(Float64, n);
result1 = train_func(x,y); 
result2 = train_func_2(x,y); 
@assert result2 ≈ result1
result3 = train_func_3(x,y);
@assert result3 ≈ result2

@btime train_func($x,$y); 
@btime train_func_2($x,$y); 
@btime train_func_3($x,$y);

yielding

  2.713 s (30 allocations: 648.66 MiB)
  194.618 ms (9 allocations: 15.77 MiB)
  4.097 s (6031 allocations: 168.78 MiB)
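
If I understand the dispatch correctly, most of this gap comes down to which factorization each call ends up doing: pinv goes through an SVD of the full 20000×1000 matrix, the rectangular x \ y goes through a column-pivoted QR of it, while the normal-equations variants only need to solve against a 1000×1000 Gram matrix. A rough way to see where the time goes, reusing x from above (a sketch; absolute numbers will differ per machine and BLAS thread count):

using LinearAlgebra, BenchmarkTools

@btime svd($x);                  # the factorization pinv is built on
@btime qr($x, ColumnNorm());     # the factorization behind x \ y for a rectangular dense x
@btime transpose($x) * $x;       # the 1000×1000 Gram matrix used by the normal equations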

Kudos for bringing this up!

Edit: but luckily this

function train_func_4(x_train, y_train)
    return (transpose(x_train)*x_train)\(transpose(x_train)*y_train)
end;

yielding

  125.174 ms (7 allocations: 15.28 MiB)

restores my trust again…
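
One more variation on the same idea, in case it is useful: train_func_4 still sends the Gram matrix through a generic dense solve. Below is a sketch of the same approach with an explicit Cholesky factorization (train_func_5 is just my name for it here), with the usual caveat that any normal-equations formulation squares the condition number of x_train, so \ or pinv on the original matrix remain the more robust choices for ill-conditioned problems:

function train_func_5(x_train, y_train)
    # Gram matrix, wrapped as Symmetric so Cholesky can be applied directly
    G = Symmetric(transpose(x_train) * x_train)
    # solve (x'x)β = x'y via Cholesky instead of a general dense solve
    return cholesky(G) \ (transpose(x_train) * y_train)
end;

Whether this beats train_func_4 probably depends on the BLAS; at this size the dominant cost is forming transpose(x_train)*x_train either way.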


Here you go: Efficient way of doing linear regression - #33 by stevengj

Thanks a lot @zdenek_hurak for sharing the better implementation and the link to the right thread.
