How do I lessen memory allocation and speedup block inverse?


I was trying to implement block inverses in Julia and noticed that in some cases it’s faster than the builtin inv() but it also allocates way more memory. Is there any way to improve this function?

This is my implementation:

julia> versioninfo()
Julia Version 1.5.0-DEV.416
Commit 178ac974b5 (2020-03-06 21:33 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

julia> using LinearAlgebra

julia> using BenchmarkTools

julia> function block_inv(A)
           rows,cols = size(A)
           k = A[1:Int(rows/2), 1:Int(cols/2)]
           l = A[1:Int(rows/2), Int(cols/2)+1:cols]
           m = A[Int(rows/2)+1:rows, 1:Int(cols/2)]
           n = A[Int(rows/2)+1:rows, Int(cols/2)+1:cols]
           if rows>20 && rows%4 == 0
               k_inv = block_inv(k)
               dcab_inv = block_inv(n - m*k_inv*l)
               k_inv = inv(k)
               dcab_inv = inv(n - m*k_inv*l)
           P = [k_inv+k_inv*l*dcab_inv*m*k_inv  -k_inv*l*dcab_inv; -1*dcab_inv*m*k_inv dcab_inv]
block_inv (generic function with 1 method)

julia> A = rand(80,80);

julia> @btime inv(A)
  223.109 μs (6 allocations: 90.98 KiB)
80×80 Array{Float64,2}:

julia> @btime block_inv(A)
  172.456 μs (174 allocations: 545.33 KiB)
80×80 Array{Float64,2}:

julia> A = rand(120,120);

julia> @btime inv(A)
  496.763 μs (6 allocations: 173.83 KiB)
120×120 Array{Float64,2}:

julia> @btime block_inv(A)
  411.599 μs (194 allocations: 1.16 MiB)
120×120 Array{Float64,2}:

julia> isapprox(inv(A), block_inv(A))

You get allocations from:

  • Not using views for k, l, m, n
  • Doing the matrix mutliplication and subtraction in - m*k_inv*l
  • The call to inv.
  • Creating the output P.

You just have to figure out how to share memory between the computations / through the recursion. There isn’t much Julia specific about this.


Also, minor nitpick: integer division can be done using div.