Hello!

I was trying to implement block inverses in Julia and noticed that in some cases it’s faster than the builtin inv() but it also allocates way more memory. Is there any way to improve this function?

This is my implementation:

```
julia> versioninfo()
Julia Version 1.5.0-DEV.416
Commit 178ac974b5 (2020-03-06 21:33 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
julia> using LinearAlgebra
julia> using BenchmarkTools
julia> function block_inv(A)
rows,cols = size(A)
k = A[1:Int(rows/2), 1:Int(cols/2)]
l = A[1:Int(rows/2), Int(cols/2)+1:cols]
m = A[Int(rows/2)+1:rows, 1:Int(cols/2)]
n = A[Int(rows/2)+1:rows, Int(cols/2)+1:cols]
if rows>20 && rows%4 == 0
k_inv = block_inv(k)
dcab_inv = block_inv(n - m*k_inv*l)
else
k_inv = inv(k)
dcab_inv = inv(n - m*k_inv*l)
end
P = [k_inv+k_inv*l*dcab_inv*m*k_inv -k_inv*l*dcab_inv; -1*dcab_inv*m*k_inv dcab_inv]
end
block_inv (generic function with 1 method)
julia> A = rand(80,80);
julia> @btime inv(A)
223.109 μs (6 allocations: 90.98 KiB)
80×80 Array{Float64,2}:
julia> @btime block_inv(A)
172.456 μs (174 allocations: 545.33 KiB)
80×80 Array{Float64,2}:
julia> A = rand(120,120);
julia> @btime inv(A)
496.763 μs (6 allocations: 173.83 KiB)
120×120 Array{Float64,2}:
...
julia> @btime block_inv(A)
411.599 μs (194 allocations: 1.16 MiB)
120×120 Array{Float64,2}:
...
julia> isapprox(inv(A), block_inv(A))
true
```