# Recursive-blocked matrix square root

#1

Hello, this is about a recursive implementation of the `sqrtm` algorithm. The idea is that, instead of finding the solution `X` to `X^2 = A` on a pointwise basis, calculating each entry starting with the diagonal and working outwards, `A` is split up into `[A11 A12; 0 A22]`, smaller solutions `X11, X22` are found as before, and `X12` is the result of the Sylvester equation. This follows Deadman Higham Ralha and should produce a sizeable speedup. … But it’s not as fast as I would like it to be.

Any improvements that can be made to `recsqrtm`? Also in memory usage

``````function recsqrtm{T}(A::AbstractArray{T})
S = schurfact(A)
S[:vectors] * _recsqrtm(S[:Schur]) * S[:vectors]'
end
function _recsqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)

@inbounds if n == 1
R[1,1] = sqrt(A[1,1])
elseif n == 2
R[1,1] = r11 = sqrt(A[1,1])
R[2,2] = r22 = sqrt(A[2,2])
R[1,2] = A[1,2] / (r11 + r22)
else
n1 = Int(ceil(n/2))
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _recsqrtm(A[r1,r1])
R[r2,r2] = _recsqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end
``````

From `ProfileView` most of the time is spent in `trsyl!`.

How does one calibrate this kind of thing? See below for block-wise recursive code. How does one choose an appropriate block size? The block size (the point at which one reverts to the pointwise `sqrtm` method) doesn’t seem to matter much, whether `1` (as above, in `recsqrtm`) or `64` as below. Is there a standard set of matrices/matrix types to test against, or systems to test on? Looks to me like recursive performance is an improvement from matrices of dimension 1000 - 2000, so should that be hard-coded? What about parallel computation?

``````function _blksqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)

@inbounds if n < 65
return sqrtm(UpperTriangular(A),Val{true})
else
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)
n1 = Int(ceil(n/2))
if !iszero(A[n1+1,n1]) # hit a 2x2-block on diagonal, avoid
n1 -= 1
end
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _blksqrtm(A[r1,r1])
R[r2,r2] = _blksqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end
``````

#2

Try putting `@views` in front of `function` (requires Compat in Julia 0.5).