Hello, this is about a recursive implementation of the `sqrtm`

algorithm. The idea is that, instead of finding the solution `X`

to `X^2 = A`

on a pointwise basis, calculating each entry starting with the diagonal and working outwards, `A`

is split up into `[A11 A12; 0 A22]`

, smaller solutions `X11, X22`

are found as before, and `X12`

is the result of the Sylvester equation. This follows Deadman Higham Ralha and should produce a sizeable speedup. … But it’s not as fast as I would like it to be.

**Any improvements that can be made to recsqrtm? Also in memory usage**

```
function recsqrtm{T}(A::AbstractArray{T})
S = schurfact(A)
S[:vectors] * _recsqrtm(S[:Schur]) * S[:vectors]'
end
function _recsqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)
@inbounds if n == 1
R[1,1] = sqrt(A[1,1])
elseif n == 2
R[1,1] = r11 = sqrt(A[1,1])
R[2,2] = r22 = sqrt(A[2,2])
R[1,2] = A[1,2] / (r11 + r22)
else
n1 = Int(ceil(n/2))
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _recsqrtm(A[r1,r1])
R[r2,r2] = _recsqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end
```

From `ProfileView`

most of the time is spent in `trsyl!`

.

**How does one calibrate this kind of thing?** See below for block-wise recursive code. How does one choose an appropriate block size? The block size (the point at which one reverts to the pointwise `sqrtm`

method) doesn’t seem to matter much, whether `1`

(as above, in `recsqrtm`

) or `64`

as below. Is there a standard set of matrices/matrix types to test against, or systems to test on? Looks to me like recursive performance is an improvement from matrices of dimension 1000 - 2000, so should that be hard-coded? What about parallel computation?

```
function _blksqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)
@inbounds if n < 65
return sqrtm(UpperTriangular(A),Val{true})
else
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)
n1 = Int(ceil(n/2))
if !iszero(A[n1+1,n1]) # hit a 2x2-block on diagonal, avoid
n1 -= 1
end
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _blksqrtm(A[r1,r1])
R[r2,r2] = _blksqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end
```