# KrylovKit convergence

Hello,

I have just started learning Julia, and I am attempting to use it to solve a one-dimensional eigenvalue problem in my field, plasma physics (specifically, the tearing mode, if anyone is interested). The problem involves a very thin boundary layer around the center of the domain, which might be the source of the difficulties I’m having.

Since I’m only interested in the growing mode, i.e. the eigenvalue of the problem with the largest real part, after looking around I thought to use KrylovKit to do this. However, it seems to have quite a bit of trouble converging, even when my resolution is sufficient to get a good solution using brute force (just using Julia’s eigen()). Here is my code:

f(x)=tanh(x)
ddf(x)=-2*tanh(x)*(sech(x))^2

function vargrid(dxmin,xlim,eps)
if eps > 0.0
n = convert(Int64,floor(log(xlim*eps/dxmin+1)/log((1+eps))))+1
else
n = convert(Int64,xlim/dxmin)+1
end

dxmax = dxmin*(1+eps)^n
x=zeros(2*n-1)
dx=zeros(2*n)
dx[n+1]=dxmin
dx[n]=dxmin
for i in (n+1):2*n-1
dx[i+1]=dx[i]*(1+eps)
x[i]=x[i-1]+dx[i]
end
for i in n-1:-1:1
dx[i]=dx[2*n+1-i]
x[i]=-x[2*n-i]
end

return x,dx
end

function tearingVG(S,K,ar,dxmin,xlim,eps)

x,dx=vargrid(dxmin,xlim,eps)

N=length(x)
DD=zeros(N,N)
for i in 1:N, j in 1:N
i-j==0 && ( DD[i,j] = -2.0*(1.0+dx[i+1]/dx[i])/(dx[i+1]^2+dx[i]*dx[i+1]) )
(i-j)==-1 && ( DD[i,j] = 2.0/(dx[i+1]^2+dx[i]*dx[i+1]) )
(i-j)==1 && ( DD[i,j] = 2.0*(dx[i+1]/dx[i])/(dx[i+1]^2+dx[i]*dx[i+1]) )
end

DD[1,1]=DD[1,1]+2.0*((dx/dx)/(1+K*dx))/(dx^2+dx*dx)
DD[N,N]=DD[N,N]+2.0/(1+K*dx[N+1])/(dx[N+1]^2+dx[N]*dx[N+1])

L=DD-I*K^2
A=zeros(2*N,2*N)
B=zeros(2*N,2*N)
A=complex(A)
B=complex(B)
B[1:N,1:N]=1.0*I(N)
B[N+1:end,N+1:end]=L
A[1:N,1:N]=L/S - im*ar*diagm(f.(x))*K
A[1:N,N+1:end]=im*diagm(f.(x))*K
A[N+1:end,1:N]=im*diagm(f.(x))*K*L-im*diagm(ddf.(x))*K
A[N+1:end,N+1:end]=-im*ar*diagm(f.(x))*K*L+im*ar*diagm(ddf.(x))*K

return eigen(A,B)

end

function tearing_KrylovKit(S,K,ar,dxmin,xlim,eps)

x,dx=vargrid(dxmin,xlim,eps)

N=length(x)
DD=zeros(N,N)
for i in 1:N, j in 1:N
i-j==0 && ( DD[i,j] = -2.0*(1.0+dx[i+1]/dx[i])/(dx[i+1]^2+dx[i]*dx[i+1]) )
(i-j)==-1 && ( DD[i,j] = 2.0/(dx[i+1]^2+dx[i]*dx[i+1]) )
(i-j)==1 && ( DD[i,j] = 2.0*(dx[i+1]/dx[i])/(dx[i+1]^2+dx[i]*dx[i+1]) )
end

DD[1,1]=DD[1,1]+2.0*((dx/dx)/(1+K*dx))/(dx^2+dx*dx)
DD[N,N]=DD[N,N]+2.0/(1+K*dx[N+1])/(dx[N+1]^2+dx[N]*dx[N+1])

L=DD-I*K^2
A=zeros(2*N,2*N)
B=zeros(2*N,2*N)
A=complex(A)
B=complex(B)
B[1:N,1:N]=1.0*I(N)
B[N+1:end,N+1:end]=L
A[1:N,1:N]=L/S - im*ar*diagm(f.(x))*K
A[1:N,N+1:end]=im*diagm(f.(x))*K
A[N+1:end,1:N]=im*diagm(f.(x))*K*L-im*diagm(ddf.(x))*K
A[N+1:end,N+1:end]=-im*ar*diagm(f.(x))*K*L+im*ar*diagm(ddf.(x))*K
C=inv(B)*A

return eigsolve(C,1,:LR,maxiter=1000)

end


vargrid sets up the grid, while the other two functions use either eigen or KrylovKit’s eigsolve to solve the problem. Typical values are
S ~ 10^4-10^8
K ~ 1-10^-4
ar=0 (for now)
dxmin=10^-3 - 10^-5
xlim=10
eps=0.02

The typical values of the eigenvalue I’m looking for are small but positive and real, while typically all other eigenvalues have negative real parts. I am not experienced with Krylov methods, so any advice would be greatly appreciated; I thought that this would work quite well since the required eigenvalue is on the edge of the spectrum, but I’m sure there is some subtlety I’m overlooking. Thanks in advance!

It looks like there a lot of nearby eigenvalues (right around 0), which is probably why it’s converging slowly even though the desired eigenvalue is on the edge of the spectrum.

However you should be able to make the iterations enormously faster. Your B matrix is tridiagonal, and your A matrix is sparse. When you compute C = inv(B) \ A explicitly you get a dense matrix, but to use an iterative method you only need to provide a function x -> C*x, which can be computed in O(N) time (orders of magnitude faster) by x -> B \ (A*x), provided that you:

• Store B in a Tridiagonal structure — you only need to supply the diagonal and upper/lower diagonal values. (Tridiagonal has a very fast B \ vector implementation.)
• Store A in a generic sparse matrix data structure (or just write a matrix-free routine to multiply a vector by A, which can be even more efficient because A has such a regular structure).
• Never allocate any N \times N dense matrix. (Your code, in contrast, allocates a huge number of these just to assemble A and B, even putting aside the cost of explicitly inverting B and forming C.)

(If you had some estimate µ for the eigenvalue you want then you could potentially use shift-and-invert to accelerate things. Again, though, you would not explicitly invert any matrix, but rather do sparse solves. Almost never invert matrices, especially sparse ones!)

5 Likes

Thanks a lot for this! I replaced the functions with the following

function tearing_sparse(S,K,ar,dxmin,xlim,eps)

x,dx=vargrid(dxmin,xlim,eps)

N=length(x)
ld=2.0*(dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
ud=2.0./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
d=-2.0*(1.0.+dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
DD=Tridiagonal(ld[2:end],d,ud[1:end-1])
DD[1,1]=DD[1,1]+2.0*((dx/dx)/(1+K*dx))/(dx^2+dx*dx)
DD[end,end]=DD[end,end]+2.0/(1+K*dx[N+1])/(dx[N+1]^2+dx[N]*dx[N+1])

L=DD-I*K^2
A=complex(spzeros(2*N,2*N))
B=complex(Tridiagonal(fill(0.0,2*N-1),fill(1.0,2*N),fill(0.0,2*N-1)))
B[N+1:end,N+1:end]=L
A[1:N,1:N]=L/S - im*ar*diagm(f.(x))*K
A[1:N,N+1:end]=im*diagm(f.(x))*K
A[N+1:end,1:N]=im*diagm(f.(x))*K*L-im*diagm(ddf.(x))*K
A[N+1:end,N+1:end]=-im*ar*diagm(f.(x))*K*L+im*ar*diagm(ddf.(x))*K

return eigen(B \ A)

end

function tearing_KK_sparse(S,K,ar,dxmin,xlim,eps)

x,dx=vargrid(dxmin,xlim,eps)

N=length(x)
ld=2.0*(dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
ud=2.0./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
d=-2.0*(1.0.+dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
DD=Tridiagonal(ld[2:end],d,ud[1:end-1])
DD[1,1]=DD[1,1]+2.0*((dx/dx)/(1+K*dx))/(dx^2+dx*dx)
DD[end,end]=DD[end,end]+2.0/(1+K*dx[N+1])/(dx[N+1]^2+dx[N]*dx[N+1])

L=DD-I*K^2
A=complex(spzeros(2*N,2*N))
B=complex(Tridiagonal(fill(0.0,2*N-1),fill(1.0,2*N),fill(0.0,2*N-1)))
B[N+1:end,N+1:end]=L
A[1:N,1:N]=L/S - im*ar*diagm(f.(x))*K
A[1:N,N+1:end]=im*diagm(f.(x))*K
A[N+1:end,1:N]=im*diagm(f.(x))*K*L-im*diagm(ddf.(x))*K
A[N+1:end,N+1:end]=-im*ar*diagm(f.(x))*K*L+im*ar*diagm(ddf.(x))*K
return eigsolve(x->B\(A*x),rand(ComplexF64,2*N),1,:LR,maxiter=10000)

end


Both are much faster: around 5x faster in both cases for the numbers I gave previously. However, the time to converge for the version using the iterative method is still much longer than the “brute force” method.

By shift-and-invert, do you mean find the eigenvalues of (C-\mu I)^{-1} rather than C, to spread out the closely packed eigenvalues? If I just choose \mu=0, would that be a good idea? In this example, would I just use

x-> A \(B*x)


in the argument of eigsolve? Thanks very much again.
Edit: Of course, I can’t use \mu=0 because this is in fact already an eigenvalue - so some small number between 0 and the true eigenvalue would be best, maybe.

You’re still allocating dense matrices. e.g. diagm(f.(x)) is a dense matrix. Then you multiply by im and allocate another dense matrix. Then you multiply by K and allocate another dense matrix. Then you store this in A[1:N, N+1:end] and finally throw away the zeros. My advice would be to learn how to use constructors like sparse and spdiagm directly, rather than allocating and then assigning into an array. (Similarly, you could assign B using a single B = Tridiagonal(...) call simply by passing the right vectors, rather than allocating, converting into a complex array (making yet another copy), and then assigning into it.)

No, because your matrix has zero eigenvalues that that would (a) be non-invertible and (b) cause the iterative solver to favor the eigenvalues closest to zero. You can only use shift-and-invert if you have an estimate μ that is closer to your desired eigenvalue than to any other eigenvalue. e.g. if you know an upper bound for Re(λ).

Another thing you can try is to play with the Arnoldi parameters, e.g. you can try to increase krylovdim to see if that speeds convergence.

OK, thanks for this advice - I replaced the relevant parts of the code with this

    ld=2.0*(dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
ud=2.0./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
d=-2.0*(1.0.+dx[2:end]./dx[1:end-1])./(dx[2:end].^2+dx[1:end-1].*dx[2:end])
d=d+2.0*((dx/dx)/(1+K*dx))/(dx^2+dx*dx)
d[end]=d[end]+2.0/(1+K*dx[N+1])/(dx[N+1]^2+dx[N]*dx[N+1])

A=spzeros(ComplexF64,2*N,2*N)
B_ud=vcat(zeros(ComplexF64,N),ud[1:end-1])
B_ld=vcat(zeros(ComplexF64,N),ld[2:end])
B_d=vcat(ones(ComplexF64,N),d.-K^2)
B=Tridiagonal(B_ld,B_d,B_ud)
I_11=vcat(2:N,1:N,1:N-1)
J_11=vcat(1:N-1,1:N,2:N)
A_11=vcat(ld[2:N]/S,(d.-K^2)/S -im*ar*K*f.(x),ud[1:N-1]/S)
I_12=1:N
J_12=N+1:2*N
A_12=im*K*f.(x)
I_21=vcat(N+2:2*N,N+1:2*N,N+1:2*N-1)
J_21=vcat(1:N-1,1:N,2:N)
A_21=vcat(im*K*f.(x[2:N]).*ld[2:N],im*K*f.(x).*(d.-K^2)-im*K*ddf.(x),im*K*f.(x[1:N-1]).*ud[1:N-1])
I_22=vcat(N+2:2*N,N+1:2*N,N+1:2*N-1)
J_22=vcat(N+1:2*N-1,N+1:2*N,N+2:2*N)
A_22=vcat(-im*ar*K*f.(x[2:N]).*ld[2:N],-im*ar*K*f.(x).*(d.-K^2)+im*ar*K*ddf.(x),-im*ar*K*f.(x[1:N-1]).*ud[1:N-1])
Imap=vcat(I_11,I_12,I_21,I_22)
Jmap=vcat(J_11,J_12,J_21,J_22)
Amap=vcat(A_11,A_12,A_21,A_22)
A=dropzeros(sparse(Imap,Jmap,Amap))


without much difference in performance: but such things will definitely be very important for some future projects.

In general (i.e. for any values of the parameters that might be interesting), a (very loose) upper bound is Re(\lambda)<1. I tried using this, replacing the eigsolve as in the previous post with

eigsolve(x->(A-B)\(B*x),rand(ComplexF64,2*N),1,:LR,maxiter=10000)


which converges extremely quickly but doesnt seem to pick out the right eigenvalue 1/(\lambda -1). Am I missing something obvious? Thanks again.

EDIT: I was indeed missing something obvious. I should have changed to :SR in the arguments of the eigsolve. It appears to work really well, so thanks a lot!

Since the eigenvalues of (A-B)^{-1} B are 1/(\lambda - 1), the \lambda with the largest real part (but still < 1) will now become the 1/(\lambda - 1) with the smallest (most negative) real part, so now you should pass :SR instead of :LR.

(It may not help much though, if you are trying to distinguish an eigenvalue of 0.02 from an eigenvalue of nearly 0, since those map to -1.0204 and -1, respectively, which are still quite close. Worth a shot anyway.)

Thanks - I found this and was editing my post above. This is now extremely fast to converge, so problem solved!

Update, in case anyone is interested: a better first guess for the eigenvalue is S^{-1/2}, which helps it converge even faster. Onwards and upwards…