Some eigenpairs from a large, sparse, nonsymmetric matrix: Julia vs Matlab

Thanks for the detailed explanation of how to implement the shift-invert operation (which I needed!). My code for implementing this is in the file rerun_KrylovKit.jl, shown below:

using KrylovKit: eigsolve
using LinearAlgebra

PQ = P*Q  # P and Q are the sparse matrices defined earlier in the thread

function rerun_KrylovKit(n=1, PQ=PQ)
    Threads.nthreads() == n || error("Restart Julia with -t$n")
    BLAS.set_num_threads(n)
    println("KrylovKit  $n Threads:")
    @time begin
        # Shift-invert: factor (PQ - ev*I) once; each application of the
        # linear map is then just a forward/back solve. The shift ev and
        # the mode count NMODES are globals defined earlier in the thread.
        LUfactors = lu(PQ - ev*I)
        fun(x) = LUfactors \ x
        D2, Exyvecs, info = eigsolve(fun, rand(eltype(PQ), size(PQ,2)), NMODES, :LM; tol=1e-14)
    end
    info.converged ≥ NMODES || @warn("$(info.converged) converged modes, but $(NMODES) requested")
    D2 = inv.(D2[1:NMODES]) .+ ev  # undo the shift-invert transform
    @show D2
    print(info)
end
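
For reference, the back-transformation at the end of rerun_KrylovKit follows from the shift-invert relation: if μ is an eigenvalue of (PQ - ev*I)^-1 with eigenvector v, then λ = 1/μ + ev is the corresponding eigenvalue of PQ with the same eigenvector, which is exactly what inv.(D2) .+ ev computes. Requesting :LM (largest magnitude) on the transformed operator therefore targets the eigenvalues of PQ closest to the shift ev.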

Running this on single-threaded Julia produces the following output:

julia> rerun_KrylovKit(1)
KrylovKit  1 Threads:
  2.402129 seconds (92.02 k allocations: 423.060 MiB, 0.77% gc time, 3.03% compilation time)
D2 = ComplexF64[-11.41757523508308 - 0.0im, -11.34764322859855 - 0.0im, -11.175322773264579 - 0.0im]
ConvergenceInfo: 5 converged values after 2 iterations and 42 applications of the linear map;
norms of residuals are given by (4.651212075017882e-28, 1.1311489902029288e-25, 1.1303118924642023e-20, 6.12279637828794e-19, 3.090439836645266e-16).
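
As a quick sanity check, the back-transformed pairs can be verified against the original matrix directly, since the eigenvectors of the shift-inverted operator are also eigenvectors of PQ. A standalone sketch (reuses PQ, ev, and NMODES from above):

using KrylovKit, LinearAlgebra
F = lu(PQ - ev*I)
μs, vs, _ = eigsolve(x -> F \ x, rand(eltype(PQ), size(PQ, 2)), NMODES, :LM; tol=1e-14)
for (μ, v) in zip(μs[1:NMODES], vs[1:NMODES])
    λ = inv(μ) + ev
    @show norm(PQ*v - λ*v)  # near machine precision for converged pairs
end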

The results are much improved over my initial try, with convergence achieved and execution time competitive with Arpack. I also tried running it with 8 threads and with MKLSparse:

julia> using MKL, MKLSparse

julia> rerun_KrylovKit(8)
KrylovKit  8 Threads:
  2.364362 seconds (1.49 k allocations: 418.421 MiB, 0.80% gc time)
D2 = ComplexF64[-11.41757523508308 - 0.0im, -11.34764322859856 - 0.0im, -11.175322773264574 - 0.0im]
ConvergenceInfo: 5 converged values after 2 iterations and 42 applications of the linear map;
norms of residuals are given by (4.367435621113895e-28, 1.0519282546009974e-25, 1.0801634314205764e-20, 5.526478130311956e-19, 2.822258996060205e-16).

Almost no difference in run time; the difference is within the variation seen across multiple runs.
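
That result is consistent with the code: inside eigsolve the linear map only does triangular solves with the precomputed LU factors and never multiplies by PQ, so there is little sparse matrix-vector work for MKLSparse to accelerate. A minimal sketch for checking where the time goes (assumes PQ and ev as above; BenchmarkTools is an extra dependency):

using BenchmarkTools, LinearAlgebra
F = lu(PQ - ev*I)            # same factorization as in rerun_KrylovKit
b = rand(eltype(PQ), size(PQ, 2))
@btime $F \ $b               # what each Krylov iteration actually spends time on
@btime $PQ * $b              # the sparse mat-vec kernel that MKLSparse targets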

I went through a similar exercise with ArnoldiMethod, using the LinearMap wrapper shown in the package’s documentation. The contents of rerun_ArnoldiMethod.jl are shown below:

using ArnoldiMethod, LinearAlgebra, LinearMaps

PQ = P*Q

# Factorize (A - ev*I) once and build a linear map that applies its
# inverse to a vector, i.e. the shift-invert operator.
function construct_linear_map(A)
    F = factorize(A - ev*I)  # ev is the same shift as above
    LinearMap{eltype(A)}((y, x) -> ldiv!(y, F, x), size(A,1), ismutating=true)
end

function rerun_ArnoldiMethod(n=1, PQ=PQ)
    Threads.nthreads() == n || error("Restart Julia with -t$n")
    BLAS.set_num_threads(n)
    println("ArnoldiMethod $n Threads:")
    @time begin
        decomp, history = partialschur(construct_linear_map(PQ), nev=NMODES, tol=1e-14, restarts=300, which=LM())
        D2, Exy = partialeigen(decomp)
    end
    println(history)
    D2 = inv.(D2) .+ ev
    @show D2
end
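
One difference from the KrylovKit run worth noting: the converged eigenvalues come back in whatever order the partial Schur decomposition produces, not necessarily sorted by distance to the shift (the orderings in the KrylovKit and ArnoldiMethod outputs here are in fact reversed). If a particular order matters, the back-transformed pairs can be sorted explicitly; a small sketch that could go at the end of rerun_ArnoldiMethod, after the back-transformation:

p = sortperm(D2; by = λ -> abs(λ - ev))  # closest to the shift first
D2, Exy = D2[p], Exy[:, p]               # partialeigen returns vectors as columns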

Running this on a single thread produces the following output:

julia> include("rerun_ArnoldiMethod.jl")
rerun_ArnoldiMethod (generic function with 3 methods)

julia> rerun_ArnoldiMethod(1)
ArnoldiMethod 1 Threads:
  2.216459 seconds (1.69 M allocations: 447.162 MiB, 0.94% gc time, 10.12% compilation time)
Converged: 3 of 3 eigenvalues in 38 matrix-vector products
D2 = [-11.175322773264579, -11.347643228598555, -11.41757523508308]
3-element Vector{Float64}:
 -11.175322773264579
 -11.347643228598555
 -11.41757523508308

and with 8 threads (restart of Julia with 8 threads not shown):

julia> using MKL, MKLSparse

julia> rerun_ArnoldiMethod(8)
ArnoldiMethod 8 Threads:
  1.997144 seconds (240 allocations: 367.015 MiB, 0.76% gc time)
Converged: 3 of 3 eigenvalues in 38 matrix-vector products
D2 = [-11.17532277326458, -11.347643228598553, -11.41757523508308]
3-element Vector{Float64}:
 -11.17532277326458
 -11.347643228598553
 -11.41757523508308

So the 8-threaded run with MKLSparse is about 10% faster than the single-threaded one.

Here is an updated summary of the timings with and without multi-threading. “With” also includes use of MKL and MKLSparse for the Julia results.

Package        Function                      1 Thread Time (s)   8 Threads Time (s)
Matlab         eigs                          1.0                 1.0
Arpack         eigs                          2.6                 2.8
KrylovKit      eigsolve                      2.4                 2.4
ArnoldiMethod  partialschur + partialeigen   2.2                 2.0

I only reported the times to the nearest tenth of a second because they vary by almost that much from run to run (possibly because of the random starting vectors?). Only the ArnoldiMethod routines show even a modest reduction in execution time from threading and MKLSparse; Arpack actually got slightly worse. So far the shortest time I can get from Julia is with ArnoldiMethod, which takes about twice as long as eigs in Matlab.
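
If the run-to-run variation really does come from the random starting vectors, fixing the seed (or passing the same explicit starting vector) should make repeated timings comparable; a minimal sketch, assuming PQ, ev, and NMODES as above:

using Random, LinearAlgebra
using KrylovKit: eigsolve
Random.seed!(1234)                  # reproducible starting vector
x0 = rand(eltype(PQ), size(PQ, 2))
F = lu(PQ - ev*I)
@time eigsolve(x -> F \ x, x0, NMODES, :LM; tol=1e-14)

With the start fixed, any remaining timing differences would reflect the machine rather than the iteration count.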