slow performance of eig

question

#1

Hello,

the performance of computing the eigenvalues and eigenvectors of small complex matrices in Julia seems quite slow to me. If I use the following code

N = 100000
dim = 3
function eigen()
    for i = 1:N
        A = rand(dim, dim) + im * rand(dim, dim)
        eig(A)
    end
end
@time eigen()

it takes about 3 seconds:

3.226431 seconds (5.76 M allocations: 436.489 MiB, 2.34% gc time)

If I use the same code in Python (with numpy), it is slightly slower (4.4 s); however, by using numba I can speed up the loop so that it takes only 0.7 s, which is much faster than Julia.

Is there a workaround or package to enhance the performance for these matrices?

PS: versioninfo in Julia gives the following
Julia Version 0.6.2
Commit d386e40c17* (2017-12-13 18:08 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel® Core™ i5-6500 CPU @ 3.20GHz
WORD_SIZE: 64
BLAS: mkl_rt
LAPACK: mkl_rt
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, skylake)


#2

Most likely you are benchmarking the performance of rand() and not of eig(). Did you try pre-computing the random arrays and applying @time only to the eig() calls?
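For example, one could pre-generate the matrices and time only the decompositions. A sketch (on later Julia versions `eig` became `eigen` in LinearAlgebra, which is what this uses):

```julia
using LinearAlgebra  # provides `eigen`; on Julia 0.6 use the built-in `eig` instead

# Pre-generate the random matrices so the timed loop contains only the solver.
As = [rand(3, 3) + im * rand(3, 3) for _ in 1:1000]

function decompose(As)
    for A in As
        eigen(A)  # `eig(A)` on Julia 0.6
    end
end

decompose(As)        # warm-up call, so compilation is not timed
@time decompose(As)  # now measures only the eigensolver
```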


#4

Can you try BenchmarkTools? It looks to me like you are measuring compile time.


#5

If you want to benchmark eig, you should do that without mixing in the performance of rand, and also avoid using global variables (N and dim are globals), which violates the “first guideline of Julia benchmarking”.

Here’s the “right” way:

using BenchmarkTools
A = rand(Complex{Float64}, 3, 3)
@btime eig($A)

#6

Well, if you define them as const, this should also be fine:
const N = 100000
const dim = 3
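With `const` globals the original timing loop is type-stable. A sketch of the full benchmark in that style (using `eigen`, the later name for 0.6's `eig`, and a smaller N purely for illustration):

```julia
using LinearAlgebra  # provides `eigen`; on Julia 0.6 use `eig` instead

const N = 1000   # smaller than the original 100000, just for illustration
const dim = 3

function bench()
    for i = 1:N
        A = rand(dim, dim) + im * rand(dim, dim)
        eigen(A)  # `eig(A)` on Julia 0.6
    end
end

bench()        # warm up (compile)
@time bench()
```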


#7

Dear All,

thanks a lot for the comments and help. Using BenchmarkTools and defining the matrix before the loop reduces the computation time by about 1 s, but it's still roughly 3 times slower than numpy + numba.

The code I used now is the following:

using BenchmarkTools
const mat = rand(3, 3) + im * rand(3, 3)
@btime for i = 1:100000 eig(mat) end;

The output is then:
2.132 s (3800000 allocations: 289.92 MiB)


#8

You don’t need the loop, since @btime takes care of that for you. You should also interpolate variables into a @btime expression, as in @btime eig($mat).
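Concretely, the two forms look like this (a sketch; requires BenchmarkTools.jl, and uses `eigen`, the later name for 0.6's `eig`):

```julia
using BenchmarkTools, LinearAlgebra

const mat = rand(3, 3) + im * rand(3, 3)

@btime eigen(mat)   # `mat` is accessed as a global inside the benchmark
@btime eigen($mat)  # interpolated: benchmarks only the call itself
```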


#9

Thanks. Sorry, I’m still a beginner in Julia. If I remove the loop I end up with about 20 µs per call, but for 10^5 calls that is still 2 s, so the improvement is small. The main result, that it's much slower than numpy + numba, doesn't change.


#10

Note that this creates several unnecessary temporary arrays. You should do this instead:

rand(Complex{Float64}, 3, 3)

which creates just one array.
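The two constructions give the same element type, but the first one allocates several intermediate arrays along the way (a small sketch):

```julia
# Several arrays: two real rand(3, 3) arrays, the complex
# temporary im * rand(3, 3), and the final sum.
A1 = rand(3, 3) + im * rand(3, 3)

# A single complex array, filled directly.
A2 = rand(Complex{Float64}, 3, 3)

@assert eltype(A1) == eltype(A2) == Complex{Float64}
```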

Yes, the speedups from better benchmarking were not as dramatic as expected. Can you provide the Python code you used for benchmarking? I am seeing a factor-of-2 difference. Perhaps an issue should be opened.

Python:

In [12]: A = np.random.rand(3,3) + 1j*np.random.rand(3,3)  # I don't know how to do this correctly in Python! :/

In [13]: %timeit np.linalg.eig(A)
30.4 µs ± 539 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

@jit
def myeig(A):
    return np.linalg.eig(A)

In [27]: %timeit myeig(A)
10.7 µs ± 25.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Julia

julia> A = rand(Complex{Float64}, 3, 3);

julia> @btime eigen($A);
  20.381 μs (35 allocations: 8.80 KiB)

#11

Are you using Python and Numba with MKL or with OpenBLAS?


#12

Are you asking me or @itzui? I’m not sure; my np.__config__ seems to indicate MKL. It’s whatever ships with Anaconda / Python 3.6.

I guess this is all up to the BLAS library then. But I am surprised how slow this all is. Even for 2×2 matrices it runs in 16 µs, while I would have expected something like 16 ns. There must be tremendous overhead due to BLAS.

So I tried StaticArrays, and to my surprise it was no faster. (In fact, StaticArrays throws an error for non-Hermitian matrices! But even for Hermitian ones it was very slow.)


#13

My code for testing in Python was the following (I know that it is optimized):

from numba import njit, prange
import numpy as np
import time

@njit
def test():
    for i in prange(0, 100000):
        A = np.random.rand(3, 3) + 1j * np.random.rand(3, 3)
        np.linalg.eig(A)
    return

As a result I got:
773 ms ± 8.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

np.__config__ tells me that MKL is used. But this should also be true for Julia, if I interpreted the versioninfo() output (BLAS: mkl_rt) correctly.


#14

Static arrays don’t have optimized implementations of complex Hermitian eigenproblems and fall back to regular eig: https://github.com/JuliaArrays/StaticArrays.jl/blob/master/src/eigen.jl. It possibly shouldn't be too hard to adapt the existing code, though. LAPACK and BLAS libraries are pretty slow for small matrices (newer MKL versions have a special “fast” mode for such operations, but I’m not sure how fast it is).


#15

One of the issues with eigen is that it is type unstable, which is causing overhead for tiny matrices.

julia> @benchmark LinearAlgebra.LAPACK.geevx!('B', 'N', 'V', 'N', copy($A))
BenchmarkTools.Trial: 
  memory estimate:  7.59 KiB
  allocs estimate:  13
  --------------
  minimum time:     6.627 μs (0.00% GC)
  median time:      6.833 μs (0.00% GC)

julia> @benchmark eigen($A)
BenchmarkTools.Trial: 
  memory estimate:  8.98 KiB
  allocs estimate:  38
  --------------
  minimum time:     16.252 μs (0.00% GC)
  median time:      17.885 μs (0.00% GC)

Type instability stems from the fact that it sometimes uses real arithmetic when the eigenvalues and eigenvectors are real, and complex arithmetic otherwise.

Edit: I just realized that the above might look unintuitive when dealing with complex-valued matrices, so here’s an example. When the matrix is Hermitian (A' == A) the eigenvalues are real, and so is the vector that Julia returns; for non-Hermitian matrices they are complex:

julia> eigen(begin
         A = rand(ComplexF64, 3, 3)
         A + A'
       end).values |> typeof
Array{Float64,1}

julia> eigen(rand(ComplexF64, 3, 3)).values |> typeof
Array{Complex{Float64},1}
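One way to sidestep the instability (a sketch, not an official API — `eigen_complex` is a hypothetical helper name) is a small wrapper that always converts the factorization to complex, so the return type no longer depends on the matrix values:

```julia
using LinearAlgebra

# Hypothetical helper: always return complex eigenvalues and eigenvectors,
# regardless of whether the input happens to be Hermitian.
function eigen_complex(A::AbstractMatrix)
    F = eigen(A)
    return complex.(F.values), complex.(F.vectors)
end

A = rand(ComplexF64, 3, 3)
vals, vecs = eigen_complex(A + A')  # Hermitian input, still complex output
```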

#16

Thanks for the comment. Does it mean that for small matrices it is better to test the type of the matrix and call the LAPACK routine directly?