How to mitigate bad performance of eigvals from LinearAlgebra

Paradox · December 31, 2023, 4:44pm

Dear all,

I am trying to rewrite my Matlab code that calculates certain band structure to Julia. Unfortunately, I am unable to get close to the performance of Matlab code for some reason that I don’t understand. I will be very thankful if you help me.

My Julia code is the following:

#assert Threads.nthreads() == 8
using BenchmarkTools
using FFTW
using MKL
using LinearAlgebra
LinearAlgebra.BLAS.set_num_threads(1);
FFTW.set_provider!("mkl")

function Hamiltonian(nk, a, v)
    
    nx = length(v);
    E = zeros(ComplexF64, nk);
    K = 2*π*collect(fftshift(fftfreq(nx, nx)))
    Hf = zeros(ComplexF64, nx, nx);
    k = collect(range(-π, π, nk));

    fft!(v)
    fftshift!(v, v/length(v))
    
    for j=1:nx, i=1:nx
        idx = i - j + floor(Int64, nx/2 + 1)
        if 0 < idx <= nx
            @inbounds Hf[i, j] = v[idx];
        end
    end
    Threads.@threads for i in eachindex(k)
        e = eigvals(Hermitian(Diagonal(-0.5.*abs.(k[i] .+ K).^a) .+ Hf));
        sort!(e, lt = (x, y) -> abs(real(x)) < abs(real(y)));
        @inbounds E[i] = e[1];
    end

    return E
    
end

x = range(0, 1, 1001);
@btime H = Hamiltonian(100, 1.5, complex(100*cos.(2*pi*x).*sinh.(-x)));

Julia version is 1.10. I am using MKL explicitly because Matlab uses MKL on intel processor and it does give better performance on my system. I have turned off auto parallelism of MKL because parallelism is already achieved by Threads.@threads on outer for loop, and again it gives better performance. I have also tried @distrbuted and got roughly the same results. This code is around 50% slower than its Matlab 2023a counter part:

function e = Hamiltonian(nk, a, p)

p = fftshift(fft(p))/length(p);
k = linspace(-pi, pi, nk);
L = length(p);
m = fix(L/2);
M = -m:m;
K = 2*pi*M;
Hf = zeros(L, L);

[i, j] = meshgrid(1:L, 1:L);
mask = round(i - j + floor(L/2));
Hf(mask >= 0 & mask < L) = p(mask(mask >= 0 & mask < L)+1);

e = zeros(1, nk);
parfor i=1:nk
    H = diag(-0.5*abs(k(i) + K).^a) + Hf;
    E = eig(H, 'vector');
    [~, m] = sort(abs(real(E)));
    E = E(m);
    e(i) = real(E(1));
end

tic
x = linspace(0, 1, 1001);
e = Hamiltonian(100, 1.5, 100*cos(2*pi*x).*sinh(-x));
toc

The performance of these two are roughly the same for smaller matrices <500x500 size. But as matrix matrix becomes larger, I see huger gap. In general the potential term p,v (third argument of Hamiltonian) is Complex, so I use fft not rfft. Also I know performance wise the whole function is irrelevant, only the part eigenvalues are calculated causes bad performance. Moreover these two functions are completely equivalent (i.e eig solver in both cases sees the same matrix and produce same results).

quantumtwist · January 1, 2024, 12:08am

Hmm, quite confusing! Virtually all the time is spent in diagonalization, so perhaps this could be an MKL versioning issue?

Could you check if the problem persists with a more minimal example such as the following:


function FakeHamiltonian(Nx,Nk)
    ks = 1:Nk
    E = zeros(Float64,Nk)
    Threads.@threads for i in eachindex(ks)
        H = rand(ComplexF64,Nx,Nx)
        e = eigvals(Hermitian(H))
        sort!(e, by=abs)
        E[i] = e[1];
    end
    return E
end

Nx = 1001
Nk = 32
@benchmark ParallelFakeHamiltonian($Nx,$Nk)

Benny · January 1, 2024, 12:24am

No idea if this would affect timings, but a BenchmarkTools practice is to interpolate untyped globals like 2*pi*$x and sinh.(-$x) into the benchmark loop made by the macro, in order to avoid propagating type inference problems that a normal call wouldn’t.

Paradox · January 1, 2024, 1:11am

so perhaps this could be an MKL versioning issue?

Right on the mark! Even though it was a fresh installation of Julia 1.10 along its packages, for some baffling reason, MKL version was 0.3.0. (Currently it is 0.6.1) and couldn’t be updated, until I removed everything and installed them again. Who would’ve thought that an old version of MKL would make such huge performance gap!

As for minimal example, I couldn’t use random numbers, because in that case the random numbers seen by Matlab would be different from Julia, and eig solvers would’ve not been exactly comparable. At any rate, thanks.

Topic		Replies	Views
Unusually bad performance of eigen() compared to eig() in Matlab Performance question	16	3136	July 17, 2019
Julia versus MATLAB Performance matlab , linearalgebra	5	2102	April 12, 2022
Performance Issues - Rayleigh Compressible Performance question	3	669	August 23, 2020
Julia is slower than MATLAB at diagonalizing matrices New to Julia question , matlab , linearalgebra , matrices , eigenvalues	41	4404	March 30, 2022
Julia is slower than matlab when it comes to matrix diagonalization? General Usage question , eigs , eigenvalue-problem	39	1697	January 14, 2024

How to mitigate bad performance of eigvals from LinearAlgebra

Related topics