Eigvals faster for ComplexF64 matrices than Float64

I’ve noticed that when diagonalising real symmetric matrices, eigvals may perform faster if the input matrix is complex, i.e. Matrix{ComplexF64} rather than Matrix{Float64}. Here is my test code:

using LinearAlgebra, BenchmarkTools

n = 50             # matrix dimension
F = rand(n, n)     # a random real Float64 matrix
F += F'            # make `F` symmetric
C = ComplexF64.(F) # a copy of `F` stored as a `Matrix{ComplexF64}`

@benchmark eigvals($F)
@benchmark eigvals($C)

For n = 50, the complex matrix is diagonalised ~5 times faster than the real one:

For n = 230, both calculations take the same amount of time, and for larger matrices the complex calculation becomes slower than the real one, as expected.

Why does in the case of small matrices eigvals perform faster for complex matrices?

versioninfo()
Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 Γ— Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_NUM_THREADS = 8
  JULIA_EDITOR = code
3 Likes

I can’t replicate it on my system:

julia> @benchmark eigvals($F)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  114.435 ΞΌs … 963.658 ΞΌs  β”Š GC (min … max): 0.00% … 81.65%
 Time  (median):     117.048 ΞΌs               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   119.006 ΞΌs Β±  23.181 ΞΌs  β”Š GC (mean Β± Οƒ):  0.51% Β±  2.34%

  β–β–„β–†β–ˆβ–ˆβ–ˆβ–‡β–†β–…β–„β–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–β–‚β–β–                                       β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–†β–†β–†β–†β–†β–…β–…β–…β–„β–…β–…β–†β–„β–„β–†β–„β–…β–„β–ƒβ–„β–„β–…β–†β–ƒβ–‚β–…β–„β–„β–… β–ˆ
  114 ΞΌs        Histogram: log(frequency) by time        144 ΞΌs <

 Memory estimate: 38.70 KiB, allocs estimate: 11.

julia> @benchmark eigvals($C)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  133.879 ΞΌs …  1.093 ms  β”Š GC (min … max): 0.00% … 82.28%
 Time  (median):     138.148 ΞΌs              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   145.713 ΞΌs Β± 41.693 ΞΌs  β”Š GC (mean Β± Οƒ):  1.27% Β±  3.95%

  ▁ β–†β–ˆβ–„β–‚β–   β–ƒβ–…β–‚   ▁▁▁▁▁       ▁▁                               ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–‡β–†β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–‡β–†β–„β–†β–†β–†β–…β–†β–„β–…β–„β–„β–„β–ƒβ–…β–„β–…β–†β–†β–…β–…β–„β–ƒβ–ƒ β–ˆ
  134 ΞΌs        Histogram: log(frequency) by time       208 ΞΌs <

 Memory estimate: 119.70 KiB, allocs estimate: 15.

julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 Γ— Intel(R) Core(TM) i5-10500 CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 12 on 12 virtual cores

Maybe you can try profiling the code on your system to see what’s taking up the time.

I could replicate my results on two more Windows 10 machines:

version info β€” Windows #1
Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 12 Γ— 11th Gen Intel(R) Core(TM) i5-11500T @ 1.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, rocketlake)
  Threads: 1 on 12 virtual cores
version info β€” Windows #2
Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 4 Γ— Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 4 on 4 virtual cores
Environment:
  JULIA_NUM_THREADS = 4

However, I do not encounter this issue and get the same results as you do when testing on macOS 10.14.6:

version info β€” macOS
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 4 Γ— Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, broadwell)
  Threads: 1 on 4 virtual cores

I tried profiling the code, but I do not see the source of the problem. I am enclosing the profiling results below.

Profiling eigvals(F) on Windows
Overhead β•Ž [+additional indent] Count File:Line; Function
=========================================================
   β•Ž673 @Base\client.jl:522; _start()
   β•Ž 673 @Base\client.jl:318; exec_options(opts::Base.JLOptions)
   β•Ž  673 @Base\client.jl:404; run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
   β•Ž   673 @Base\essentials.jl:726; invokelatest
   β•Ž    673 @Base\essentials.jl:729; #invokelatest#2
   β•Ž     673 @Base\client.jl:419; (::Base.var"#967#969"{Bool, Bool, Bool})(REPL::Module)
   β•Ž    β•Ž 673 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:355; run_repl(repl::REPL.AbstractREPL, consumer::Any)
   β•Ž    β•Ž  673 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:369; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
   β•Ž    β•Ž   673 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:232; start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
   β•Ž    β•Ž    673 ...uildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:247; repl_backend_loop(backend::REPL.REPLBackend)
   β•Ž    β•Ž     673 ...uildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:151; eval_user_input(ast::Any, backend::REPL.REPLBackend)
   β•Ž    β•Ž    β•Ž 673 @Base\boot.jl:368; eval
   β•Ž    β•Ž    β•Ž  673 ...t\worker\package_win64\build\usr\share\julia\stdlib\v1.8\Profile\src\Profile.jl:27; top-level scope
   β•Ž    β•Ž    β•Ž   673 REPL[21]:1; macro expansion
   β•Ž    β•Ž    β•Ž    673 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:335; eigvals(A::Matrix{Float64})
   β•Ž    β•Ž    β•Ž     673 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:335; #eigvals#101
   β•Ž    β•Ž    β•Ž    β•Ž 5   ...ge_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\LinearAlgebra.jl:377; copy_oftype
   β•Ž    β•Ž    β•Ž    β•Ž  2   @Base\array.jl:346; copyto!
   β•Ž    β•Ž    β•Ž    β•Ž   2   @Base\array.jl:322; copyto!
   β•Ž    β•Ž    β•Ž    β•Ž    2   @Base\array.jl:331; _copyto_impl!(dest::Matrix{Float64}, doffs::Int64, src::Matrix{Float64}, soffs::Int64, n::Int64)
  2β•Ž    β•Ž    β•Ž    β•Ž     2   @Base\array.jl:289; unsafe_copyto!
   β•Ž    β•Ž    β•Ž    β•Ž  3   @Base\array.jl:376; similar
  3β•Ž    β•Ž    β•Ž    β•Ž   3   @Base\boot.jl:461; Array
   β•Ž    β•Ž    β•Ž    β•Ž 668 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:300; eigvals!
   β•Ž    β•Ž    β•Ž    β•Ž  668 ...r\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:301; eigvals!(A::Matrix{Float64}; permute::Bool, scale::Bool, sortby::typeof(LinearAlgebra.eigsortby))
   β•Ž    β•Ž    β•Ž    β•Ž   2   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\generic.jl:1175; issymmetric
  1β•Ž    β•Ž    β•Ž    β•Ž    1   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\generic.jl:1246; ishermitian(A::Matrix{Float64})
   β•Ž    β•Ž    β•Ž    β•Ž    1   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\generic.jl:1249; ishermitian(A::Matrix{Float64})
  1β•Ž    β•Ž    β•Ž    β•Ž     1   @Base\range.jl:883; iterate
   β•Ž    β•Ž    β•Ž    β•Ž   666 ...win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:65; eigvals!
   β•Ž    β•Ž    β•Ž    β•Ž    666 ..._win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:66; #eigvals!#211
664β•Ž    β•Ž    β•Ž    β•Ž     665 ...package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5102; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, vu::Float64, il::Int64, iu::Int64, ab...
   β•Ž    β•Ž    β•Ž    β•Ž     1   ...package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5118; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, vu::Float64, il::Int64, iu::Int64, ab...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base\array.jl:1236; resize!
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   @Base\array.jl:1011; _growend!
Total snapshots: 682. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task
Profiling eigvals(C) on Windows
Overhead β•Ž [+additional indent] Count File:Line; Function
=========================================================
   β•Ž137 @Base\client.jl:522; _start()
   β•Ž 137 @Base\client.jl:318; exec_options(opts::Base.JLOptions)
   β•Ž  137 @Base\client.jl:404; run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
   β•Ž   137 @Base\essentials.jl:726; invokelatest
   β•Ž    137 @Base\essentials.jl:729; #invokelatest#2
   β•Ž     137 @Base\client.jl:419; (::Base.var"#967#969"{Bool, Bool, Bool})(REPL::Module)
   β•Ž    β•Ž 137 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:355; run_repl(repl::REPL.AbstractREPL, consumer::Any)
   β•Ž    β•Ž  137 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:369; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
   β•Ž    β•Ž   137 C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:232; start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
   β•Ž    β•Ž    137 ...uildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:247; repl_backend_loop(backend::REPL.REPLBackend)
   β•Ž    β•Ž     136 ...uildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:151; eval_user_input(ast::Any, backend::REPL.REPLBackend)
   β•Ž    β•Ž    β•Ž 136 @Base\boot.jl:368; eval
   β•Ž    β•Ž    β•Ž  136 ...t\worker\package_win64\build\usr\share\julia\stdlib\v1.8\Profile\src\Profile.jl:27; top-level scope
   β•Ž    β•Ž    β•Ž   136 REPL[27]:1; macro expansion
   β•Ž    β•Ž    β•Ž    136 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:335; eigvals(A::Matrix{ComplexF64})
   β•Ž    β•Ž    β•Ž     136 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:335; #eigvals#101
   β•Ž    β•Ž    β•Ž    β•Ž 12  ...ge_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\LinearAlgebra.jl:377; copy_oftype
   β•Ž    β•Ž    β•Ž    β•Ž  6   @Base\array.jl:346; copyto!
   β•Ž    β•Ž    β•Ž    β•Ž   6   @Base\array.jl:322; copyto!
   β•Ž    β•Ž    β•Ž    β•Ž    6   @Base\array.jl:331; _copyto_impl!(dest::Matrix{ComplexF64}, doffs::Int64, src::Matrix{ComplexF64}, soffs::Int64, n::Int64)
  6β•Ž    β•Ž    β•Ž    β•Ž     6   @Base\array.jl:289; unsafe_copyto!
   β•Ž    β•Ž    β•Ž    β•Ž  6   @Base\array.jl:376; similar
  6β•Ž    β•Ž    β•Ž    β•Ž   6   @Base\boot.jl:461; Array
   β•Ž    β•Ž    β•Ž    β•Ž 124 ...er\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:305; eigvals!
   β•Ž    β•Ž    β•Ž    β•Ž  124 ...r\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:306; eigvals!(A::Matrix{ComplexF64}; permute::Bool, scale::Bool, sortby::typeof(LinearAlgebra.eigsortby))
   β•Ž    β•Ž    β•Ž    β•Ž   1   ...r\package_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\eigen.jl:140; sorteig!
   β•Ž    β•Ž    β•Ž    β•Ž    1   @Base\sort.jl:704; sort!##kw
   β•Ž    β•Ž    β•Ž    β•Ž     1   @Base\sort.jl:722; #sort!#8
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base\sort.jl:661; sort!
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   @Base\sort.jl:572; sort!(v::Vector{Float64}, lo::Int64, hi::Int64, a::Base.Sort.QuickSortAlg, o::Base.Order.By{typeof(LinearAlgebra....
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base\sort.jl:556; partition!(v::Vector{Float64}, lo::Int64, hi::Int64, o::Base.Order.By{typeof(LinearAlgebra.eigsortby), Base.Orde...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    1   @Base\ordering.jl:119; lt
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž     1   @Base\ordering.jl:117; lt
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base\float.jl:427; isless
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   @Base\float.jl:421; _fpint
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base\int.jl:366; xor
   β•Ž    β•Ž    β•Ž    β•Ž   123 ..._win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:71; eigvals
   β•Ž    β•Ž    β•Ž    β•Ž    123 ..._win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:74; #eigvals#212
   β•Ž    β•Ž    β•Ž    β•Ž     11  ...kage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetric.jl:284; copy
 11β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 11  @Base\array.jl:369; copy
   β•Ž    β•Ž    β•Ž    β•Ž     112 ...win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:65; eigvals!##kw
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 112 ...win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetriceigen.jl:66; #eigvals!#211
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5245; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float64, vu::Float64, il::Int64, iu::Int6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base\abstractarray.jl:797; similar
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    1   @Base\array.jl:378; similar
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž     1   @Base\boot.jl:468; Array
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base\boot.jl:459; Array
103β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  104 ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5254; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float64, vu::Float64, il::Int64, iu::Int6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5272; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float64, vu::Float64, il::Int64, iu::Int6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base\array.jl:1236; resize!
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    1   @Base\array.jl:1011; _growend!
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  4   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5274; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float64, vu::Float64, il::Int64, iu::Int6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   4   @Base\array.jl:1236; resize!
  4β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    4   @Base\array.jl:1011; _growend!
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  2   ...ackage_win64\build\usr\share\julia\stdlib\v1.8\LinearAlgebra\src\lapack.jl:5276; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float64, vu::Float64, il::Int64, iu::Int6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   2   @Base\array.jl:1236; resize!
  2β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    2   @Base\array.jl:1011; _growend!
  1β•Ž    β•Ž     1   ...uildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.8\REPL\src\REPL.jl:154; eval_user_input(ast::Any, backend::REPL.REPLBackend)
Total snapshots: 150. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task
Profiling eigvals(F) on macOS
Overhead β•Ž [+additional indent] Count File:Line; Function
=========================================================
   β•Ž62  @Base/client.jl:522; _start()
   β•Ž 62  @Base/client.jl:318; exec_options(opts::Base.JLOptions)
   β•Ž  62  @Base/client.jl:404; run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set:...
   β•Ž   62  @Base/essentials.jl:726; invokelatest
   β•Ž    62  @Base/essentials.jl:729; #invokelatest#2
   β•Ž     62  @Base/client.jl:419; (::Base.var"#967#969"{Bool, Bool, Bool})(REPL::Module)
   β•Ž    β•Ž 62  ...ease-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:355; run_repl(repl::REPL.AbstractREPL, consumer::Any)
   β•Ž    β•Ž  62  ...ease-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:369; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
   β•Ž    β•Ž   62  ...ase-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:232; start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
   β•Ž    β•Ž    62  ...ase-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:247; repl_backend_loop(backend::REPL.REPLBackend)
   β•Ž    β•Ž     62  ...se-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:151; eval_user_input(ast::Any, backend::REPL.REPLBackend)
   β•Ž    β•Ž    β•Ž 62  @Base/boot.jl:368; eval
   β•Ž    β•Ž    β•Ž  62  ...ot-8/usr/share/julia/stdlib/v1.8/Profile/src/Profile.jl:27; top-level scope
   β•Ž    β•Ž    β•Ž   62  REPL[20]:1; macro expansion
   β•Ž    β•Ž    β•Ž    62  ...usr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:335; eigvals(A::Matrix{Float64})
   β•Ž    β•Ž    β•Ž     62  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:335; #eigvals#101
   β•Ž    β•Ž    β•Ž    β•Ž 2   .../julia/stdlib/v1.8/LinearAlgebra/src/LinearAlgebra.jl:377; copy_oftype
   β•Ž    β•Ž    β•Ž    β•Ž  2   @Base/array.jl:376; similar
  2β•Ž    β•Ž    β•Ž    β•Ž   2   @Base/boot.jl:461; Array
   β•Ž    β•Ž    β•Ž    β•Ž 60  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:300; eigvals!
   β•Ž    β•Ž    β•Ž    β•Ž  60  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:301; eigvals!(A::Matrix{Float64}; permute::Bool, scale::Bool, sortby::typeof(LinearAlg...
   β•Ž    β•Ž    β•Ž    β•Ž   2   ...share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:1175; issymmetric
   β•Ž    β•Ž    β•Ž    β•Ž    1   ...hare/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:1246; ishermitian(A::Matrix{Float64})
  1β•Ž    β•Ž    β•Ž    β•Ž     1   @Base/float.jl:411; !=
   β•Ž    β•Ž    β•Ž    β•Ž    1   ...hare/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:1249; ishermitian(A::Matrix{Float64})
   β•Ž    β•Ž    β•Ž    β•Ž     1   @Base/range.jl:883; iterate
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base/promotion.jl:477; ==
   β•Ž    β•Ž    β•Ž    β•Ž   58  ...ulia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:65; eigvals!
   β•Ž    β•Ž    β•Ž    β•Ž    58  ...ulia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:66; #eigvals!#211
   β•Ž    β•Ž    β•Ž    β•Ž     1   ...share/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5079; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, v...
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   ...share/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:101; chkuplofinite(A::Matrix{Float64}, uplo::Char)
 50β•Ž    β•Ž    β•Ž    β•Ž     50  ...share/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5102; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, v...
   β•Ž    β•Ž    β•Ž    β•Ž     6   ...share/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5120; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, v...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 6   @Base/array.jl:1236; resize!
  6β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  6   @Base/array.jl:1011; _growend!
  1β•Ž    β•Ž    β•Ž    β•Ž     1   ...share/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5122; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{Float64}, vl::Float64, v...
Total snapshots: 194. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task
Profiling eigvals(C) on macOS
Overhead β•Ž [+additional indent] Count File:Line; Function
=========================================================
   β•Ž93  @Base/client.jl:522; _start()
   β•Ž 93  @Base/client.jl:318; exec_options(opts::Base.JLOptions)
   β•Ž  93  @Base/client.jl:404; run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set:...
   β•Ž   93  @Base/essentials.jl:726; invokelatest
   β•Ž    93  @Base/essentials.jl:729; #invokelatest#2
   β•Ž     93  @Base/client.jl:419; (::Base.var"#967#969"{Bool, Bool, Bool})(REPL::Module)
   β•Ž    β•Ž 93  ...ease-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:355; run_repl(repl::REPL.AbstractREPL, consumer::Any)
   β•Ž    β•Ž  93  ...ease-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:369; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
   β•Ž    β•Ž   93  ...ase-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:232; start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
   β•Ž    β•Ž    93  ...ase-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:247; repl_backend_loop(backend::REPL.REPLBackend)
   β•Ž    β•Ž     93  ...se-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:151; eval_user_input(ast::Any, backend::REPL.REPLBackend)
   β•Ž    β•Ž    β•Ž 93  @Base/boot.jl:368; eval
   β•Ž    β•Ž    β•Ž  93  ...ot-8/usr/share/julia/stdlib/v1.8/Profile/src/Profile.jl:27; top-level scope
  2β•Ž    β•Ž    β•Ž   93  REPL[24]:1; macro expansion
   β•Ž    β•Ž    β•Ž    90  ...usr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:335; eigvals(A::Matrix{ComplexF64})
   β•Ž    β•Ž    β•Ž     90  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:335; #eigvals#101
   β•Ž    β•Ž    β•Ž    β•Ž 5   .../julia/stdlib/v1.8/LinearAlgebra/src/LinearAlgebra.jl:377; copy_oftype
   β•Ž    β•Ž    β•Ž    β•Ž  5   @Base/array.jl:376; similar
  5β•Ž    β•Ž    β•Ž    β•Ž   5   @Base/boot.jl:461; Array
   β•Ž    β•Ž    β•Ž    β•Ž 85  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:305; eigvals!
  1β•Ž    β•Ž    β•Ž    β•Ž  1   ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:305; eigvals!(A::Matrix{ComplexF64}; permute::Bool, scale::Bool, sortby::typeof(Linear...
   β•Ž    β•Ž    β•Ž    β•Ž  84  ...sr/share/julia/stdlib/v1.8/LinearAlgebra/src/eigen.jl:306; eigvals!(A::Matrix{ComplexF64}; permute::Bool, scale::Bool, sortby::typeof(Linear...
  2β•Ž    β•Ž    β•Ž    β•Ž   4   ...share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:1246; ishermitian(A::Matrix{ComplexF64})
  2β•Ž    β•Ž    β•Ž    β•Ž    2   @Base/array.jl:925; getindex
  1β•Ž    β•Ž    β•Ž    β•Ž   1   ...share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:1249; ishermitian(A::Matrix{ComplexF64})
   β•Ž    β•Ž    β•Ž    β•Ž   79  ...ulia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:71; eigvals
   β•Ž    β•Ž    β•Ž    β•Ž    79  ...ulia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:74; #eigvals#212
   β•Ž    β•Ž    β•Ž    β•Ž     3   ...re/julia/stdlib/v1.8/LinearAlgebra/src/symmetric.jl:284; copy
  3β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 3   @Base/array.jl:369; copy
   β•Ž    β•Ž    β•Ž    β•Ž     76  ...lia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:65; eigvals!##kw
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 76  ...lia/stdlib/v1.8/LinearAlgebra/src/symmetriceigen.jl:66; #eigvals!#211
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   ...hare/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5245; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base/abstractarray.jl:797; similar
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    1   @Base/array.jl:378; similar
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž     1   @Base/boot.jl:468; Array
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    β•Ž 1   @Base/boot.jl:459; Array
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  1   ...hare/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5248; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float6...
  1β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   1   @Base/boot.jl:459; Array
 72β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  72  ...hare/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5254; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž  2   ...hare/julia/stdlib/v1.8/LinearAlgebra/src/lapack.jl:5276; syevr!(jobz::Char, range::Char, uplo::Char, A::Matrix{ComplexF64}, vl::Float6...
   β•Ž    β•Ž    β•Ž    β•Ž    β•Ž   2   @Base/array.jl:1236; resize!
  2β•Ž    β•Ž    β•Ž    β•Ž    β•Ž    2   @Base/array.jl:1011; _growend!
Total snapshots: 265. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task

I also tried experimenting with the number of threads, but the results did not change.

That does appear to be a Windows specific issue. I don’t have a Windows system available. I’m reasonably comfortable with LAPACK code, which sometimes handles smaller matrices using less optimized code. I was initially thinking there might be something like that going on that impacted the real case specifically. I don’t see anything obviously like that in LAPACK and, in any event, I’d expect that sort of issue would be more likely to have the same impact on different operating systems.

Since you have verified on several Windows machines I’d suppose that it might already be appropriate to file an issue. This doesn’t seem like something that should happen.

If you want to investigate more deeply: From the profiling, it looks like the time is taken up in syevr!, which is a thin wrapper for setting up a call to the LAPACK subroutines dsyevr and zheevr. So my next guess is that there is some idiosyncratic difference in the LAPACK/BLAS libraries distributed with the Windows version of Julia.

So I would check if the issue persists using MKL.jl. If it goes away it seems likely to be something specific to the default LAPACK/OpenBLAS libraries that are provided with the Windows version of Julia. Maybe this solves your problem, but filing an issue still seems like a helpful thing to do.

To pin it down more, you would probably need to dig deeper into the LAPACK code. Both dsyevr and zheevr do a tridiagonal reduction and then consider various cases to decide how to compute eigenvalues. The RRR algorithm is used where appropriate, but giving the LAPACK code a quick look, I think if you want to get all eigenvalues and none of the eigenvectors, it will call a QR algorithm variant in dsterf. After checking MKL, I’d set up my own calls to dsyevr and zheevr and benchmark those. Assuming the difference persists, I’d then try benchmarking the routines called by dsyevr to see if the difference is in the tridiagonal reduction or the final computation of eigenvalues. LAPACK supports calls that give optimal workspace sizes. It might also be good to look into any differences between Windows and MacOS/Linux on those. I’d also double check that LAPACK thinks that your machine is IEEE-754 compliant. (It uses a different routine for eigenvalues if the answer is β€œno”.) It could be a deep rabbit hole to go down.

1 Like

With MKL.jl this works on Windows as expected:

Indeed, it is probably an issue with the default LAPACK/OpenBLAS shipped with Julia on Windows.

Thank you for the suggestion on how to investigate further. I’ve just opened an issue:
eigvals performs faster for Matrix{ComplexF64} than Matrix{Float64} Β· Issue #47211 Β· JuliaLang/julia (github.com)
With your suggestion, maybe someone will be able to quickly get to the core of the problem.

2 Likes