OpenBLAS is faster than Intel MKL on AMD Hardware (Ryzen)

That might change with the new 7nm Ryzen, but

julia> using LinearAlgebra

julia> BLAS.set_num_threads(Sys.CPU_THREADS >> 1)

julia> LinearAlgebra.peakflops(16000)
1.9732209060275942e12

julia> LinearAlgebra.peakflops(16000)
1.9726505538599666e12

julia> versioninfo()
Julia Version 1.4.0-DEV.0
Commit 2ef0ed159d* (2019-08-17 17:21 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.0 (ORCJIT, skylake)

is going to be well ahead of a Ryzen 2990WX. Similarly, the 7900X (10 core, with avx512) was much faster for BLAS/LAPACK than the Ryzen 1950X (16 core, half-rate avx2).
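
For anyone who wants to compare machines by hand, here is a minimal sketch of a `peakflops`-style measurement: time a dense Float64 matrix multiply and divide the roughly 2n^3 floating point operations by the elapsed time. The function name `matmul_flops` and the `trials` keyword are made up for illustration, and halving `Sys.CPU_THREADS` assumes 2-way SMT (one BLAS thread per physical core).

```julia
using LinearAlgebra

# Hypothetical helper: estimate double-precision matmul throughput in flop/s.
function matmul_flops(n::Integer; trials::Integer = 3)
    A = rand(n, n)
    B = rand(n, n)
    C = similar(A)
    mul!(C, A, B)                                   # warm-up (compilation, BLAS thread start-up)
    t = minimum(@elapsed(mul!(C, A, B)) for _ in 1:trials)
    return 2 * Float64(n)^3 / t                     # an n×n matmul costs about 2n^3 flops
end

# Assumption: 2 hardware threads per core, so use half of them (at least 1).
BLAS.set_num_threads(max(1, Sys.CPU_THREADS >> 1))

matmul_flops(2000)
```

Smaller `n` tends to understate the peak, so use a size that comfortably exceeds the caches (the run above used 16000) when comparing chips.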

The new 7 nm Ryzen has full-rate avx2. I would bet on them over the Intel chips without avx512.
As for the HEDT parts with avx-512, I’ll wait to see the new Threadrippers. If AMD offers twice the cores per $, it’ll be hard to beat them there (and then AMD would be much faster for all non-avx workloads).
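
To make the half-rate / full-rate / avx512 distinction concrete, here is a back-of-the-envelope sketch of theoretical double-precision peak: cores × clock × flops per cycle. The per-core figures are assumptions on my part (two FMA issues per cycle for avx512 on the i9 and for full-rate avx2, one effective 256-bit FMA per cycle for half-rate avx2), and the helper names are made up for illustration.

```julia
# DP flops per cycle per core = vector lanes × 2 (an FMA is a multiply plus an add) × FMAs issued per cycle
flops_per_cycle(lanes, fma_per_cycle) = lanes * 2 * fma_per_cycle

avx512    = flops_per_cycle(8, 2)   # 8 doubles per 512-bit register, assumed 2 FMA units
avx2_full = flops_per_cycle(4, 2)   # 4 doubles per 256-bit register, assumed 2 FMAs per cycle
avx2_half = flops_per_cycle(4, 1)   # assumed 1 effective 256-bit FMA per cycle

# Theoretical peak in flop/s for a given core count and sustained clock (GHz).
peak(cores, ghz, fpc) = cores * ghz * 1e9 * fpc

# Sustained clock implied by the 18-core i9-7980XE measurement above,
# if it were running at exactly the assumed avx512 rate.
implied_ghz = 1.9732209060275942e12 / (18 * avx512) / 1e9

# At equal clocks and core counts, the assumed per-core rates differ by these factors.
avx512 / avx2_full, avx512 / avx2_half
```

Under these assumptions a half-rate avx2 chip needs four times the cores of an avx512 chip at the same clock to match it on matrix multiplication, which is why core count alone does not settle the comparison.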

For the top of the line server parts, I believe the 64 core EPYC costs less money than the 28 core Xeon. So while they might be close for extremely vectorizable code that can leverage avx512 like matrix multiplication…
