svdvals is alarmingly slow

@simonbyrne I had a chance to try it, and the results are not good:

julia> @btime svdvals($B);
  1.371 ms (11 allocations: 138.20 KiB)

julia> versioninfo()
Julia Version 0.6.2
Commit d386e40c17* (2017-12-13 18:08 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

This is the same machine on which Julia, running in a Linux VirtualBox VM, takes 673 μs.

One more thing, on Linux I build Julia from source, while on Windows I’m using the precompiled binary. I don’t expect this to cause much difference (but you never know).
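
For anyone who wants to repeat the comparison on two setups, here is a minimal sketch that fixes the inputs and the BLAS thread count before timing. It assumes Julia 1.x, where `svdvals` and `BLAS` live in the `LinearAlgebra` standard library. The helper name `time_svdvals`, the seed, and the run count are my own illustrative choices, not something used earlier in this thread.

```julia
# Assumption: Julia 1.x. On 0.6 these functions were reachable from Base
# without the `using` line below.
using LinearAlgebra, Random

# Hypothetical helper: minimum wall-clock time (in seconds) of svdvals(B)
# over `runs` repetitions, after one warm-up call.
function time_svdvals(B; runs = 200)
    svdvals(B)                                   # warm-up, so compilation is not timed
    return minimum(@elapsed(svdvals(B)) for _ in 1:runs)
end

Random.seed!(1234)           # same seed gives the same matrix on the same Julia version
B = randn(100, 100)

# Pin the BLAS thread count so both setups run the same configuration.
BLAS.set_num_threads(1)

t = time_svdvals(B)
println("svdvals on 100x100: ", round(t * 1e6, digits = 1), " us")
```

Taking the minimum over many runs roughly mirrors what `@btime` reports. Changing the argument to `BLAS.set_num_threads` and rerunning is a cheap way to see whether threading plays a role on a given machine.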

Thanks: so slower, but not alarmingly so.

Anyway, thanks to some detective work from Andreas and the OpenBLAS devs, we think we might have figured out the root problem: https://github.com/xianyi/OpenBLAS/issues/1470#issuecomment-368140401

Wow, it kind of blows my mind that the OpenBLAS repo is under the github of just-some-guy, and not like a collaboration or something. Poor guy.

… and more worrying, one that

but xianyi has not been seen here for quite some time, and I still believe only he can do releases

I heard he is devoted to PerfBLAS now
:grinning:

Oh man, that’s horrifying but also absolutely hilarious :laughing:.

I really, really don’t understand why Intel won’t just open source MKL. What is the downside for them?

I thought that the Julia devs forked one of the BLAS libraries and had their own customized one, or something like that? Am I thinking of the wrong library?

Some people buy Intel to take advantage of their software.

So the downside is that if MKL becomes open source, nothing would stop it from having incredible performance on AMD hardware – just like it does on Intel.

Worth pointing out that Intel’s Clear Linux distribution was the fastest distribution on Ryzen processors according to a variety of benchmarks.

Proprietary MKL and Intel’s compilers definitely aren’t slow on AMD, but they’re not the fastest either.

Julia uses a patched LLVM.

Well, I believe that Intel CPUs are faster than AMD CPUs not only because of the hardware but also because of the software ecosystem.
Years of discriminating optimizations (well, at some point AMD wasn’t in the game, so it is logical).
Intel MKL, Intel IPP, Intel SVML, and the Intel compiler have a lot to do with it.

As you can see above, I think that for SVD there is no reason, CPU-wise, for Ryzen not to be faster than Intel’s price-comparable CPU.
Yet it is (far?) behind, just because MKL is not optimized for Ryzen (by “not optimized” I mean that it will choose the SSE2 code path instead of AVX2).

Having libraries as fast as Intel’s, yet without CPU discrimination, would be a revolution.

Perhaps you were thinking of OpenLibm?

Quite possibly. Actually, I would have thought that forking BLAS would be a more generally useful thing to do than forking libm, not that I know what I’m talking about. I’m surprised that the GNU version of libm is not sufficient. Does Julia compile with OpenLibm by default?

julia> using BenchmarkTools

julia> B=randn(100,100);

julia> @btime svdvals($B);
  1.325 ms (9 allocations: 138.17 KiB)

julia> versioninfo()
Julia Version 0.7.0-DEV.4389
Commit d6edb86bc0* (2018-02-25 21:29 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)
Environment: