Acceleration of Intel MKL on AMD Ryzen CPUs

Some of us use Intel MKL with Julia for improved performance.
Intel MKL is composed of several code paths for different CPU features (SSE2, SSE4, AVX2, AVX512, etc.).
One of the issues with MKL is that it discriminates against non-Intel CPUs and uses the generic SSE2 code path even on AVX2-capable CPUs.

This specifically hurts the Ryzen 3xxx series, which has better AVX2 performance than Intel's comparable CPUs.

It seems people found a way around it: by defining a system / environment variable, users can force Intel MKL to use the AVX2 code path and skip the CPU dispatching mechanism.

One could read about it:

Though the above targets MATLAB, I think it should work with Julia + MKL.

In Windows it requires a batch launcher along the lines of:

@echo off
set MKL_DEBUG_CPU_TYPE=5


It seems MKL_DEBUG_CPU_TYPE=5 selects the AVX2-capable CPU code path.
Instead of launching MATLAB, one should launch Julia.
The same should hold on other OSes.
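On Linux / macOS, a minimal sketch of an equivalent launcher script might look like this (it assumes `julia` is on your PATH, and an MKL version that still honors the variable, i.e. older than 2020 Update 1):

```shell
#!/bin/sh
# Force MKL's AVX2 code path on non-Intel CPUs
# (only honored by MKL versions before 2020 Update 1).
export MKL_DEBUG_CPU_TYPE=5

# Launch Julia, passing any arguments through.
# `julia` being on PATH is an assumption; adjust to your install.
exec julia "$@"
```

The variable must be set before MKL is loaded, which is why it goes in a launcher rather than being set from inside a running session.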

I wonder if one could integrate this trick into Juno (on Julia Pro, for that matter).


How to Set the Environment Variable in Juno

In order to set the Environment Variable in Juno one could do:

  1. Create a Launcher for Juno
    One could create a script or batch file to launch Juno and set the variable. For instance, in Windows, see the launcher defined in Guide: How to Create a Portable Julia Pro Installation for Windows.
  2. Edit the Init File of Juno
    • Open the Command Pane (Ctrl + Shift + p).
    • Type Init Script and choose: Application: Open Your Init Script.
    • A file named init.js will be opened. Add process.env["MKL_DEBUG_CPU_TYPE"] = "5" on its last line.
    • Save and restart Juno.
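For reference, the init-script addition from step 2 is just this one line:

```javascript
// Appended at the end of Atom's init script (init.js).
// Juno starts Julia as a child process, so the Julia process
// should inherit this environment variable.
process.env["MKL_DEBUG_CPU_TYPE"] = "5";
```
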

Admitting my ignorance here: I thought one had to compile and link Julia from source in order to use Intel MKL.
I am sure an expert will be along soon to correct me…



"Intel MKL has been known to use SSE code paths on AMD CPUs that support newer SIMD instructions, such as those that use the Zen microarchitecture. A (by now) well-known trick has been to set the MKL_DEBUG_CPU_TYPE environment variable to the value 5 to force the use of AVX2 kernels on AMD Zen CPUs. Unfortunately, this variable has been removed from Intel MKL 2020 Update 1 and later. This can be confirmed easily by running a program that uses MKL with ltrace -e getenv."
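The ltrace check mentioned in the excerpt looks something like the following; `./some_mkl_program` is a placeholder for any binary linked against MKL:

```shell
# Trace getenv() calls made by an MKL-linked binary.
# If the MKL build still honors the trick, a
# getenv("MKL_DEBUG_CPU_TYPE") call shows up in the trace;
# on MKL 2020 Update 1 and later it does not.
ltrace -e getenv ./some_mkl_program
```
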

read more:

  • “Good news: Intel seems to be adding Zen kernels”
  • “Bad news: sgemm is not yet implemented”
  • “A temporary workaround”

Hacker News discussion:


wow, more active hostility from Intel, who would have thought. /s

edit: nvm, the tone fooled me. But again, removing the option before everything is implemented in BLAS for Zen is still bad, since it removes an *existing* solution.

Note that the rest of the article shows that they removed it because they added Zen-specific code paths that are as fast or faster.


Have there been any developments on this lately? I’m asking because I’m considering buying a Ryzen computer, but since Julia is such a big part of my work I won’t do it if I know the performance is going to be worse than an Intel one.


Yes. LoopVectorization.jl, TriangularSolve.jl, RecursiveFactorization.jl, and Octavian.jl are all very optimized on my Ryzen 5950X and outperform MKL on it. So that's what SciML defaults to under the hood now. Since the pure-Julia BLAS tools are good enough, this issue is effectively nullified. (Note that they do not yet have full coverage of BLAS/LAPACK, though.)
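A minimal sketch of the pure-Julia route, assuming Octavian.jl is installed (`Octavian.matmul` is its GEMM entry point):

```julia
using Octavian, LinearAlgebra

A = rand(500, 500)
B = rand(500, 500)

# Pure-Julia matrix multiply: no MKL involved, so no
# CPU-vendor dispatching issues on AMD Zen.
C = Octavian.matmul(A, B)

@assert C ≈ A * B   # agrees with the default BLAS result
```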


If you’re using Linux you may trick MKL into taking the Intel code path on your Ryzen, which is the best you’ll be able to get from MKL.
I’m pretty sure that on newer MKL versions the discrimination will stop, probably fixed by Intel itself.

Anyhow, buy the Ryzen. Nothing from Intel will beat the Ryzen 5950X / Ryzen 5900X unless you go Intel HEDT.