Maximum number of BLAS threads

Hi, consider the following example:

julia> versioninfo()
Julia Version 1.1.1
Commit 55e36cc308 (2019-05-16 04:10 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 24

julia> using LinearAlgebra

julia> BLAS.set_num_threads(24)

julia> ccall((:openblas_get_num_threads64_, Base.libblas_name), Cint, ())
16

julia> BLAS.set_num_threads(12)

julia> ccall((:openblas_get_num_threads64_, Base.libblas_name), Cint, ())
12

It won’t let me get more than 16 threads for BLAS. The computer has 2 CPUs, each of which has 12 cores (24 hyper-threads). Where does the limit of 16 come from?

Thanks!

See the OpenBLAS build script: https://github.com/JuliaLinearAlgebra/OpenBLASBuilder/blob/5a6eca317de505abc3678e47f08f4355646f511e/build_tarballs.jl#L42-L47

IIRC, OpenBLAS allocates memory at initialization for the maximum number of threads it can start, so the limit can’t be set arbitrarily high. I guess 16 was a good choice at the time, but with high core counts becoming more mainstream, perhaps it would make sense to increase it.
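
To see the cap on a given machine, a small sketch like the following may help. It requests several thread counts and reads back what the BLAS library reports. The helper name blas_threads is made up for illustration. On Julia 1.6 and later it calls BLAS.get_num_threads(); on older versions it falls back to the same ccall used in the question, which assumes the bundled 64-bit OpenBLAS.

```julia
using LinearAlgebra

# Hypothetical helper: report the thread count the BLAS library is actually using.
@static if isdefined(BLAS, :get_num_threads)
    # Julia 1.6 and later provide a public getter.
    blas_threads() = BLAS.get_num_threads()
else
    # Older Julia: query OpenBLAS directly, as in the question above.
    # Assumes the bundled OpenBLAS with the 64_ symbol suffix.
    blas_threads() = ccall((:openblas_get_num_threads64_, Base.libblas_name), Cint, ())
end

# Request several counts and print what the library settles on.
for n in (4, 12, 24)
    BLAS.set_num_threads(n)
    println("requested ", n, ", BLAS reports ", blas_threads())
end
```

On the build from the question, a request for 24 reads back as 16 while 12 is honored, which is consistent with a compile-time NUM_THREADS=16 limit. Other builds may report different values.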

Thanks for the link!

Is it as simple as changing the number in flags+=(NUM_THREADS=16), or are there nontrivial modifications needed to make use of more threads?