BLAS fails in Julia's multithreaded mode with too many threads

I am using a supercomputer with 80 cores that is running Scientific Linux. If I start Julia in multithreaded mode with 17 or fewer threads and run the following code, everything works fine. However, if I use 18 or more threads, I get segfault errors from BLAS. Interestingly, everything works fine if b is a vector, or if one of a or b is a sparse matrix. Is this an error with BLAS, or am I doing something wrong?

a = rand(10, 10); b = rand(10, 8); c = zeros(1000);
Threads.@threads for i = 1:1000
    c[i] = sum(a*b)
end

Unless you built Julia without OpenBLAS, consider this FAQ from the OpenBLAS wiki:

If your application is already multi-threaded, it will conflict with OpenBLAS multi-threading. Thus, you must set OpenBLAS to use single thread

In Julia v0.5, that means calling BLAS.set_num_threads(1) before the threaded loop.

I have already set BLAS.set_num_threads(1) prior to running the multithreaded for loop. The following code still fails:

BLAS.set_num_threads(1)
a = rand(10, 10); b = rand(10, 8); c = zeros(1000);
Threads.@threads for i = 1:1000
    c[i] = sum(a*b)
end

I think this is https://github.com/JuliaLang/julia/issues/14857. AFAIK, the only solution is to build your own Julia with the right setting for OpenBLAS threads/buffers.
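Until a rebuilt Julia is an option, one possible workaround is to keep BLAS out of the threaded loop entirely. The sketch below assumes the crash is confined to BLAS calls made from many Julia threads, which would fit the observation that the sparse case (presumably not routed through BLAS) works. `naive_matmul` is a hypothetical helper written for this example, not a Julia or OpenBLAS function, and it has not been tested on the 80-core machine from the question.

```julia
# Pure-Julia matrix product: a plain triple loop that never calls into BLAS.
# `naive_matmul` is a hypothetical helper name used only for illustration.
function naive_matmul(a, b)
    m, k = size(a)
    k2, n = size(b)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    out = zeros(m, n)
    # Loop order keeps the innermost index on rows, matching column-major storage.
    for j = 1:n, p = 1:k, i = 1:m
        out[i, j] += a[i, p] * b[p, j]
    end
    return out
end

a = rand(10, 10); b = rand(10, 8); c = zeros(1000);
Threads.@threads for i = 1:1000
    # Each iteration writes only to its own slot of c, so no locking is needed.
    c[i] = sum(naive_matmul(a, b))
end
```

For matrices this small the loop version should be acceptable, but it will be much slower than BLAS on large inputs, so treat it as a stopgap rather than a fix.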

OP: when reporting a crash, please post the actual log.

@jameson, @tkelman: more people are going to run into this as larger core counts become more common (or more people who have access to larger core counts start to use Julia). :)

Thank you very much!