OpenBLAS: Julia slower than R

How many physical cores does your system have?
OpenBLAS with Julia is capped at 16 threads.

Watching top, I saw that R used >2000% CPU, while Julia used 800%. The machine has 16 physical CPU cores, so I set BLAS.set_num_threads(16), which gave about a 20% performance increase.
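For anyone who wants to repeat that comparison, here is a rough sketch of how the two thread counts could be timed from within Julia. The matrix size and the plain matrix multiply are placeholders I picked for illustration, not the benchmark from this thread, and BLAS.get_num_threads assumes a reasonably recent Julia version.

```julia
using LinearAlgebra

# Hypothetical problem size, chosen only for illustration.
n = 1000
A = rand(n, n)
B = rand(n, n)

A * B  # warm-up call so compilation time is not included in the timings

for nthreads in (8, 16)
    BLAS.set_num_threads(nthreads)
    t = @elapsed A * B
    # Print the count OpenBLAS reports, which may differ from the request.
    println("requested = ", nthreads,
            ", reported = ", BLAS.get_num_threads(),
            ", seconds = ", t)
end
```

A single @elapsed call is noisy, so repeating each measurement a few times (or using a benchmarking package) would give more trustworthy numbers.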

R took about 10% longer than Julia did with 8 threads.
Running export OPENBLAS_NUM_THREADS=16 and then launching R in the same terminal tab did not reduce R's CPU use, so I do not think the setting took effect.

Because of the cap on Julia’s OpenBLAS, I could not actually test running both with the same number of threads.

Because practically 100% of the run time is spent in OpenBLAS, I’d expect equal performance if both were actually using the same number of threads.
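On the Julia side, at least, it is easy to confirm how many threads OpenBLAS reports for the session. This is only a sketch: it shows what the process inherited from the shell and what BLAS reports, and it says nothing about how R's OpenBLAS is configured.

```julia
using LinearAlgebra

# Value inherited from the shell, or "unset" if the variable was not exported.
requested = get(ENV, "OPENBLAS_NUM_THREADS", "unset")

# Thread count that OpenBLAS reports for this Julia session.
actual = BLAS.get_num_threads()

println("OPENBLAS_NUM_THREADS = ", requested)
println("BLAS threads in use  = ", actual)
```

If the two values disagree, the environment variable was either not exported before Julia started or was overridden by a later BLAS.set_num_threads call.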

I could edit that file and recompile for the sake of testing this, but given that I expect it to run slower when set to use that many threads, I’m not exactly excited.
