AppleAccelerate.jl v0.4.0

viralbshah · May 24, 2023, 9:18pm

Hello all,

For the longest time, we haven’t been able to leverage the BLAS and LAPACK in Apple’s Accelerate, largely because it was LP64-only and carried a really old version of LAPACK. This didn’t matter much on Intel Macs, because one could use OpenBLAS, which is quite good, or MKL.

Of course, with Apple Silicon (which is now Tier-1), everything changed. Accelerate can offer much higher performance than OpenBLAS (at least as of right now). Listening to us (@staticfloat mostly), Apple introduced ILP64 as well as a modern LAPACK in macOS 13.3. @staticfloat updated AppleAccelerate.jl so that it could become a libblastrampoline backend. Upon doing that, we found an issue in pivoted Cholesky, which is fixed in macOS 13.4. Interestingly, using LBT’s overlay mechanism, we could actually override the buggy Accelerate version with one in LAPACK_jll.

As a result, AppleAccelerate.jl 0.4.0 was finally tagged and released. The performance difference is significant for certain matmuls on Apple Silicon:

julia> using LinearAlgebra

julia> peakflops(4096) # OpenBLAS
3.6024175318268243e11

julia> using AppleAccelerate

julia> peakflops(4096)
5.832806459434183e11

It also works on Intel Macs, giving marginally better performance in some cases. Naturally, all of this needs a lot more testing and experimenting, so please try it out.
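For anyone trying this out, it can help to confirm which library libblastrampoline is actually forwarding to before and after loading the package. The following is a minimal sketch, not part of the original announcement. It uses `BLAS.get_config()` from the LinearAlgebra standard library (available since Julia 1.7), and the matrix size of 1024 is an arbitrary choice to keep the run short.

```julia
using LinearAlgebra

# On macOS 13.4+ with AppleAccelerate.jl installed, uncomment the next line
# and rerun to compare against the default backend:
# using AppleAccelerate

# List the libraries that libblastrampoline currently forwards to,
# along with their integer interface (:lp64 or :ilp64).
config = BLAS.get_config()
for lib in config.loaded_libs
    println(lib.libname, " (", lib.interface, ")")
end

# Quick matmul benchmark; a smaller size than 4096 so it finishes fast.
flops = peakflops(1024)
println("peakflops(1024) = ", flops)
```

The numbers will vary by machine and by run, so compare several runs rather than a single measurement.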

-viral

e3c6 · May 24, 2023, 9:23pm

Should this be the default BLAS backend on Macs with Apple Silicon?

viralbshah · May 24, 2023, 9:26pm

Perhaps eventually. Right now it is only useful on Apple Silicon + macOS 13.4. Making it the default would need extensive testing on the package ecosystem. Certainly possible, but this is a first step.

ctkelley · May 25, 2023, 11:59am

The performance is not uniformly better with Accelerate. The discussion of 42312 has some details. It seems too early to make it the default.

viralbshah · May 25, 2023, 12:27pm

Of course there is also the issue of splitting our support resources, and there is some value to using the same codebase on all platforms.

e3c6 · May 27, 2023, 11:29am

If I understand correctly, this triggers libblastrampoline to use the Apple Accelerate backend. Can I go back to OpenBLAS without restarting Julia?

viralbshah · June 7, 2023, 3:52pm

Yes, the LBT API will let you do that, since this is essentially just updating a table of pointers. I suppose it would be nice to have an API to do that in LinearAlgebra.jl.
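As a rough sketch of what that could look like today: `BLAS.lbt_forward` in LinearAlgebra repoints the forwarding table at a given library. The snippet assumes that `OpenBLAS_jll.libopenblas_path` refers to the OpenBLAS bundled with Julia, which may differ between Julia versions, so treat it as an illustration rather than a supported switching API.

```julia
using LinearAlgebra
import OpenBLAS_jll

# Assumption: libopenblas_path points at the OpenBLAS shipped with Julia.
# clear = true drops all existing forwards (for example, Accelerate)
# before installing the OpenBLAS ones.
BLAS.lbt_forward(OpenBLAS_jll.libopenblas_path; clear = true)

# Inspect the result; OpenBLAS should now be the loaded backend.
libs = BLAS.get_config().loaded_libs
foreach(lib -> println(lib.libname), libs)
```

Because `clear = true` removes every existing forward, any overlay installed on top of the previous backend would need to be reapplied afterwards.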

carstenbauer · June 7, 2023, 5:07pm

See also the related issue here: Switch between MKL and OpenBLAS at runtime · Issue #90 · JuliaLinearAlgebra/MKL.jl · GitHub
