Doesn’t that predate the M1? Googling around turns up threads saying that NumPy supports Accelerate again now, and that building against it does give a performance boost.
I just Googled “numpy accelerate m1”. A bunch of threads pop up on Reddit and Stack Overflow going back to November about building NumPy successfully against Accelerate/vecLib. (Note: the ones I skimmed were about NumPy, so perhaps it still won’t work for SciPy.) Here is one example: https://www.reddit.com/r/Python/comments/qog8x3/if_you_are_using_apples_m1_macs_compiling_numpy/
We should be able to switch to calling Accelerate through libblastrampoline. A little bit of work needs to be done on building LAPACK in Yggdrasil, appropriately patched for ILP64 and all that.
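Concretely, the switch could look something like the sketch below, assuming Accelerate’s usual framework path on macOS and the `lbt_forward` function in the LinearAlgebra stdlib (illustrative only, not a finished recipe):

```julia
using LinearAlgebra

# Forward BLAS calls to Accelerate via libblastrampoline (LBT).
# Accelerate exports the classic unsuffixed (LP64, 32-bit integer) symbols,
# so this fills the LP64 forwarding slots; clear=false keeps the existing
# forwards (e.g. OpenBLAS) for anything Accelerate doesn't provide.
BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate";
                 clear = false)

# Inspect which libraries each LBT slot now forwards to.
BLAS.get_config()
```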
The main reason is that Accelerate ships an ancient LAPACK, and we use functions from recent versions. So when we use LBT to switch to Accelerate’s BLAS, we don’t want to use its LAPACK; instead we provide our own.
This question may be above your pay grade (and that of anyone else who does not work for Apple). I looked at the very limited documentation for Accelerate and saw no evidence that it supports Float16. Have you seen any such support? The reading I’ve done suggests that Apple has not updated Accelerate in many years.
The line in your post “A little bit of work needs to be done” sounds encouraging for Float64 and Float32 work anyhow.
I’ve done a little bit of Float64/Float32 testing (lu, …) on an M1 MacBook comparing OpenBLAS with @Elrod’s AppleAccelerate.jl package. Accelerate seems to be as fast as or faster than threaded OpenBLAS, without any explicit threading on my part. Of course, nobody outside of Apple knows what Accelerate really does internally, so it may use threads (or not).
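A rough sketch of that kind of comparison (the matrix size is arbitrary, and I’m assuming the package forwards BLAS to Accelerate when loaded):

```julia
using LinearAlgebra, BenchmarkTools

A = rand(2000, 2000)

# Baseline: the OpenBLAS that ships with Julia.
@btime lu($A);

# Switch to Accelerate (the package does the LBT forwarding when loaded),
# then run the same factorization again.
using AppleAccelerate
@btime lu($A);
```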
I just registered LAPACK_jll with LAPACK 3.10 and the right 64_ suffixes for 64-bit systems. So the path to making Accelerate as easy to use as MKL.jl is to have LBT point to Accelerate for BLAS and to LAPACK_jll for LAPACK.
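Building on the lbt_forward sketch earlier in the thread, the layering could look roughly like this, assuming LAPACK_jll exports the usual liblapack_path (illustrative only, not the eventual package code):

```julia
using LinearAlgebra, LAPACK_jll

# Layer the newly registered LAPACK 3.10 on top of the Accelerate forward;
# its 64_-suffixed symbols fill the ILP64 LAPACK slots that Accelerate's
# ancient LAPACK does not provide.
BLAS.lbt_forward(LAPACK_jll.liblapack_path)

# Confirm which backing library each interface now forwards to.
BLAS.get_config()
```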
We should perhaps do this in AppleAccelerate.jl and revive that package.