Is there anything users can do to help move Apple Silicon support from Tier 3 to Tier 1?

I have found an alternative for testing on M1 (for package developers):

Scaleway: Apple silicon M1 as-a-Service. Cloud Mac | Scaleway

  • Minimum 1-day cost: 2.4 EUR (8 GB RAM, M1)
    • Availability zone: Paris (EU)
    • macOS Monterey 12
    • €0.10/hour; “As required by Apple License, you must keep this instance for at least 24 hours. You will be able to delete it only after 24 hours.”

For ad-hoc testing (1-2 days/month) it is perfect for me.

It is also perfect for benchmarking specific Julia code on Apple Silicon (before investing in the hardware).

4 Likes

Doesn’t Apple have its own linear algebra libraries?

They do, but it does not seem to be trivial to get them to work. See

https://github.com/JuliaLang/julia/issues/42312

FYI:

Some good news is that Apple hardware is now Tier 2.

5 Likes

Doesn’t that predate the M1? Googling around turns up threads stating that NumPy supports Accelerate again now, and that building against it does give a performance boost.

Can you post some links?

I just Googled “numpy accelerate m1”. A bunch of threads pop up on Reddit and Stack Overflow, going back to November, about building NumPy successfully against Accelerate/vecLib. (Note, the ones I skimmed were about NumPy, so perhaps it still won’t work for SciPy.) Here is one example: https://www.reddit.com/r/Python/comments/qog8x3/if_you_are_using_apples_m1_macs_compiling_numpy/

1 Like

What about using something like Octavian.jl (GitHub - JuliaLinearAlgebra/Octavian.jl: multi-threaded BLAS-like library that provides pure Julia matrix multiplication) as the BLAS on M1 Macs? Has anyone tried it?

We should be able to switch to calling Accelerate through libblastrampoline. A little bit of work needs to be done on building LAPACK in Yggdrasil, appropriately patched for ILP64 and all that.

The main reason work is needed is that Accelerate ships an ancient LAPACK, and we use functions from recent versions. So when we use LBT to switch to Accelerate’s BLAS, we don’t want to use its LAPACK, and instead provide our own.
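
For concreteness, a rough sketch of that forwarding from the REPL (not the eventual packaged solution): the Accelerate framework path is the standard system location, but the LAPACK_jll symbol names and suffix handling are assumptions.

```julia
using LinearAlgebra
using LAPACK_jll   # assumed: the newly registered JLL with a recent LAPACK and 64_ suffixes

# 1. Point the BLAS forwards at Apple's Accelerate (standard framework path).
#    Note: Accelerate's classic interface is LP64 (32-bit ints); wiring it up to
#    Julia's ILP64 interface is part of the "little bit of work" mentioned above.
accelerate = "/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate"
BLAS.lbt_forward(accelerate; clear = true)

# 2. Layer a modern LAPACK on top without clearing, so Accelerate's ancient LAPACK
#    is never used. `liblapack_path` and the suffix hint are assumed names here.
BLAS.lbt_forward(LAPACK_jll.liblapack_path; clear = false, suffix_hint = "64_")

BLAS.get_config()   # inspect which libraries LBT currently forwards to
```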

One downside is that OpenBLAS patches LAPACK to provide multi-threaded versions of common LAPACK functions, which one would lose in the configuration I describe above.

Eventually, we hope to have native Julia solutions like Octavian for clean, high-performance multi-threaded linear algebra kernels.
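
Octavian is already usable directly for matrix multiplication today; trying it looks something like this (a minimal sketch; the matrix sizes are arbitrary):

```julia
using LinearAlgebra
using Octavian   # pure-Julia, multithreaded matrix multiplication

A = rand(1_000, 1_000); B = rand(1_000, 1_000)

C_oct  = Octavian.matmul(A, B)   # Octavian's multithreaded kernel
C_blas = A * B                   # default BLAS (OpenBLAS)

C_oct ≈ C_blas                   # should agree up to floating-point roundoff
```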

8 Likes

This question may be above your pay grade (and that of anyone else who does not work for Apple). I looked at the very limited documentation for Accelerate and saw no evidence that it supports Float16. Have you seen any such support? The reading I’ve done suggests that Apple has not updated Accelerate in many years.

The line in your post, “A little bit of work needs to be done”, sounds encouraging for Float64 and Float32 work anyhow.

I’ve done a little bit of Float64/32 testing (lu …) of OpenBLAS vs. @Elrod’s AppleAccelerate.jl package on an M1 MacBook. Accelerate seems to be as fast as or faster than threaded OpenBLAS, without explicit threading. Of course, nobody outside of Apple knows what Accelerate really does, so it may use threads (or not).
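
The comparison was along these lines (a sketch: the size is arbitrary, and whether `using AppleAccelerate` forwards everything through LBT on your Julia version is an assumption):

```julia
using LinearAlgebra, BenchmarkTools

A = rand(2_000, 2_000)

@btime lu($A);            # default: OpenBLAS BLAS/LAPACK

using AppleAccelerate     # assumed to re-point BLAS (via LBT) at Accelerate when loaded
@btime lu($A);            # same factorization, now going through Accelerate
```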

For gemm it looks like the CPU (and GPU) cores are not used at all (see the perf monitor) and the CPU temperature stays super low (<50 °C without the fan). The so-called AMX co-processor (see “The Secret Apple M1 Coprocessor” by Erik Engheim on Medium) is supposed to be in use…

One could try to saturate the tensor unit with a constant AI load while evaluating an Accelerate gemm, to see whether AMX is related to the tensor unit.
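
For anyone who wants to reproduce the observation, a small sketch that keeps gemm busy long enough to watch the perf monitor (the size and iteration count are arbitrary choices):

```julia
using LinearAlgebra

N = 4_096
A = rand(Float32, N, N); B = rand(Float32, N, N); C = zeros(Float32, N, N)

# Sustained sgemm: watch Activity Monitor / powermetrics while this runs.
t = @elapsed for _ in 1:50
    mul!(C, A, B)     # dispatches to whatever BLAS LBT currently forwards to
end

println("≈ ", round(50 * 2 * N^3 / t / 1e9; digits = 1), " GFLOP/s sustained")
```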

I just registered LAPACK_jll with LAPACK 3.10 and the right 64_ suffixes for 64-bit systems. So the path to making Accelerate as easy to use as MKL.jl is to have LBT point to it for BLAS and to LAPACK_jll for LAPACK.

Perhaps we should do this in AppleAccelerate.jl and revive it.

6 Likes