Parallel computing on M1 Max?

With the release of the 2021 MacBook Pro with M1 Pro/Max, I’m wondering whether there are good APIs/libraries for doing parallel computing on that chip.
From the reviews, the computational power of that device is insane, sometimes on par with or better than a 3090 on some tasks, and in a portable device at that. And you can have 64 GB of memory available to the GPU! No CPU-to-GPU memcopy!

As far as I know, pure CPU code runs fine on M1 with Julia 1.7, maybe?
How about BLAS?
How about GPU code?
The chip also seems to have a GPU part and a tensor-core-like unit. How can these be used?

There has been no GPU support on Mac so far because all Mac GPUs were pretty weak. But now, and in the future, this hardware may have many advantages over other devices and could eventually change the whole workflow for computation.

I haven’t bought one yet because I know I can’t do parallel computing with Julia on it yet. But I’m really looking forward to it.


I had the same question in mind, thanks for asking.

Computing on the CPU works fine, even multithreaded.
OpenBLAS works fine, and there is some effort, albeit small, to make use of Accelerate (Apple’s proprietary BLAS).
GPUs aren’t supported yet. Apparently there have been some talks with Apple, but nothing is certain yet.
The tensor core doesn’t have a public API that I know of, so the only way to use it is with Apple’s proprietary ML software.
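
For anyone who wants to check the CPU side on their own machine, here is a minimal sketch, not a benchmark. It assumes Julia 1.7 or later for `BLAS.get_config()` and that Julia was started with several threads (for example `julia -t auto`). The function name `threaded_sumsq` is made up for illustration.

```julia
using LinearAlgebra

# How many threads Julia itself was started with.
println("Julia threads: ", Threads.nthreads())

# How many threads the BLAS library uses for matrix operations.
println("BLAS threads:  ", BLAS.get_num_threads())

# Which BLAS library is loaded (assumption: Julia >= 1.7, where get_config exists).
if VERSION >= v"1.7"
    println(BLAS.get_config())
end

# Hypothetical helper: multithreaded sum of squares.
# Each chunk writes only to its own slot of `partial`, so there is no data race.
function threaded_sumsq(x::Vector{Float64}, nchunks::Int = Threads.nthreads())
    partial = zeros(Float64, nchunks)
    len = cld(length(x), nchunks)          # chunk length, rounded up
    Threads.@threads for c in 1:nchunks
        first_i = (c - 1) * len + 1
        last_i = min(c * len, length(x))
        s = 0.0
        for i in first_i:last_i            # empty range if the chunk has no elements
            s += x[i]^2
        end
        partial[c] = s
    end
    return sum(partial)
end

x = rand(1_000_000)
println(threaded_sumsq(x) ≈ sum(abs2, x))
```

With a default build, the BLAS config would be expected to list OpenBLAS. Swapping in Accelerate is what the experimental wrapper packages try to do.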


That’s not good. I hope someone takes an open processor like RISC-V and makes it better than the M1, and that a good-hearted company like System76.com or Frame.work puts out an open system where software can be developed and run with openness.

This package may be of some interest here: GitHub - JuliaMath/AppleAccelerate.jl: Julia interface to OS X's Accelerate framework :wink:

Those are just the elementwise functions, which don’t seem to be much faster than the normal Julia ones. The one for some BLAS functions is GitHub - chriselrod/AppleAccelerateLinAlgWrapper.jl: Simple experimental wrapper of small number of Apple Accelerate linear algebra routines using libblastrampoline. Largely used for benchmarking routines using the Apple M1's matrix instructions. It’s not really usable right now, though.


Noted!
Apple, guided by their ambition to build a better world, will probably implement the oneAPI API and everything will run smoothly :innocent: