Does it make sense to provide free cloud computing time on very expensive hardware that is only Tier 3 and on which, as Mose says, most non-trivial workloads currently seem to segfault?
Would it be great to have free Apple Silicon CI available for the entire ecosystem? Sure.
Will we have it at some point? Maybe.
Is this the reason why Julia is Tier 3? Absolutely not.
The way I see it, the Tier levels have hardly anything to do with the (general) ecosystem. They make a statement about Julia itself.
I think this discussion got a bit off track: when we talk about CI on M1, whether renting it in the cloud or buying machines, this can only apply to making that CI hardware available to the CI of the Julia project itself, and thus, in the end, to core devs.
I do not think the Julia project itself can afford to provide M1 CI instances to arbitrary packages. Do you expect it to create an alternative to GitHub Actions, Travis, AppVeyor, CircleCI, etc.? No; if you want arbitrary packages to have M1 CI, you’ll have to hope that some of those providers will eventually offer it.
Thus, when it comes to making M1 Tier 2 or even Tier 1: while consistent CI for Julia itself is of course essential for that, the already known serious problems ought to be resolved first; there is no point setting up CI for something that will just crash all the time. Once those fixes are in place, I am sure there will be a way to get M1 builders for the Julia CI. But providing this for the wider package ecosystem is not something the Julia team can take care of.
On the upside, waiting increases the chance that e.g. GitHub Actions will just make something like this available. However, AFAIK GitHub Actions runs on Microsoft Azure, and I have no idea when, if ever, Azure will offer M1 support. On the other hand, AWS EC2 M1 Mac instances are in preview; once they are generally available, I am hopeful that some CI providers built on AWS will eventually offer M1 instances.
It’s not: to unblock the situation, the already known issues have to be fixed. CI is really a minor problem at this point. Also, I had already pointed out a side approach to make it more likely that packages will work on the M1: test them on aarch64 Linux. It’s clearly not the same thing, but it helps a lot with catching architecture-related issues, and I’ve personally caught some this way. It doesn’t help much with platform-specific issues (like the well-known segfault).
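For package authors without aarch64 hardware, one rough way to run such aarch64 Linux tests locally is emulation via Docker and QEMU. This is only a sketch, not an endorsed workflow, and `MyPackage` below is a placeholder:

```shell
# One-time setup: register QEMU binfmt handlers so arm64 binaries
# can run on an x86_64 host.
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Run a package's test suite under emulated aarch64 Linux using the
# official Julia image. Replace "MyPackage" with your package's name.
docker run --rm --platform linux/arm64 julia:1.7 \
    julia -e 'using Pkg; Pkg.add("MyPackage"); Pkg.test("MyPackage")'
```

Emulated runs are slow, and (as noted above) they catch architecture-related issues, not macOS-specific ones.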
Doesn’t that predate the M1? Googling around turns up threads saying that NumPy supports Accelerate again now, and that building against it does give a performance boost.
I just Googled “numpy accelerate m1”. A bunch of threads pop up on Reddit and Stack Overflow, going back to November, about building NumPy successfully against Accelerate/vecLib. (Note: the ones I skimmed were about NumPy, so perhaps it still won’t work for SciPy.) Here is one example: https://www.reddit.com/r/Python/comments/qog8x3/if_you_are_using_apples_m1_macs_compiling_numpy/
We should be able to switch to calling Accelerate through libblastrampoline. A little bit of work needs to be done on building LAPACK in Yggdrasil, appropriately patched for ILP64 and all that.
The main reason is that Accelerate ships an ancient LAPACK, while we use functions from recent versions. So when we use LBT to switch to Accelerate’s BLAS, we don’t want to use its LAPACK, and instead provide our own.
This question may be above your (and anyone else’s who does not work for Apple) pay grade. I looked at the very limited documentation for Accelerate and saw no evidence that it supports Float16. Have you seen any such support? The reading I’ve done suggests that Apple has not updated Accelerate in many years.
The line in your post, “a little bit of work needs to be done”, sounds encouraging for Float64 and Float32 work anyhow.
I’ve done a little bit of Float64/Float32 testing (lu, …) on an M1 MacBook, comparing OpenBLAS with @Elrod’s AppleAccelerate.jl package. Accelerate seems to be as fast as or faster than threaded OpenBLAS without explicit threading. Of course, nobody outside of Apple knows what Accelerate really does internally, so it may use threads (or not).
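For anyone wanting to reproduce a rough version of that comparison, something like the sketch below should work, assuming (as I believe is the case) that loading AppleAccelerate.jl switches the BLAS backend on load; exact timings will of course vary by machine:

```julia
using LinearAlgebra, BenchmarkTools

A = rand(2000, 2000)

# Baseline: Julia's default OpenBLAS backend.
@btime lu($A);

# Assumption: loading AppleAccelerate forwards BLAS to Accelerate.
using AppleAccelerate
@btime lu($A);

# Sanity check: see which backend LBT is actually using now.
LinearAlgebra.BLAS.get_config()
```

Checking `BLAS.get_config()` before and after is the easy way to confirm the switch actually happened, rather than trusting the timings alone.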
I just registered LAPACK_jll with LAPACK 3.10 and the right 64_ suffixes for 64-bit systems. So the path to making Accelerate as easy to use as MKL.jl is to have LBT point to it for BLAS and to LAPACK_jll for LAPACK.
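A minimal, untested sketch of that recipe, assuming LAPACK_jll follows the usual JLL convention of exporting a `liblapack_path`, and using Accelerate’s standard framework path:

```julia
using LinearAlgebra
using LAPACK_jll  # assumption: provides ILP64 LAPACK 3.10 with 64_ suffixes

# 1) Replace the current backend's symbols with Accelerate's
#    (its BLAS *and*, for the moment, its ancient LAPACK).
BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Accelerate";
                 clear = true)

# 2) Layer LAPACK_jll's modern LAPACK on top (clear = false), overriding
#    Accelerate's old LAPACK routines while keeping its BLAS.
BLAS.lbt_forward(LAPACK_jll.liblapack_path; clear = false)

# Confirm the resulting configuration.
LinearAlgebra.BLAS.get_config()
```

This mirrors what MKL.jl does, just with two forwarding steps instead of one because Accelerate’s LAPACK has to be shadowed.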
We could perhaps do this in AppleAccelerate.jl and revive that package.