Apple silicon full power

IMHO, the first step is already under way: reverse engineering.

Apple Matrix coprocessor - Reverse Engineering

...
# AMX: Apple Matrix coprocessor
#
# This is an undocumented arm64 ISA extension present on the Apple M1. These
# instructions have been reversed from Accelerate (vImage, libBLAS, libBNNS,
# libvDSP and libLAPACK all use them), and by experimenting with their
# behaviour on the M1. Apple has not published a compiler, assembler, or
# disassembler, but by calling into the public Accelerate framework
# APIs you can get the performance benefits (fast multiplication of big
# matrices). This is separate from the Apple Neural Engine.
#
# Warning: This is a work in progress, some of this is going to be incorrect.
#
# This may actually be very similar to Intel Advanced Matrix Extension (AMX),
# making the name collision even more confusing, but it's not a bad place to
# look for some idea of what's probably going on.
...
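To make the "call into the public Accelerate framework APIs" point from the excerpt concrete, here is a minimal sketch that calls the standard CBLAS routine `cblas_sgemm` through Python's `ctypes`. The helper names `load_accelerate` and `matmul` are mine, not from the quoted source. Two assumptions: on macOS `ctypes.util.find_library("Accelerate")` resolves the framework, and Accelerate decides internally whether a given call runs on the AMX units (small matrices like the one below may well not). When the framework is not found (for example, on a non-Apple system), the sketch falls back to a plain Python loop so it still runs.

```python
import ctypes
import ctypes.util

# Standard CBLAS enum values (from cblas.h).
CBLAS_ROW_MAJOR = 101
CBLAS_NO_TRANS = 111


def load_accelerate():
    """Return a handle to Accelerate, or None if it is unavailable (e.g. not macOS)."""
    path = ctypes.util.find_library("Accelerate")
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def matmul(a, b, m, k, n):
    """Multiply a row-major m x k matrix `a` by a k x n matrix `b`.

    Both inputs are flat lists of floats; the result is a flat m x n list.
    """
    lib = load_accelerate()
    if lib is None:
        # Hypothetical fallback so the sketch runs anywhere: naive triple loop.
        return [
            sum(a[i * k + p] * b[p * n + j] for p in range(k))
            for i in range(m)
            for j in range(n)
        ]

    float_ptr = ctypes.POINTER(ctypes.c_float)
    sgemm = lib.cblas_sgemm
    sgemm.restype = None
    sgemm.argtypes = [
        ctypes.c_int, ctypes.c_int, ctypes.c_int,   # order, transA, transB
        ctypes.c_int, ctypes.c_int, ctypes.c_int,   # M, N, K
        ctypes.c_float, float_ptr, ctypes.c_int,    # alpha, A, lda
        float_ptr, ctypes.c_int,                    # B, ldb
        ctypes.c_float, float_ptr, ctypes.c_int,    # beta, C, ldc
    ]

    a_buf = (ctypes.c_float * (m * k))(*a)
    b_buf = (ctypes.c_float * (k * n))(*b)
    c_buf = (ctypes.c_float * (m * n))()

    # C = 1.0 * A @ B + 0.0 * C, all matrices row-major.
    sgemm(CBLAS_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS,
          m, n, k, 1.0, a_buf, k, b_buf, n, 0.0, c_buf, n)
    return list(c_buf)


if __name__ == "__main__":
    print(matmul([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2, 2, 2))
```

For the 2 x 2 inputs above the product is `[19.0, 22.0, 43.0, 50.0]`. Nothing in this sketch touches the undocumented instructions directly; it only goes through the public BLAS entry point, which is the route the excerpt recommends.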

Apple M1 Neural Engine - Reverse Engineering

Apple M1 GPU - Reverse Engineering

As usual, adding “reverse engineering” to your search keywords lets you check the latest status of each project.

