Or is it possible to do that in some hard (but not too hard) way?
The Raspberry Pi AI Kit is a Raspberry Pi with a Hailo AI acceleration module on top.
The context is that I was thinking about the possibility of running ML models with the stack Flux/Julia/“Raspberry Pi AI Kit” (for inference only). Of course I could export to ONNX and use it in TensorFlow Lite, but maybe there is a way to stay with Julia even in the deployment phase. So the question is perhaps more “Does Flux support that thing?”, but maybe there exist other solutions that go beyond just Flux while still staying within Julia. So I’d appreciate any experience/story related to this.
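For reference, the ONNX route I have in mind would look roughly like this on the Julia side. This is only a minimal sketch, assuming ONNXNaiveNASflux.jl can serialize this kind of Chain (the package choice, the exact save signature, and the input-shape argument are my assumptions), and the exported file would still have to go through Hailo’s own toolchain before it can run on the accelerator:

```julia
using Flux
using ONNXNaiveNASflux  # assumed: provides `save` for writing a Flux model to ONNX

# A small Flux model trained elsewhere; deployment would be inference-only.
model = Chain(
    Conv((3, 3), 3 => 16, relu; pad = 1),
    MaxPool((2, 2)),
    Flux.flatten,
    Dense(16 * 112 * 112 => 10),
)

# Export to ONNX. The input shape (WHCN, batch size 1) is passed so the
# exporter can trace the graph; whether it is required here is an assumption.
ONNXNaiveNASflux.save("model.onnx", model, (224, 224, 3, 1))
```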
It most certainly does not, because the kit is quite new; it also wouldn’t really be Julia’s place, nor needed, since it would likely work without any Julia-specific support. Note that while Raspberry Pi is known to work with Julia, it doesn’t have any sort of official tier 1, or 2…, support.
What you need is this $70 add-on AI chip, which I didn’t know of (thanks!), and then to run this on the Raspberry Pi:
$ sudo apt install hailo-all
As expected, this didn’t work for me on Linux Mint on a non-Pi machine, but hypothetically it could have, given Hailo’s stated support:
Host Architectures: X86, ARM
OSs: Linux, Windows
AI frameworks: TensorFlow, TensorFlow Lite, Keras, PyTorch & ONNX
Flux.jl or Lux.jl would need to support the chip or call some known API for it, or more likely(?) you would go through Keras or PyTorch etc. You can use all of those through Python with PythonCall.jl (sketch below), and presumably on the Pi too, unless it’s really so memory-limited that it can’t hold the Julia runtime plus the Python runtime.
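To give a feel for the “call the Python support from Julia” option, here is a minimal sketch with PythonCall.jl driving plain onnxruntime as a stand-in (I don’t know Hailo’s own Python runtime API, so that part is deliberately not shown; the model file and the input name "input" are placeholders, and onnxruntime would need to be installed in the Python environment PythonCall uses, e.g. via CondaPkg.jl):

```julia
using PythonCall

# Stand-in for whatever Python runtime ends up talking to the accelerator;
# here plain onnxruntime running on the CPU.
ort = pyimport("onnxruntime")
np  = pyimport("numpy")

sess = ort.InferenceSession("model.onnx")   # hypothetical exported model

# onnxruntime wants a dict of input name => array; "input" is a placeholder,
# the real name can be read from sess.get_inputs().
x   = np.asarray(rand(Float32, 1, 3, 224, 224))
out = sess.run(nothing, pydict(Dict("input" => x)))

# Convert the first output back to a Julia array.
y = pyconvert(Array, out[0])
```

Whether this counts as useful work done in Julia is exactly the question, since all the heavy lifting happens on the Python side.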
I’m not optimistic that Julia packages will support this soon, except by calling the Python support, and then I’m just not sure Julia can be said to be doing any useful work. But AI is the future, also with Copilot+ PCs… so it will be interesting to see how they, this chip, Apple NPUs etc. get supported; I suppose it will happen eventually.
It supports “13 Tera-Operations Per Second (TOPS)”. Is that good (enough) for something? Compared to:
Apple has stated the Neural Engine in the M4 can perform 38 trillion operations per second (TOPS), an improvement over the 18 TOPS in the M3.
Actually I think it’s 36, not 38, TOPS, and what TOPS means matters: is it bfloat16 operations, 8-bit floats, or whatever? Likely only 8-bit integer operations, since they say TOPS; otherwise everyone would state TFLOPS?
With powerful new silicon capable of an incredible 40+ TOPS (trillion operations per second)