Performance optimization by NPUs

This is less about a concrete implementation in Julia and more about curiosity.

As I understand it, NPUs might be suitable for computing method dispatch, pattern matching, and certain other aspects of Julia.

Is this an option, or technically senseless for some reason?

I guess the fixed dataflow patterns should probably stay on the CPU, and the shared-memory abstraction introduces complexity and memory overhead.

Therefore, I imagine they could do matrix operations for lookup tables and would be good at parallel method resolution?

Also, could type checking be expressed as pattern matching and, in that form, be done by the NPUs?

P.S.:

I understand that it makes no economic sense as long as NPUs are rare.
I am just interested in whether this makes any sense technically.

Technically senseless. NPUs are fixed-function hardware for very specific classes of highly parallel, very low-precision floating-point arithmetic.

Imagine a GPU, but much more restrictive. Things like method dispatch and pattern matching are pretty much the polar opposite of what this sort of hardware is good for.
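To make the contrast concrete, here is a rough sketch in Python (used only as a neutral illustration, not Julia). The first function is the kind of dense multiply-accumulate work this hardware targets; the second is a toy stand-in for multiple dispatch. The names `dense_matvec`, `METHODS`, and `dispatch` are made up for this example, and the specificity rule is a simplification, not Julia's actual dispatch algorithm.

```python
# The kind of workload an NPU is built for: a dense multiply-accumulate
# with a fixed loop structure. Control flow never depends on the data.
def dense_matvec(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Hypothetical method table: (argument types, method name).
METHODS = [
    ((object, object), "generic"),
    ((int, object), "int-first"),
    ((int, int), "int-int"),
]


# Toy multiple dispatch: scan the table and keep the most specific
# signature that matches the runtime argument types. Every step is a
# data-dependent branch or a type-hierarchy lookup, not arithmetic.
def dispatch(args):
    best = None
    for sig, name in METHODS:
        if len(sig) != len(args):
            continue
        if not all(isinstance(a, t) for a, t in zip(args, sig)):
            continue
        # Simplified rule: sig wins if each of its types is a subtype
        # of the corresponding type in the current best signature.
        if best is None or all(issubclass(s, b) for s, b in zip(sig, best[0])):
            best = (sig, name)
    if best is None:
        raise TypeError("no matching method")
    return best[1]
```

For example, `dispatch((1, 2))` selects `"int-int"`, while `dispatch((1.5, 2))` falls back to `"generic"`. The matrix-vector product does the same sequence of operations for any input, whereas the dispatch loop takes a different path depending on the argument types, which is the irregular, branch-heavy pattern that fixed-function hardware handles poorly.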
