Yeah, for sure, Int types (especially low-precision ones like Int4) are far cheaper in area, power usage, etc. than Float16 or Float32. And at the extreme, binarizing a neural network can buy roughly an extra order of magnitude of power and performance savings, since multiplications get converted into adds, etc.
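To make "multiplications become adds" concrete, here's a minimal NumPy sketch of a dot product with weights binarized to {-1, +1}. The function names (`binarize_weights`, `binary_dot`) and the per-channel scale trick are just illustrative assumptions, not taken from any particular BNN library:

```python
import numpy as np

def binarize_weights(w):
    """Binarize real-valued weights to {-1, +1} via the sign function.

    Also keeps a scalar scale alpha = mean(|w|) so the binarized layer
    roughly approximates the original full-precision one.
    """
    alpha = np.mean(np.abs(w))
    w_bin = np.sign(np.where(w == 0, 1.0, w))  # avoid sign(0) == 0
    return w_bin, alpha

def binary_dot(x, w_bin, alpha):
    """Dot product with binarized weights: no per-element multiplications.

    Because each w_bin[i] is +1 or -1, x @ w_bin reduces to adding or
    subtracting x[i] -- adders instead of multipliers in hardware.
    """
    acc = np.where(w_bin > 0, x, -x).sum()  # adds/subtracts only
    return alpha * acc                       # one scalar multiply to rescale

# Quick sanity check against the full-precision dot product
rng = np.random.default_rng(0)
x = rng.standard_normal(256).astype(np.float32)
w = rng.standard_normal(256).astype(np.float32)
w_bin, alpha = binarize_weights(w)
print("binarized approx:", binary_dot(x, w_bin, alpha))
print("full precision:  ", float(x @ w))
```

(If the activations are binarized too, the accumulation collapses further into XNOR + popcount, which is where the biggest hardware savings come from.)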
[2002.07795] Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs
For GPUs, the energy consumption per instruction roughly orders as:
- Double Precision > Half Precision > Single Precision ~ Integer
- signed > unsigned
- GPU architecture also has a significant impact
Of course, there is still room for improvement in half precision.