Massive performance penalty for Float16 compared to Float32

ScottPJones · November 4, 2017, 12:16pm

Intel added vector instructions for converting to and from 16-bit floats many years ago. Intel also showed that, because 16-bit data uses half the memory and makes better use of the cache, 16-bit floats can be faster than 32-bit floats for larger operations and not much slower for smaller vectors.

https://software.intel.com/en-us/articles/performance-benefits-of-half-precision-floats

It seems that making sure Julia can use these SIMD instructions for vector operations on 16-bit floats could achieve some nice performance benefits.
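To make the idea concrete, here is a minimal sketch of the pattern the Intel article describes: keep the data stored as `Float16`, widen each element to `Float32` for the arithmetic. The helper name `sum_widened` is hypothetical, not something in Base, and whether the compiler actually emits the hardware conversion instructions for this loop depends on the Julia version and the CPU, so treat it as an illustration rather than a benchmark.

```julia
# Hypothetical helper: sum Float16 data while accumulating in Float32.
# Storage stays 16-bit (half the memory traffic); arithmetic is done in 32-bit.
function sum_widened(x::AbstractVector{Float16})
    acc = 0.0f0
    @inbounds @simd for i in eachindex(x)
        acc += Float32(x[i])   # widen one element, then add in Float32
    end
    return acc
end

x16 = Float16.((1:1000) ./ 1000)   # 16-bit storage
x32 = Float32.(x16)                # the same values in 32-bit storage

s16 = sum_widened(x16)             # Float32 result computed from Float16 data
s32 = sum(x32)                     # reference result from Float32 data

println(sizeof(x16), " bytes vs ", sizeof(x32), " bytes")
println(s16, " vs ", s32)
```

Timing `sum_widened(x16)` against `sum(x32)` on large arrays (for example with BenchmarkTools, if it is installed) would show whether the smaller memory footprint pays for the conversions on a given machine.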
