x * y + z does not automatically use an FMA instruction

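Because it gives a different result: a fused multiply-add rounds once, at the end, whereas a separate multiply followed by an add rounds twice. You can explicitly tell the compiler that you are fine with that by using @fastmath, or by calling either the muladd or the fma function. muladd lets the compiler fuse the operation when that is efficient, while fma always computes the fused result. See also the recent discussion "How to enable vectorized fma instruction for multiply-add vectors?".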
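To make the difference concrete, here is a minimal sketch. The inputs are my own choice, picked so that the exact product cannot be represented in Float64; the variable names are only for illustration.

```julia
# Inputs chosen so that the exact product a*b = 1 - 2^-60 is not a Float64.
eps30 = ldexp(1.0, -30)   # exactly 2^-30
a = 1.0 + eps30
b = 1.0 - eps30
c = -1.0

# Separate multiply and add: the product rounds to 1.0, so the sum is 0.0.
unfused = a * b + c

# Fused multiply-add: one rounding at the end keeps the -2^-60 term.
fused = fma(a, b, c)

# muladd and @fastmath only permit fusing, so either result may come back,
# depending on the hardware and on what the compiler decides to do.
relaxed  = muladd(a, b, c)
fastmath = @fastmath a * b + c

println((unfused, fused, relaxed, fastmath))
```

On a machine where the compiler does fuse, relaxed and fastmath should match fused; otherwise they match unfused. That ambiguity is the reason the compiler does not make the substitution on its own.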
