One subtle issue here that is discussed on GitHub but hasn’t been brought up here is that there isn’t one obvious way to apply this optimization. If you write a*b + c*d, there are two ways to apply FMA:
- `fma(a, b, c*d)`
- `fma(c, d, a*b)`
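These two placements are generally not equivalent: each one computes its own product exactly inside the fma but uses the already-rounded value of the other product, so the final results can differ in the last bit. A small illustrative experiment (the function name and random-sampling setup are mine, not anything the compiler does):

```julia
# Count how often the two FMA placements of a*b + c*d disagree
# for random inputs in [0, 1). The exact count varies from run to
# run, but it is typically nonzero: the two placements can round
# to different results.
function count_fma_disagreements(n)
    disagree = 0
    for _ in 1:n
        a, b, c, d = rand(), rand(), rand(), rand()
        disagree += fma(a, b, c*d) != fma(c, d, a*b)
    end
    return disagree
end

count_fma_disagreements(1_000_000)  # typically a nonzero count
```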
Which one gets done by LLVM? Well, that’s hard to predict, because LLVM knows that + is commutative, so it might have changed a*b + c*d into c*d + a*b by the time the FMA transformation runs. So by enabling automatic FMAs, you’ve taken code that’s completely unambiguous and deterministic (the naive implementation of a*b + c*d as two multiplies followed by an add) and changed it into something where what you compute depends on the whims of this particular version of the compiler, and could change if we upgrade LLVM or choose a different set or ordering of optimization passes.
Note that there’s no such problem if you explicitly write fma in your code: then it’s completely unambiguous what you want to compute, and it will always be the same. It also seems fine to me if you’ve used @fastmath to explicitly give the compiler permission to compute something a bit different from what you wrote if it deems that faster. In such cases, we understand that the compiler is making a judgement and that this judgement might change. But we try very hard not to do that kind of thing by default in Julia: we compute what you asked for, not something else.
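For concreteness, here is what the two opt-in spellings might look like side by side (a sketch; the variable values are mine):

```julia
a, b, c, d = 0.1, 0.2, 0.3, 0.4

# Explicit fma: you have said exactly which product is fused, and the
# result is the same regardless of compiler version or pass ordering.
y1 = fma(a, b, c*d)

# @fastmath: explicit permission for the compiler to fuse, reorder, or
# otherwise rearrange the arithmetic as it sees fit; the result may
# change between Julia/LLVM versions.
y2 = @fastmath a*b + c*d
```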