I think there’s a high-level point to make here about “what is a compiler” and “what is a compiler allowed to do”. When you write the code `A*B + c`, is it supposed to do `A*B` with BLAS and then `tmp + c` as a broadcast? If Julia is an imperative programming language where you define and call functions, the answer is… yes, duh, that’s how those functions are defined. But people who know the BLAS interfaces know that there is a call to `gemm!` that does `A*B + c` together, and it’s faster. So, is the compiler allowed to reinterpret `A*B + c` as `gemm!`?
You might immediately think, duh, yes, that’s faster so do it. But now we’re contradicting our first intuition, which is that a programmer in the language should know it calls the functions they’ve actually written. And in fact this demonstration is nice because the two are not actually equivalent: you will get subtle floating point differences between a fused `gemm!` and an unfused `A*B + c`.
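To make the difference concrete, here is a minimal sketch using only the LinearAlgebra standard library. The matrix sizes and the names `unfused` and `fused` are my own choices for illustration, and I use a matrix `C` rather than `c` so that plain `+` applies. Whether the two results differ in the last bits depends on the BLAS build and the hardware, so the code only checks that they agree approximately.

```julia
using LinearAlgebra

A = rand(64, 64)
B = rand(64, 64)
C = rand(64, 64)

# Unfused: one BLAS matmul into a temporary, then a separate elementwise add.
unfused = A * B + C

# Fused: gemm! computes alpha*A*B + beta*C in a single call,
# overwriting its last argument with the result.
fused = copy(C)
BLAS.gemm!('N', 'N', 1.0, A, B, 1.0, fused)

# The two agree up to rounding error, but bitwise equality is not guaranteed.
println("approximately equal: ", unfused ≈ fused)
println("max abs difference:  ", maximum(abs.(unfused .- fused)))
```

The five-argument `mul!(fused, A, B, 1.0, 1.0)` is the generic way to spell the same fused update without calling BLAS directly.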
So then it comes down to philosophy: is a compiler allowed to change the floating point semantics in this case? It’s really a case-by-case thing, because for example it’s already done with SIMD and `muladd` operations: both of those can change the floating point result, but a compiler can do this automatically. But you can see there are different levels here:
- Julia and LLVM in general will try to SIMD automatically in loops when they can on the default `-O2` setting.
- Julia will not change `a*b + c` to `muladd(a,b,c)` by default, though using `@fastmath` or MuladdMacro.jl are ways to make this more automatic, with one of course being built into the compiler / LLVM and just enabled in the pass stack.
- Julia will not change `A*B + C` to a single `gemm!`, but Reactant.jl will do that.
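A small sketch of why the `muladd` rewrite is not semantically neutral. The values below are chosen by me so that the rounding of the intermediate product becomes visible; `fma` is used as the always-fused reference, since `muladd` is allowed to pick either form depending on what is fastest on the hardware.

```julia
a = 1.0 + 2.0^-30       # exactly representable in Float64
c = -(1.0 + 2.0^-29)    # exactly representable in Float64

# Exact product: a*a = 1 + 2^-29 + 2^-60. The 2^-60 term does not fit
# next to 1 in a Float64, so the unfused version rounds it away before the add.
unfused = a * a + c

# fma rounds only once, after the add, so the 2^-60 term survives.
fused = fma(a, a, c)

# muladd may return either of the two results above.
either = muladd(a, a, c)

println("a*a + c         = ", unfused)
println("fma(a, a, c)    = ", fused)
println("muladd(a, a, c) = ", either)
```

Here the unfused expression gives exactly `0.0` while the fused one gives exactly `2.0^-60`, so a compiler that silently swaps one for the other changes the answer.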
Should the compiler be allowed to change the floating point of an output? If you say no, then it cannot auto-SIMD loops either: reductions like `sum` would get a different value. So I think at some level we have already loosened the idea of imperative programming a bit and we’re using Julia language semantics somewhat declaratively: I meant `sum(x)`, but I didn’t mean precisely what the Julia function is written as; you can modify that as needed to be faster.
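The reduction case can be sketched the same way: a SIMD reduction adds the elements in a different order than a left-to-right loop, and floating point addition is not associative. The names `sum_serial` and `sum_simd` are hypothetical helpers written for this illustration, not anything from Base, and how far the three sums differ depends on the data and the machine.

```julia
# Floating point addition is not associative: the grouping changes the answer.
huge, tiny = 1e100, 1.0
left  = (huge + tiny) - huge   # tiny is absorbed into huge, then huge cancels
right = (huge - huge) + tiny   # huge cancels first, so tiny survives

println("(huge + tiny) - huge = ", left)
println("(huge - huge) + tiny = ", right)

# Strict left-to-right summation, exactly as written.
function sum_serial(x)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    return s
end

# @simd gives the compiler permission to reorder the additions.
function sum_simd(x)
    s = zero(eltype(x))
    @simd for i in eachindex(x)
        @inbounds s += x[i]
    end
    return s
end

x = rand(10_000)
println("serial: ", sum_serial(x))
println("simd:   ", sum_simd(x))
println("Base:   ", sum(x))
```

The three sums agree to rounding error but need not match in the last digits, which is the sense in which `sum(x)` is already treated as a statement of intent.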
But if the user writes `a*b+c`, did they actually mean `muladd(a,b,c)`? That really is treating the code as a declarative spec: you said one very precise thing (multiply and then add), but I’m going to interpret it as “multiply and add”, and I’m going to replace it with something mathematically equivalent because it’s what you really meant, right? Reactant.jl is then taking that to a completely different level: you didn’t mean “multiply the matrix A by B and then add C”, which is what the imperative programming language semantics mean, you meant “`A*B + C`”, and I’m going to replace that with an equivalent mathematical expression that tends to be a better way to evaluate it.
What should a compiler be allowed to do? That’s a very good question. If you have well-defined higher level functions, then maybe the compiler should be allowed to assume not just how the implementation is, but also the “intent” of the user, and be able to swap implementations based on that higher level “intent”. Though that breaks many principles of “you know how it’s computed”; such swapping is generally not done in an imperative programming language, because normally if you write the functions `*` and `+` then it will use them. With Reactant it might just look at the LLVM and think “I know how to do that better”.
Because of this, I think there’s a lot of space to explore here. Not just in the ability to write such compilation passes, but also in the interface to the user. Should it be done by default or opt-in? If you want some passes but not others, can this be fine-grained? It’s hard to tell how this evolves. But at least the safest way to build it out is to make it fully opt-in, which is the Reactant.jl of today. But there have been prototypes of Julia to MLIR, such as Brutus.jl and other things, so one major difference from Python is that it’s actually possible for something like Reactant to become just a standard part of the compiler some day. I think we still need to work out some interface ideas in order to really say that it should.
In the meantime, it’s really easy to opt in and it does well. So that’s the current state, but likely not the last word.