Today I discovered that @fastmath is not applied to inlined function calls (maybe this should be in the docs?). I understand that there is a good reason behind this.
However, I would have expected it to work on expressions returned by macros, and this doesn’t seem to be the case:
macro m_fma(a, b, c)
    return esc(quote $a * $b + $c end)
end
f_fma(a, b, c) = a * b + c
@fastmath fma_macro(a, b, c) = @m_fma(a, b, c)
@fastmath fma_func(a, b, c) = f_fma(a, b, c)
@fastmath fma_direct(a, b, c) = a * b + c
Inspecting the LLVM code generated for each function then gives:
julia> @code_llvm debuginfo=:none fma_macro(1.1, 2.2, 3.3)
define double @julia_fma_macro_246(double %0, double %1, double %2) #0 {
top:
%3 = fmul double %0, %1
%4 = fadd double %3, %2
ret double %4
}
julia> @code_llvm debuginfo=:none fma_func(1.1, 2.2, 3.3)
define double @julia_fma_func_248(double %0, double %1, double %2) #0 {
top:
%3 = fmul double %0, %1
%4 = fadd double %3, %2
ret double %4
}
julia> @code_llvm debuginfo=:none fma_direct(1.1, 2.2, 3.3)
define double @julia_fma_direct_250(double %0, double %1, double %2) #0 {
top:
%3 = fmul fast double %1, %0
%4 = fadd fast double %3, %2
ret double %4
}
@fastmath was applied only to fma_direct, not to fma_macro (nor to fma_func, as expected). I believe this is because @fastmath doesn’t expand macros:
julia> (@macroexpand @m_fma(1.1, 2.2, 3.3)) == (@macroexpand @fastmath @m_fma(1.1, 2.2, 3.3))
true
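The comparison above only looks at the fully expanded result. As a rough sketch of what I think happens one step earlier, expanding only the outermost macro with `macroexpand(...; recursive = false)` should show that @fastmath leaves the nested macro call as it is, while it rewrites the operators of the explicit expression. The variable names `ex` and `direct` are just mine for illustration.

```julia
macro m_fma(a, b, c)
    return esc(quote $a * $b + $c end)
end

# Expand only the outermost macro (@fastmath); nested macro calls are left alone.
ex = macroexpand(@__MODULE__, :(@fastmath @m_fma(a, b, c)); recursive = false)
println(ex)      # expected to still contain the @m_fma call, with no *_fast functions

# For comparison: the same arithmetic written out explicitly.
direct = macroexpand(@__MODULE__, :(@fastmath a * b + c); recursive = false)
println(direct)  # expected to call the Base.FastMath replacements instead of * and +
```

If that is right, @fastmath only rewrites the operators it can see in the expression it receives, and the body of @m_fma is inserted afterwards, untouched.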
This gives rise to irregular behavior, where a function written out explicitly can be more performant than its macro counterpart.
I imagine this behavior is somewhat intended, but is it intuitive?
Since macros inject code at their call site, shouldn’t @fastmath be applied to that code too?
The main problem I currently have with @fastmath is that this behavior is not explicitly stated in the docs. The fact that @fastmath does not propagate is quite an important thing to consider when coding for performance.
I originally ran into this issue while tracking down a performance difference with near-identical C++ code, and it took me quite a while to find the source.
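For anyone hitting the same thing, here is a minimal sketch of a possible workaround: expand the nested macros first, then hand the result to @fastmath. The helper `@fastmath_expanded` and the function name `fma_macro_fast` are hypothetical names of my own, not part of Base. I am also assuming the inner macro escapes its result, as @m_fma does. For macros that rely on hygiene, the expanded operators may end up module-qualified, and @fastmath might then not recognize them.

```julia
# Hypothetical helper (not part of Base): expand nested macros first,
# then apply @fastmath to the already-expanded expression.
macro fastmath_expanded(ex)
    expanded = macroexpand(__module__, ex)  # recursive expansion by default
    return esc(:(@fastmath $expanded))
end

macro m_fma(a, b, c)
    return esc(quote $a * $b + $c end)
end

# Same definition as fma_macro, but using the helper instead of @fastmath.
@fastmath_expanded fma_macro_fast(a, b, c) = @m_fma(a, b, c)

println(fma_macro_fast(1.5, 2.0, 3.0))
```

With this, I would expect `@code_llvm debuginfo=:none fma_macro_fast(1.1, 2.2, 3.3)` to show the same fast flags as fma_direct, though I have only reasoned about the expansion rather than checked it on every Julia version.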