I’ve got a function whose numerical return value changes between Julia versions 1.2 through 1.4 and the 1.5 beta (and master). I think this is actually an LLVM “error” (or a misunderstanding on my part, but one which has an impact at the LLVM optimization stage), though it may be something that only shows up from Julia, where single-instruction fast-math annotations are easy to produce.
So my problem case is demonstrated with the following Julia function:
```julia
function coeff_α(l::Integer, m::Integer)
    lT = convert(Float64, l)
    mT = convert(Float64, m)
    fac1 = (2lT + 1) / ((2lT - 3) * (lT^2 - mT^2))
    fac2 = 4*(lT - 1)^2 - 1
    @fastmath return sqrt(fac1 * fac2)
end
```
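For checking values outside Julia, here is a hypothetical straight Python port of the function (my naming; note one behavioral difference: Python raises `ZeroDivisionError` where Julia's float division yields `±Inf`, e.g. when `l == m` the denominator is zero):

```python
import math

def coeff_alpha(l: int, m: int) -> float:
    # Direct port of the Julia coeff_α above, for checking values by hand.
    lT = float(l)
    mT = float(m)
    fac1 = (2*lT + 1) / ((2*lT - 3) * (lT**2 - mT**2))
    fac2 = 4*(lT - 1)**2 - 1
    # Python has no @fastmath; this is the plain, strict evaluation order.
    return math.sqrt(fac1 * fac2)

print(coeff_alpha(2, 1))  # sqrt((5/3) * 3) ≈ 2.2360679...
```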
The unoptimized LLVM IR shows that only the final multiplication `fac1 * fac2` and the subsequent `sqrt` are marked as fast-math instructions:
```
julia> @code_llvm debuginfo=:none optimize=false coeff_α(1,1)
define double @"julia_coeff_\CE\B1_1317"(i64, i64) {
top:
  %2 = call %jl_value_t*** @julia.ptls_states()
  %3 = bitcast %jl_value_t*** %2 to %jl_value_t**
  %4 = getelementptr inbounds %jl_value_t*, %jl_value_t** %3, i64 4
  %5 = bitcast %jl_value_t** %4 to i64**
  %6 = load i64*, i64** %5
  %7 = sitofp i64 %0 to double
  %8 = sitofp i64 %1 to double
  %9 = fmul double 2.000000e+00, %7
  %10 = fadd double %9, 1.000000e+00
  %11 = fmul double 2.000000e+00, %7
  %12 = fsub double %11, 3.000000e+00
  %13 = fmul double %7, %7
  %14 = fmul double %8, %8
  %15 = fsub double %13, %14
  %16 = fmul double %12, %15
  %17 = fdiv double %10, %16
  %18 = fsub double %7, 1.000000e+00
  %19 = fmul double %18, %18
  %20 = fmul double 4.000000e+00, %19
  %21 = fsub double %20, 1.000000e+00
  %22 = fmul fast double %17, %21
  %23 = call fast double @llvm.sqrt.f64(double %22)
  ret double %23
}
```
For Julia versions 1.2 through 1.4, the optimized LLVM IR continues to mark only those two instructions with the `fast` flag, but starting with Julia 1.5, the earlier division is reordered to later in the function and becomes a `fast` instruction as well:
```
julia> @code_llvm debuginfo=:none optimize=true coeff_α(1,1)
define double @"julia_coeff_\CE\B1_1335"(i64, i64) {
top:
  %2 = sitofp i64 %0 to double
  %3 = sitofp i64 %1 to double
  %4 = fmul double %2, 2.000000e+00
  %5 = fadd double %4, 1.000000e+00
  %6 = fadd double %4, -3.000000e+00
  %7 = fmul double %2, %2
  %8 = fmul double %3, %3
  %9 = fsub double %7, %8
  %10 = fmul double %6, %9
  %11 = fadd double %2, -1.000000e+00
  %12 = fmul double %11, %11
  %13 = fmul double %12, 4.000000e+00
  %14 = fadd double %13, -1.000000e+00
  %15 = fmul fast double %14, %5
  %16 = fdiv fast double %15, %10
  %17 = call fast double @llvm.sqrt.f64(double %16)
  ret double %17
}
```
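The reordering matters numerically because `fast` on the `fmul`/`fdiv` pair permits reassociation: `(a / b) * c` may be rewritten as `(a * c) / b`, and the two orderings can round differently. A minimal Python illustration of the same IEEE-754 effect (the constants are chosen only to expose the rounding difference; nothing here is specific to this function):

```python
a, b, c = 1.0, 49.0, 49.0

div_then_mul = (a / b) * c   # evaluation order of the unoptimized IR: fdiv, then fmul
mul_then_div = (a * c) / b   # reassociated order that the fast flags permit

print(div_then_mul)                   # 0.9999999999999999
print(mul_then_div)                   # 1.0
print(div_then_mul == mul_then_div)   # False
```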
I’ve entered the unoptimized LLVM IR into the godbolt online compiler tool and found that the change corresponds to the fact that Julia 1.4 uses LLVM 8 while Julia 1.5 uses LLVM 9, and the change is reproducible with LLVM’s built-in optimization pipeline.
(If you edit the input IR and remove the `fast` flag from the `fmul` instruction, the two compiler versions also agree, so it seems to be critically linked to the `fmul fast` specifically.)
So my questions are:
- Is this just a misunderstanding on my part about how the `@fastmath` annotations are supposed to work? I expect the two initially-marked instructions to remain the only `fast` instructions in the LLVM IR, but maybe that assumption is wrong and the `fast` annotation is allowed to “grow” under certain circumstances?
- If this is unexpected behavior, how should I go about reporting it?
P.S. I already have a workaround — change `@fastmath sqrt(fac1 * fac2)` to just `@fastmath(sqrt)(fac1 * fac2)` — since my purpose is only to avoid the check for a negative square-root argument (and the subsequent throw), which I know mathematically won’t happen given how the function is called.