That absolutely would explain the regression, and it is a pretty bad performance bug.
Changing this to
@assume_effects :consistent @inline function two_mul(x::Float64, y::Float64)
    if Core.Intrinsics.have_fma(Float64)
        xy = x*y
        return xy, fma(x, y, -xy)
    end
    # return Base.twomul(x,y)
end
and running under Revise seems to mostly fix the regression (~5% slower now compared to 50-100% slower before). Interestingly, the regression doesn’t reappear in the same REPL session if I revert the change. I see the same result from test() on 1.9, on 1.10, and on 1.10 with the change.
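As background on what the FMA branch above computes: xy = x*y is the rounded product, and fma(x, y, -xy) recovers its rounding error exactly, so the pair sums to the true product. A quick standalone illustration (independent of the benchmark):

x, y = 0.1, 0.3
hi = x * y                             # rounded Float64 product
lo = fma(x, y, -hi)                    # exact rounding error of hi
big(x) * big(y) == big(hi) + big(lo)   # true: hi + lo is the exact product

The identity holds for any Float64 inputs that don't overflow or underflow; the commented-out Base.twomul call is the software fallback for CPUs without a native fma.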
julia> versioninfo()
Julia Version 1.10.0-rc1
Commit 5aaa948543 (2023-11-03 07:44 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × AMD Ryzen 7 5700U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver2)
  Threads: 1 on 16 virtual cores
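The test() referenced above isn't shown in the thread; a hypothetical minimal stand-in that exercises the same code path (Float64 x^y) could look like this, using BenchmarkTools:

using BenchmarkTools

# Hypothetical reproduction of the benchmark under discussion: on an affected
# build the reported timing is roughly 2x that of 1.9.
test() = @btime x^y setup=(x=rand(); y=rand())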
It’s weirder than that.
julia> @btime x^y setup=begin x=rand(); y=rand()end;
42.232 ns (0 allocations: 0 bytes)
julia> @eval Base.Math @assume_effects :consistent @inline function two_mul(x::Float64, y::Float64)
           if Core.Intrinsics.have_fma(Float64)
               xy = x*y
               return xy, fma(x, y, -xy)
           end
           return Base.twomul(x,y)
       end
two_mul (generic function with 3 methods)
julia> @btime x^y setup=begin x=rand(); y=rand()end;
17.039 ns (0 allocations: 0 bytes)
Just forcing the compiler to re-evaluate exactly the same definition of two_mul
is enough to make it fast.
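If anyone wants to confirm locally whether the compiled two_mul really takes the FMA branch, something along these lines should work (the exact native listing depends on CPU and Julia build):

using InteractiveUtils   # for @code_native; already loaded in the REPL

Core.Intrinsics.have_fma(Float64)   # expected to be true on a Zen 2 CPU like the one above
# Look for vfmadd-style instructions in the generated code:
@code_native debuginfo=:none Base.Math.two_mul(1.1, 2.2)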
Hey! Thanks everyone for the replies. May I ask whether there was any resolution to this? I keep finding that newer versions of Julia are consistently slower on benchmarks like this (which can get rather expensive for some ODE problems).
Yes, I think this one was addressed by "Fix multiversioning issues caused by the parallel llvm work" by gbaraldi (JuliaLang/julia Pull Request #52194), and it was backported, too.
Please keep reporting any and all performance regressions — it’s very valuable to get that information.
Any specifics?