I’m trying to get the division first, but the Julia code generation puts vdivsd %xmm0, %xmm1, %xmm1
very late:
_sinc_threshold(::Type{Float64}) = 0.001
@inline function _sinc2(x::Union{T,Complex{T}}) where {T<:Union{Float32,Float64}}
    invx = 1.0/x
    a = abs(x)
    if a < _sinc_threshold(T)
        return evalpoly(x^2, (T(1), -T(pi)^2/6, T(pi)^4/120))
    else
        return invx*sinpi(x)
    end
end
@code_native _sinc2(11.4)
My guess is that Julia prefers the much cheaper evalpoly code path, even though it is very unlikely to be taken. Flipping the condition doesn't help. Is there any way to force my first statement to actually be scheduled first?
This at least does something, as you say, on Julia 1.5.1 (with this modification), but not on 1.6, and I’m not sure where to apply it (or how to fix it for 1.6):
julia> @inline function expect(b::Bool)
Core.Intrinsics.llvmcall(("declare i1 @llvm.expect.i1(i1, i1)", """
%b = trunc i8 %0 to i1
%actual = call i1 @llvm.expect.i1(i1 %b, i1 true)
%byte = zext i1 %actual to i8
ret i8 %byte
"""), Bool, Tuple{Bool}, b)
end
expect (generic function with 1 method)
julia> expect(true)
ERROR: Module IR does not contain specified entry function
Stacktrace:
[1] expect(b::Bool)
@ Main ./REPL[112]:2
[2] top-level scope
@ REPL[113]:1
Anyway, ifelse seemed to work, although the division wasn’t the exact first instruction (maybe a good choice). It’s good to know, even though it didn’t really help me optimize in this case.
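For reference, the ifelse variant mentioned above could look roughly like the sketch below. The names _sinc2_ifelse and SINC_THRESHOLD are hypothetical and not from the original post, and it is restricted to Float64 for brevity. It also divides by pi*x rather than x, an assumption made here so that the large-argument branch agrees with the Taylor polynomial. Because ifelse is an ordinary function, both arguments are computed before the selection, so the division cannot be sunk into a rarely taken branch; where the compiler then schedules it is still up to LLVM.

```julia
# Hypothetical threshold constant (not from the original post).
const SINC_THRESHOLD = 0.001

# Branch-free sketch: both candidate results are computed, then one is selected.
@inline function _sinc2_ifelse(x::Float64)
    invx = 1.0 / (pi * x)   # assumption: normalized sinc, hence pi*x
    # Taylor series of sin(pi*x)/(pi*x) around 0, evaluated in x^2
    poly = evalpoly(x^2, (1.0, -pi^2 / 6, pi^4 / 120))
    # invx * sinpi(x) is NaN at x == 0, but ifelse discards it there
    return ifelse(abs(x) < SINC_THRESHOLD, poly, invx * sinpi(x))
end
```

Running @code_native _sinc2_ifelse(11.4) shows where the division ends up. The trade-off is that the polynomial is always evaluated, even when its result is thrown away.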
Is this effective (i.e., does it change the machine code in a meaningful way)? I think I’ve read somewhere (probably in Julia’s issue tracker) that the optimization pass for this is not enabled in Julia.
@inline function expect(b::Bool)
Base.llvmcall((" declare i1 @llvm.expect.i1(i1, i1)\n\n define i8 @entry(i8) alwaysinline {\n top:\n %b = trunc i8 %0 to i1\n%actual = call i1 @llvm.expect.i1(i1 %b, i1 true)\n
%byte = zext i1 %actual to i8\nret i8 %byte\n }\n", "entry"), Bool, Tuple{Bool}, b)
end
@tkf
That would explain why I hadn’t noticed it making any real difference.
Perhaps it does through preventing optimizations, but I wouldn’t exactly call that desirable in general.
Since Julia is a dynamic language, I’d imagine people are naturally more conscious about putting the most frequently hit (most likely) branch at the top. Besides, Julia doesn’t have a switch/case statement anyway.