Another thread reminded me of this and made me wonder whether @btime is the problem here. Unlike that example, though, timing a loop with @time corroborates the 3-4x difference that @btime reports, rather than the matching @code_native/@code_llvm output:
julia> @time for i in 1:10_000
           f(i÷i*100) # prevent constant hoisting
       end
  0.181906 seconds (20.00 k allocations: 312.500 KiB)

julia> @time for i in 1:10_000
           f2(i÷i*100) # prevent constant hoisting
       end
  0.056144 seconds (20.00 k allocations: 312.500 KiB)

julia> 0.181906/0.056144
3.239990025648333
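
The 20.00 k allocations in both runs most likely come from looping at global scope, where `i` is an untyped global. As a cross-check, here is a minimal sketch of the same comparison with the loop moved into a function. The `f` and `f2` defined here are hypothetical stand-ins (the real ones are defined earlier in the thread), and `timeloop` is a helper name I made up. One caveat: inside a function the compiler may be able to simplify `i÷i`, so the hoisting guard could be weaker than it is at global scope.

```julia
# Hypothetical stand-ins: the real f and f2 are defined earlier in the thread.
f(x) = sum(sqrt(k) for k in 1:x)
f2(x) = sum(sqrt, 1:x)

# Run g n times inside a function so the loop variable is not an untyped global.
function timeloop(g, n)
    acc = 0.0
    for i in 1:n
        acc += g(i ÷ i * 100)  # same argument expression as the REPL loops
    end
    return acc  # returning the sum keeps the calls from being optimized away
end

timeloop(f, 1); timeloop(f2, 1)    # warm up so compilation is not timed
t_f  = @elapsed timeloop(f, 10_000)
t_f2 = @elapsed timeloop(f2, 10_000)
println("f: ", t_f, " s, f2: ", t_f2, " s, ratio: ", t_f / t_f2)
```

If the ratio printed here roughly matches the one from the global-scope loops, the difference is probably real and not an artifact of how the loop was timed.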