`ifelse(x<y, x, y)` much faster than `min()`

julia> using Chairmarks

julia> function g(dij)
    vmin = Inf
    for x in dij
        vmin = min(x, vmin)
    end
    return vmin
end

julia> function f(dij)
    vmin = Inf
    for x in dij
        vmin = ifelse(x < vmin, x, vmin)
    end
    return vmin
end

julia> a = rand(512);

julia> @be g($a)
Benchmark: 2864 samples with 8 evaluations
 min    3.591 μs
 median 3.823 μs
 mean   3.800 μs
 max    7.078 μs

julia> @be f($a)
Benchmark: 2959 samples with 69 evaluations
 min    428.188 ns
 median 428.478 ns
 mean   430.873 ns
 max    1.153 μs

is this expected?

3 Likes

I get very different results on aarch64, where we use llvm fmin and fmax:

julia> @be g($a)
Benchmark: 3758 samples with 637 evaluations
 min    34.733 ns
 median 34.929 ns
 mean   36.422 ns
 max    73.391 ns

julia> @be f($a)
Benchmark: 4406 samples with 30 evaluations
 min    600.000 ns
 median 631.950 ns
 mean   679.569 ns
 max    2.093 μs

What happens if you try to do the same on your machine (which I presume is x86_64)?

Timings seem to be all over the place. This is in aarm64 (M3):

julia> @be f($(x))
Benchmark: 4028 samples with 45 evaluations
 min    455.556 ns
 median 490.733 ns
 mean   510.856 ns
 max    822.222 ns

julia> @be g($(x))
Benchmark: 3844 samples with 103 evaluations
 min    217.233 ns
 median 228.553 ns
 mean   234.833 ns
 max    424.757 ns

Your g seems to be still faster than f? What’s your input? You seem to be using at least a different name for the input (x instead of a) than what shown in the original post. Also, what version of Julia are you using? The code I showed above was introduced in Julia v1.10 with Use native fmin/fmax in aarch64 by gbaraldi · Pull Request #47814 · JuliaLang/julia · GitHub, if you have an older version of Julia you shouldn’t seem the same speedup.

Notice that @gragusa’s results are also against the advantage of ifelse over min. In that case, f (the version with ifelse) is reported first, and shows longer times.

1 Like

ok so looks like on aarch64, min() is faster than ifelse(), while on x86_64 (AMD 7840HS) it’s the other way around

But what if you use llvm min/max on x86_64?

Which is what I said? But I thought that their point was that their g wasn’t as fast as mine, and I tried to suggest where to look at (for example Julia version).

julia> function g_llvm(dij)
           vmin = Inf
           for x in dij
               vmin = llvm_min(x, vmin)
           end
           return vmin
       end
g_llvm (generic function with 1 method)

julia> Base.@assume_effects :total @inline llvm_min(x::Float64, y::Float64) = ccall("llvm.minimum.f64", llvmcall, Float64, (Float64, Float64), x, y)
llvm_min (generic function with 1 method)

julia> @be g_llvm($a)
LLVM ERROR: Cannot select: 0xe841c70: f64 = fminimum 0xe842610, 0xe842760, REPL[2]:1 @[ REPL[1]:4 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ]
  0xe842610: f64,ch = load<(load (s64) from %ir.uglygep, !tbaa !51, !alias.scope !54, !noalias !55)> 0xeb3e8c0, 0xe841d50, undef:i64, essentials.jl:917 @[ array.jl:891 @[ REPL[1]:5 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ] ]
    0xe841d50: i64 = add 0xe842140, 0xe841ff0, essentials.jl:917 @[ array.jl:891 @[ REPL[1]:5 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ] ]
      0xe842140: i64,ch = CopyFromReg 0xeb3e8c0, Register:i64 %2, essentials.jl:917 @[ array.jl:891 @[ REPL[1]:5 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ] ]
        0xe842840: i64 = Register %2
      0xe841ff0: i64 = shl 0xe841ea0, Constant:i8<3>, essentials.jl:917 @[ array.jl:891 @[ REPL[1]:5 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ] ]
        0xe841ea0: i64,ch = CopyFromReg 0xeb3e8c0, Register:i64 %4, essentials.jl:917 @[ array.jl:891 @[ REPL[1]:5 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ] ]
          0xe842530: i64 = Register %4
        0xe841960: i8 = Constant<3>
    0xe841f80: i64 = undef
  0xe842760: f64,ch = CopyFromReg 0xeb3e8c0, Register:f64 %5, REPL[2]:1 @[ REPL[1]:4 @[ /home/akako/.julia/packages/Chairmarks/msbgo/src/macro_tools.jl:52 ] ]
    0xe841f10: f64 = Register %5
In function: julia_#3_3347

you can’t

Uh, I thought I had tried it at some point and it worked :thinking:

Ah, I think I tried llvm.minnum.f64, that’s what clang uses: Compiler Explorer. This reminded me I had asked a question about llvm.minnum vs llvm.minimum on slack some time ago.