Branch in dispatch on values

This has probably been discussed elsewhere, but I couldn’t find it. Why do the runtimes of the functions foo and bar below differ so much? Since w is constrained to be a Bool, I didn’t expect to see such a huge difference.

Consider:

function foo(x, w::Bool=true)
    _foo(x, Val(w))
end

_foo(x, ::Val{true}) = sum(exp.(x))
_foo(x, ::Val{false}) = sum(exp.(-x))

using BenchmarkTools  # provides @btime

x = rand(10);

@btime foo($x)
@btime foo($x, true)
@btime foo($x, false)

  4.489 μs (1 allocation: 160 bytes)
  4.431 μs (1 allocation: 160 bytes)
  4.603 μs (2 allocations: 320 bytes)
5.962627209972655

Compare with:

function bar(x, w::Bool=true)
    if w
        return _foo(x, Val(true))
    else
        return _foo(x, Val(false))
    end
end

@btime bar($x)
@btime bar($x, true)
@btime bar($x, false)

  83.225 ns (1 allocation: 160 bytes)
  83.142 ns (1 allocation: 160 bytes)
  121.367 ns (2 allocations: 320 bytes)
5.962627209972655

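bar is type stable: each branch calls _foo with a Val whose type is known at compile time, so no dispatch happens at run time, only an if statement. In foo, the type of Val(w) depends on the run-time value of w, so the call to _foo is dispatched dynamically, which is slower. In theory the compiler could eventually handle this case, but it’s hard to do in general.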
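One way to check this yourself is to look at what the compiler infers for each function. The sketch below re-declares the functions from the question in compact form and calls `@code_warntype` from the InteractiveUtils standard library. The printed output depends on the Julia version, so none is shown here; the thing to look for is the inferred type of the `Val` argument passed to `_foo` in each body.

```julia
using InteractiveUtils  # provides @code_warntype (loaded automatically in the REPL)

_foo(x, ::Val{true}) = sum(exp.(x))
_foo(x, ::Val{false}) = sum(exp.(-x))

# Val(w) builds a Val{w}, so its type depends on the run-time value of w.
foo(x, w::Bool=true) = _foo(x, Val(w))

# Each branch passes a literal, so the Val type is known at compile time.
bar(x, w::Bool=true) = w ? _foo(x, Val(true)) : _foo(x, Val(false))

x = rand(10)

# Both versions compute the same values; only the call mechanism differs.
@assert foo(x, true) == bar(x, true)
@assert foo(x, false) == bar(x, false)

# Compare the inferred type of the Val argument in the two function bodies.
@code_warntype foo(x, true)
@code_warntype bar(x, true)
```

If the `Val` argument in foo shows up as a non-concrete type while bar only shows `Val{true}` and `Val{false}`, that matches the explanation above.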


There isn’t a special case for that at the moment.
