This discussion may be of interest.
I haven’t played around with union dispatching in 0.7 yet. On an 8-day-old master, with f defined as in the 0.6.2 session further down:
julia> using BenchmarkTools, Random
julia> x = randn(20);
julia> u = Vector{Union{Float64, Float32}}(x);
julia> @benchmark exp($x[1])
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.207 ns (0.00% GC)
median time: 4.228 ns (0.00% GC)
mean time: 4.286 ns (0.00% GC)
maximum time: 19.357 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark exp($u[1])
BenchmarkTools.Trial:
memory estimate: 32 bytes
allocs estimate: 2
--------------
minimum time: 24.061 ns (0.00% GC)
median time: 30.399 ns (0.00% GC)
mean time: 36.888 ns (15.74% GC)
maximum time: 38.634 μs (99.92% GC)
--------------
samples: 10000
evals/sample: 996
julia> @benchmark f($x[1])
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 1.212 ns (0.00% GC)
median time: 1.213 ns (0.00% GC)
mean time: 1.221 ns (0.00% GC)
maximum time: 14.737 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark f($u[1])
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.077 ns (0.00% GC)
median time: 4.107 ns (0.00% GC)
mean time: 4.107 ns (0.00% GC)
maximum time: 19.216 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
Versus 0.6.2:
julia> using BenchmarkTools
julia> x = randn(20);
julia> u = Vector{Union{Float64, Float32}}(x);
julia> @benchmark exp($x[1])
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 7.582 ns (0.00% GC)
median time: 7.753 ns (0.00% GC)
mean time: 7.932 ns (0.00% GC)
maximum time: 20.750 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark exp($u[1])
BenchmarkTools.Trial:
memory estimate: 16 bytes
allocs estimate: 1
--------------
minimum time: 25.851 ns (0.00% GC)
median time: 26.194 ns (0.00% GC)
mean time: 27.828 ns (2.21% GC)
maximum time: 965.944 ns (93.62% GC)
--------------
samples: 10000
evals/sample: 996
julia> f(x::Float64) = 2x
f (generic function with 1 method)
julia> f(x::Float32) = 2+x
f (generic function with 2 methods)
julia> @benchmark f($x[1])
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 1.442 ns (0.00% GC)
median time: 1.463 ns (0.00% GC)
mean time: 1.466 ns (0.00% GC)
maximum time: 16.591 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark f($u[1])
BenchmarkTools.Trial:
memory estimate: 16 bytes
allocs estimate: 1
--------------
minimum time: 11.675 ns (0.00% GC)
median time: 12.719 ns (0.00% GC)
mean time: 13.708 ns (4.71% GC)
maximum time: 999.245 ns (95.45% GC)
--------------
samples: 10000
evals/sample: 998
For comparison, the cost of an if statement is close to 1 ns. With if statements, you also don’t have to worry about squashing the type instability that can result from dynamic dispatch.
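To make the if-statement alternative concrete, here is a minimal sketch of branching on the element type by hand instead of relying on dispatch over the union. The two f methods are copied from the benchmark session; f_branch is a hypothetical helper name used only for illustration, and the sketch assumes a Julia version where Vector{Union{Float64, Float32}}(x) works as in the session above.

```julia
# Same two methods as in the benchmarks above.
f(x::Float64) = 2x
f(x::Float32) = 2 + x

# Hypothetical helper: branch on the runtime type explicitly.
# Inside each arm the argument has a single concrete type,
# so the call to f can be resolved without a dynamic dispatch.
function f_branch(v::Union{Float64, Float32})
    if v isa Float64
        return f(v)   # v is a Float64 in this arm
    else
        return f(v)   # v is a Float32 in this arm
    end
end

x = randn(20)
u = Vector{Union{Float64, Float32}}(x)
y = f_branch(u[1])
```

Benchmarking f_branch($u[1]) against f($u[1]) on your own version is the way to check whether the explicit branch actually buys anything there.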
If there’s some sort of pattern that lets you use Base.Cartesian.@nif, you could still be relatively concise with the control flow.
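As a rough illustration of what that could look like, the sketch below uses @nif to generate an if/elseif/else chain over three types. It assumes the names follow an index pattern: the constants T_1, T_2, T_3 and the functions g_1, g_2, g_3 are hypothetical and exist only so the macro can substitute the index d into T_d and g_d.

```julia
using Base.Cartesian: @nif

# Hypothetical kernels, named by index so that g_d can be filled in by the macro.
g_1(x::Float64) = 2x
g_2(x::Float32) = 2 + x
g_3(x::Int)     = x + 1

# Hypothetical type constants following the same index pattern.
const T_1 = Float64
const T_2 = Float32
const T_3 = Int

function g_branch(x::Union{Float64, Float32, Int})
    # Expands to roughly:
    #   if x isa T_1
    #       g_1(x)
    #   elseif x isa T_2
    #       g_2(x)
    #   else
    #       g_3(x)
    #   end
    @nif 3 d->(x isa T_d) d->(g_d(x))
end
```

Note that the last index is used for the else arm without testing its condition, so the argument annotation on g_branch is what guarantees that only an Int reaches g_3.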