MWE (tested on Julia 1.11.5):
julia> foo1(::Val{N}) where {N} = N
foo1 (generic function with 1 method)
julia> foo2(n::Int) = Val(n)
foo2 (generic function with 1 method)
julia> a = Val.(rand(Int, 1000));
julia> using BenchmarkTools
julia> @benchmark for i in $a
           i |> foo1 |> foo2
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 267.700 μs … 2.704 ms ┊ GC (min … max): 0.00% … 88.68%
Time (median): 280.400 μs ┊ GC (median): 0.00%
Time (mean ± σ): 284.462 μs ± 51.782 μs ┊ GC (mean ± σ): 0.67% ± 3.25%
▄██▅▁▂▁▁
▁▁▁▁▁▁▁▂▄▆▇█████████▄▄▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
268 μs Histogram: frequency by time 320 μs <
Memory estimate: 46.88 KiB, allocs estimate: 2000.
julia> @benchmark for i in $a
           foo1(i) |> foo2
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 262.300 μs … 3.058 ms ┊ GC (min … max): 0.00% … 89.87%
Time (median): 274.600 μs ┊ GC (median): 0.00%
Time (mean ± σ): 278.443 μs ± 62.050 μs ┊ GC (mean ± σ): 0.69% ± 2.79%
▂▃█▆▆▃
▂▁▁▁▁▂▂▃▃▃▃▄▅▆████████▇▅▄▄▄▆▆▆▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
262 μs Histogram: frequency by time 301 μs <
Memory estimate: 46.88 KiB, allocs estimate: 2000.
julia> @benchmark for i in $a
           foo2(i |> foo1)
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 152.900 μs … 2.155 ms ┊ GC (min … max): 0.00% … 90.02%
Time (median): 157.700 μs ┊ GC (median): 0.00%
Time (mean ± σ): 160.376 μs ± 44.962 μs ┊ GC (mean ± σ): 0.65% ± 2.20%
▁▂▆█▄▇▂▂
▂▂▂▃▃▄▄▅████████▇▇▅▄▄▃▃▃▃▃▂▃▃▃▃▃▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂ ▃
153 μs Histogram: frequency by time 174 μs <
Memory estimate: 31.25 KiB, allocs estimate: 1000.
julia> @benchmark for i in $a
           foo2(foo1(i))
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 150.100 μs … 2.691 ms ┊ GC (min … max): 0.00% … 92.80%
Time (median): 157.000 μs ┊ GC (median): 0.00%
Time (mean ± σ): 159.191 μs ± 43.401 μs ┊ GC (mean ± σ): 0.57% ± 2.02%
▇█▄▃▁
▁▁▁▁▂▂▂▃▅██████▅▅▃▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
150 μs Histogram: frequency by time 184 μs <
Memory estimate: 31.25 KiB, allocs estimate: 1000.
julia> @benchmark for i in $a
           i |> (foo2∘foo1)
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 56.100 μs … 403.200 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 57.700 μs ┊ GC (median): 0.00%
Time (mean ± σ): 58.965 μs ± 6.902 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▂█▄▅
▁▂▃████▇▄▄▃▂▂▂▂▃▃▂▂▂▁▂▁▂▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
56.1 μs Histogram: frequency by time 72 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
It’s not surprising that (foo2∘foo1) gave the best performance: when foo1 and foo2 are compiled together, the compiler can eliminate some of the internal conversion. However, it’s still not as fast as
julia> @benchmark for i in $a
           i |> identity
       end
BenchmarkTools.Trial: 10000 samples with 626 evaluations per sample.
Range (min … max): 187.859 ns … 389.457 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 188.498 ns ┊ GC (median): 0.00%
Time (mean ± σ): 191.519 ns ± 8.910 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃█▇▂▂▂▄▃▃▁ ▁▂▁ ▄▆▅ ▁▂▂▁ ▁ ▂
███████████▇███▇▇▇▇▅▅▅▄█████▆▇▆▇▇▇████▆▆▆▁▄▄▁▁▄▅▆▅▅▇▆▇██▅▁▁▇▇ █
188 ns Histogram: log(frequency) by time 205 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
I assume the reason is that currently we don’t have the ability to annotate non-Type parameters in a parametric type:
julia> typeof(a)
Vector{Val} # not something like `Vector{Val{::Int}}`
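To make the element-type point concrete, here is a small sketch using only Base functions. It uses a short deterministic vector `Val.(1:3)` instead of the random one from the MWE (an assumption for reproducibility). The unparameterized `Val` is a `UnionAll` rather than a concrete type, so the compiler cannot know which `Val{N}` each element holds, even though every individual element has a concrete type.

```julia
# Deterministic stand-in for the random vector in the MWE.
a = Val.(1:3)

# The element type is the unparameterized `Val`, which is not concrete.
println(eltype(a))
println(isconcretetype(eltype(a)))

# Each individual element, however, has a concrete type `Val{N}`.
println(isconcretetype(typeof(a[1])))
```

With a non-concrete element type, a call on an element inside the loop presumably has to be resolved at run time, which is consistent with the timings above.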
Still, I wonder whether performance discrepancies as large as the ones shown here, among what look like mere implementation details, are expected. And is it possible for these gaps to be eliminated in the future? Thanks!!
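For reference, here is a minimal sketch of one way to look at the gap between the fused and the split calls. The helper `foo12` is hypothetical (it is not part of the MWE) and only stands in for what `foo2∘foo1` might compile to. The idea, as far as I understand it, is that inside a method specialized on `Val{N}` the value `N` is a compile-time constant, whereas `foo2` called on its own only sees a run-time `Int`.

```julia
foo1(::Val{N}) where {N} = N
foo2(n::Int) = Val(n)

# Hypothetical fused helper, for illustration only (not in the original MWE).
foo12(v::Val{N}) where {N} = foo2(foo1(v))

# Inferred return type when the argument type is a concrete Val{3}.
fused_rt = Base.return_types(foo12, (Val{3},))

# Inferred return type of foo2 when only the type Int is known.
split_rt = Base.return_types(foo2, (Int,))

println(fused_rt)
println(split_rt)
```

If the fused call infers to a concrete `Val{N}` while the split call does not, that would explain why the split variants allocate on every iteration and the composed one does not.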