Performance of typed keyword arguments


#1

Hello, I have tried this code and I am really confused about why the keyword version is so slow, even when I annotate the keyword argument with a type.

using BenchmarkTools

f1(x) = exp(x)
f2(x::Number) = exp(x)
f3(x::Float64) = exp(x)

v1(;x=2.5) = exp(x)
v2(;x::Number=2.5) = exp(x)
v3(;x::Float64=2.5) = exp(x)

@btime f1(2.5)
@btime f2(2.5)
@btime f3(2.5)

@btime v1(x = 2.5)
@btime v2(x = 2.5)
@btime v3(x = 2.5)

This gives me the following on version 0.6.2:

  54.172 ns (0 allocations: 0 bytes)
  54.172 ns (0 allocations: 0 bytes)
  54.172 ns (0 allocations: 0 bytes)

  259.656 ns (2 allocations: 112 bytes)
  430.980 ns (2 allocations: 112 bytes)
  175.290 ns (1 allocation: 96 bytes)

and these are the results from 0.7.0-alpha:

  2.793 ns (0 allocations: 0 bytes)
  2.793 ns (0 allocations: 0 bytes)
  2.793 ns (0 allocations: 0 bytes)

  52.416 ns (0 allocations: 0 bytes)
  52.416 ns (0 allocations: 0 bytes)
  52.416 ns (0 allocations: 0 bytes)

So the new keyword arguments based on named tuples perform approximately as fast as the old positional arguments did on 0.6.2; however, the positional version on 0.7.0-alpha is still much faster…

Can someone explain this behavior to me? Does this mean I always have to use positional arguments instead of keyword arguments in performance-critical code?


#2

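I think the 0.7 benchmarks for the f-functions are so fast because of the compiler's new constant propagation feature: with a literal argument, the result of exp(2.5) can be computed at compile time, so there is almost nothing left to measure. If you instead interpolate a variable: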

julia> a = 2.5
2.5

julia> f1(x) = exp(x)
f1 (generic function with 1 method)

julia> @btime f1(2.5);
  1.686 ns (0 allocations: 0 bytes)

julia> @btime f1($a);
  11.654 ns (0 allocations: 0 bytes)

The 11.6 ns is in line with what I get on 0.6, and it is also the same as what I get for keywords on 0.7:

julia> v1(;x=2.5) = exp(x)
v1 (generic function with 1 method)

julia> @btime v1(x=$a);
  11.652 ns (0 allocations: 0 bytes)

So, keywords are as fast as positional arguments in 0.7 (at least for this test). However, the constant propagation that makes f1(2.5) even faster does not seem to work for keywords.
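
For anyone who wants to rerun the comparison on equal footing, here is a minimal sketch that collects the variants in one script. It assumes BenchmarkTools is installed. The `$(Ref(a))[]` form is the idiom suggested in the BenchmarkTools manual for keeping the compiler from treating the input as a known constant; the variable names (`t_literal`, `t_interp`, and so on) are only for illustration. I am not quoting numbers here, since they depend on the machine and the Julia version.

```julia
using BenchmarkTools

f1(x) = exp(x)          # positional argument
v1(; x = 2.5) = exp(x)  # keyword argument

a = 2.5

# Literal argument: the compiler may fold exp(2.5) into a constant,
# so this can end up measuring almost nothing.
t_literal = @belapsed f1(2.5)

# Interpolated value: passed into the benchmark as an argument.
t_interp = @belapsed f1($a)

# Value loaded from a Ref at run time, which hides it from constant folding.
t_ref = @belapsed f1($(Ref(a))[])

# The same two interpolation styles for the keyword version.
t_kw = @belapsed v1(x = $a)
t_kw_ref = @belapsed v1(x = $(Ref(a))[])

# @belapsed returns seconds; print nanoseconds for readability.
for (label, t) in (("f1(2.5)", t_literal),
                   ("f1(\$a)", t_interp),
                   ("f1(Ref)", t_ref),
                   ("v1(x=\$a)", t_kw),
                   ("v1(x=Ref)", t_kw_ref))
    println(rpad(label, 12), round(t * 1e9, digits = 3), " ns")
end
```

If the positional and keyword timings with interpolated or Ref-wrapped inputs come out close to each other, the gap in the original post is down to constant folding of the literal rather than to keyword-argument overhead.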


#3

Ah, OK, that is a good explanation. It would be nice to know why constant propagation does not work for keyword arguments, though.


#4

Results from 0.7-alpha on Windows 10:

@btime f1(2.5)
@btime f2(2.5)
@btime f3(2.5)

@btime v1(x = 2.5)
@btime v2(x = 2.5)
@btime v3(x = 2.5)

  1.282 ns (0 allocations: 0 bytes)
  1.282 ns (0 allocations: 0 bytes)
  1.282 ns (0 allocations: 0 bytes)
  9.237 ns (0 allocations: 0 bytes)
  9.237 ns (0 allocations: 0 bytes)
  9.237 ns (0 allocations: 0 bytes)