Performance difference between optional args and keyword args

Is there any (theoretical) difference in performance between functions defined with optional arguments and functions defined with keyword arguments? The manual says:

Keyword arguments behave quite differently from ordinary positional arguments. In particular, they do not participate in method dispatch. Methods are dispatched based only on positional arguments, with keyword arguments processed after the matching method is identified.

Does that imply a performance hit? Here’s an experiment:

using BenchmarkTools

const x = rand(10_000, 10_000)
const y = rand(10_000, 10_000)

function f_opt(x, y=y)
    sum((x .> y) .+ y)
end

function f_kw(x; y=y)
    sum((x .> y) .+ y)
end

@benchmark f_opt(x)
BenchmarkTools.Trial: 
  memory estimate:  762.94 MiB
  allocs estimate:  2
  --------------
  minimum time:     452.869 ms (0.00% GC)
  median time:      493.875 ms (4.59% GC)
  mean time:        509.880 ms (8.53% GC)
  maximum time:     566.466 ms (17.77% GC)
  --------------
  samples:          10
  evals/sample:     1

@benchmark f_kw(x)
BenchmarkTools.Trial: 
  memory estimate:  762.94 MiB
  allocs estimate:  2
  --------------
  minimum time:     456.765 ms (0.00% GC)
  median time:      491.881 ms (4.25% GC)
  mean time:        507.576 ms (8.43% GC)
  maximum time:     565.860 ms (17.08% GC)
  --------------
  samples:          10
  evals/sample:     1

It seems that keywords dominate optional arguments by a little. I find that strange, hence the question.

I think what you are seeing is no significant difference. In my machine, the keyword version is slightly slower:

julia> @benchmark f_opt(x)
BenchmarkTools.Trial: 
  memory estimate:  762.94 MiB
  allocs estimate:  2
  --------------
  minimum time:     326.368 ms (0.00% GC)
  median time:      341.366 ms (0.00% GC)
  mean time:        350.712 ms (5.19% GC)
  maximum time:     382.587 ms (10.87% GC)
  --------------
  samples:          15
  evals/sample:     1

julia> @benchmark f_kw(x)
BenchmarkTools.Trial: 
  memory estimate:  762.94 MiB
  allocs estimate:  2
  --------------
  minimum time:     329.881 ms (0.00% GC)
  median time:      359.059 ms (3.19% GC)
  mean time:        366.853 ms (5.59% GC)
  maximum time:     414.339 ms (10.14% GC)
  --------------
  samples:          14
  evals/sample:     1

I think a considerable difference is only felt for functions doing much less effort, but I did some tests with unitary arrays and also just numbers, and the difference seems to be always small and slightly in favor of the positional alternative. I am using Julia 1.5.3.

I was thinking that the difference could be due to the GC. Also I struggled to come up with examples. Can you provide yours? Oh, and I’m using Julia 1.6.0.

I think good performance comes from specialization, i.e. compiling specialized code based on the type of the argument, which is separate from dispatch. (To see that it’s separate, consider a function with a single method, eg f(x) = x^2; it still needs to compile different code for Float64’s and matrices, say, even though there’s just one method).

My understanding is Julia still specializes on keyword arguments although it does not dispatch on them.

1 Like

So in theory there shouldn’t be much difference between f_opt and f_fw.

Yes, I think that’s right. Sometimes Julia uses heuristics to decide when to specialize (since more specialization leads to more compilation time and sometimes it’s not needed), and I’m not totally sure how that works / interacts with keyword arguments. But I think generally speaking they shouldn’t be slower than positional arguments.

This is correct but keyword arguments until recently had some limitations that made the compiler infer worse what is related, see: Add keyword argument constant propogation to News by ChrisRackauckas · Pull Request #36292 · JuliaLang/julia · GitHub

I am not sure of that. If I remember correctly, there was a kw_sorter generic function that had to be called receiving the NamedTuple arguments to fill in the defaults and then pass to the under-the-hood method that has no keyword arguments (only positional ones). I think that for larger lists of keyword parameters in which some of them are explicit and other are missing, the function may take a performance hit (but if the function makes any considerable effort this ends up being a little overhead).

4 Likes

Hmm, if I wanted to extract every bit of performance, then I should use optional arguments.

Yes, but I would not sweat it too much. If your function does operations over entire vectors any overhead in the argument passing will probably be too little to worry about. I would just take a more little care if all the arguments are scalars.

3 Likes