Performance implications of computations in keyword arguments

Long story short, I’m putting my entire function into default values of keyword arguments, and I want to know if this is bad.

I’m working on a Julia package to solve PDEs. The “driver code” of such solvers often has a zoo of options and customizations. I’ve taken to using a somewhat unusual design pattern and I’m wondering if there are any implications in terms of performance, precompilation, or whatever. Let’s say that you want to make a function that computes the first few terms of the Fibonacci sequence.

function fib(;x0 = 1,
    x1 = 1,
    x2 = x0+x1,
    x3 = x1+x2,
    x4 = x2+x3,
    x5 = x3+x4)
    return [x0,x1,x2,x3,x4,x5]
end

So now I can do fib() to get a default behavior, or fib(;x2=5) to get some strange variant. I can also cascade these functions to provide increasing levels of complexity:

fib_variant(;args...) = fib(x3=9;args...)

In other words, the body of my function is almost nothing and most of my computation is done in keyword parameters. This way, people can start using my code without having to understand all the parameters, just by doing fib(), and as they get deeper into whatever problem, they can modify the behavior of my function in very general ways.

Is this bad in some way?



It depends on your goal. If you plan to use this function in a hot, inner loop where it is called millions of times, then it is better to avoid keyword arguments because they can introduce additional memory allocations and execution time compared to positional arguments.

In addition, you return an array, which allocates.

But in many cases readability is more important than performance.

Always good to benchmark your functions:

julia> using BenchmarkTools

julia> @btime fib()
  12.817 ns (2 allocations: 112 bytes)
6-element Vector{Int64}:
 1
 1
 2
 3
 5
 8

How? Calls with keyword arguments are lowered to a Core.kwcall call with a positional NamedTuple argument; e.g. blah(Bool, 2; a=Int, b=4) lowers to Core.kwcall((a=Int, b=4), blah, Bool, 2). That’s just as inferrable and specializable as any other call with positional arguments, except that keyword arguments annotated with Type{T} in the method will lose information as a ::DataType field in the NamedTuple.
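If you want to check that for yourself, here is a small sketch (blah is just a stand-in name): define a function with keyword arguments and inspect the lowering with Meta.@lower.

blah(x, y; a = Int, b = 4) = (x, y, a, b)   # hypothetical example function

# Meta.@lower prints the lowered form of a call expression; on recent Julia
# versions the keyword call shows up as Core.kwcall((a = Int, b = 4), blah, Bool, 2).
Meta.@lower blah(Bool, 2; a = Int, b = 4)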

Function call lowering does accept keyword arguments before the ;, but that’s bad style. For clarity, either don’t use ; in calls at all, or put all of your keyword arguments after the ;, the same way ; separates positional and keyword arguments in method definitions. The flip side is that bare variables (not literals or other expressions) after the ; are idiomatic shorthand and are treated as keyword arguments of the same name: f(;c) is treated as f(;c=c), but f(;2c) is an error.
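To make those syntax rules concrete, a small illustration (f and its keyword names are placeholders):

f(; c = 0, d = 0) = (c, d)

c = 1
f(c = 1, d = 2)    # no ; in the call: fine
f(; c = 1, d = 2)  # all keywords after the ; : also fine
f(c = 1; d = 2)    # accepted, but a keyword before the ; is considered bad style
f(; c)             # treated as f(; c = c), because c is a bare variable
# f(; 2c)          # error: only bare variable names get this implicit treatment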


One way in which this pattern is bad is high stack frame depth. This has several negative implications, including:

  • ugly stack traces

  • the compiler might be more likely to give up on inference in the presence of recursion

There’s nothing wrong with exposing keyword arguments to the user, but IMO it’s better to only accept kwargs at the top level of the interface, then store the kwargs in some object, e.g. a NamedTuple, which can then be passed on to your internal functions. EDIT: just looked into it, this is already basically what kwcall does.
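For concreteness, a minimal sketch of that pattern with made-up names (solve, _solve_inner, and the parameters are hypothetical):

function solve(; tol = 1e-6, maxiter = 100, verbose = false)
    params = (tol = tol, maxiter = maxiter, verbose = verbose)  # NamedTuple holding the options
    return _solve_inner(params)
end

function _solve_inner(params)
    for i in 1:params.maxiter
        params.verbose && println("iteration ", i)
        # ... do the actual work, stop when the residual drops below params.tol ...
    end
    return params
end

solve(tol = 1e-8)   # the user only overrides what they care about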

Kwargs do have extra costs for the compiler, but, when all goes well, everything should get optimized out.


Thanks for the replies. I hadn’t thought of the run-time cost, I just ran a quick test:

function fib2()
    x0 = 1
    x1 = 1
    x2 = x0+x1
    x3 = x1+x2
    x4 = x2+x3
    x5 = x3+x4
    return [x0,x1,x2,x3,x4,x5]
end
using BenchmarkTools
@btime fib2()

This gives the benchmark result:

8.250 ns (2 allocations: 112 bytes)

By comparison, @btime fib() outputs:

8.208 ns (2 allocations: 112 bytes)

so I’m not seeing much performance difference between kwarg extravaganza and plain old function for this problem.

But my question was more about the infamous “plotting problem”, I think it was called, where pyplot.plot() used to take minutes before the first plot came out because of precompilation or a very high level of polymorphism.

At the very least, a blah call would lower to a Core.kwcall call that forwards to a hidden function like var"#blah#11" that holds the body of the written blah method. That’s annoying for non-recursive reflection like @code_warntype, too.

The 0.042 ns difference is just runtime noise; there isn’t actually a difference. As explained earlier, keyword arguments are optimized just as well as positional arguments, except for the Type{T} annotations.

That is “time-to-first-plot”, or more generally compilation latency, which has nothing to do with keyword arguments, and it’s not just “pre”-compilation either. It wouldn’t be affected if you redesigned the entire plotting API around positional arguments (and made users do a lot more writing).

Expand for yet another essay about compilation latency and how it is actually addressed.

You’re probably aware already, but Julia is one of many languages that compiles to native code with optimizations. Most of those languages however have a completely separate compilation phase before most or usually all of the code is even executed; that’s why the methods need to at least start with concrete type specifications somewhere for the compiler to statically determine the types to even start compiling. Take your fib function for example; in isolation, an ahead-of-time compiler could compile it for all Int inputs based on the defaults, but it has no idea about fib(x1 = 1.0) or fib(x0=big(1), x1=big(1)) or the infinite number of other input types you could spontaneously call at runtime. Unlike many methods in statically typed languages, all written methods in Julia are generic, so like generics and templates in those languages, the compilation depends on the call’s context instead of the definitions alone. Something like a main function could specify some calls ahead of time, but not everything for interactive coding. So, Julia is implemented as a just-in-time compiled language; specifically it’s more of a just-ahead-of-time method-based JIT, so it more or less compiles the same way as AOT compilers for programs but for method calls on demand.
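You can see that per-call-signature compilation directly in a REPL: the first call with a new combination of input types includes compilation time, later calls reuse the compiled specialization (this assumes the fib definition from above is loaded).

@time fib(x0 = big(1), x1 = big(1));   # first call with BigInt inputs: includes compilation
@time fib(x0 = big(1), x1 = big(1));   # second call: mostly just runtime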

The drawback of JITing everything is that there isn’t a separate compilation phase that saves native code, so you have to redo compilation every time you restart a Julia process. For a while, this is exactly how Julia typically worked, which isn’t so bad for small packages but becomes noticeable (seconds, minutes) for big packages like plotting. There are interactive coding practices that let you avoid restarting a process for a long time, but sometimes we must, e.g. when a package got irreversibly loaded with a wrong or obsolete version.

Saving compilation is not as trivial as a savesession function because it’s actually bad to indiscriminately save the native code of every prototype call in a session. This is where precompilation comes in, specifically the mechanism in v1.9+; each package can specify calls to precompile, and precompilation can occur for the package when it’s added to an environment that figured out what versions of the dependencies are compatible with the other packages. After you reactivate that environment in a new process, importing the package loads the precompiled native code directly. If the environment is changed e.g. updating the package, precompilation is triggered again (and again, you may have to restart the process if obsolete versions were already loaded). Previously precompiled packages are reused in a changed environment if they didn’t need to change, but note that like any other package manager, it will change a lot until everything is version-compatible and non-redundant. This isn’t like manually linking dynamic libraries that may risk incompatible duplicate code.
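As a rough sketch of what “specify calls to precompile” can look like in practice, here is how a package might use PrecompileTools.jl on 1.9+ (FibPackage and the workload calls are made up):

module FibPackage

using PrecompileTools   # the common way to declare a precompile workload

fib(; x0 = 1, x1 = 1, x2 = x0 + x1, x3 = x1 + x2, x4 = x2 + x3, x5 = x3 + x4) =
    [x0, x1, x2, x3, x4, x5]

@compile_workload begin
    # Calls executed in this block are compiled during package precompilation
    # and the resulting native code is cached with the package.
    fib()
    fib(x2 = 5)
end

end # module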

Many of the highly used packages opted into significant precompilation phases for what they expect to be frequently used calls, so compilation latency has improved a lot in practice. Still, not everything can be anticipated by isolated package developers; any user can mix them in very specific ways that may not map cleanly to a new package. If it’s an everyday tool whose versions can remain stable for a long time, then people may bake precompiled calls into a custom system image for the Julia process to start up with. If it’s more situational, people may repurpose packages to import a custom list of precompiled calls. If it’s more of a fixed script, people may repurpose packages to precompile a main function (since 1.11, we’re starting to use a @main system). juliac is in development for trimmed binaries that don’t need the entire Julia process. There are many layers to saving compilation, and it’s still actively developed.
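For the custom system image route, a sketch with PackageCompiler.jl (the package name, paths, and precompile script here are placeholders):

using PackageCompiler

create_sysimage(
    ["Plots"];                                   # packages to bake into the image
    sysimage_path = "sys_plots.so",              # output system image
    precompile_execution_file = "precompile.jl", # script whose calls get compiled in
)
# Then start Julia with the image: julia --sysimage sys_plots.so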


OK I think I’m getting the impression that putting the whole function into default values of kwargs is not a big deal. Thanks!

I had a very similar use case: I wanted a function that people could use without going into the details unless they want to. I didn’t know if kwargs were the best option, so I started a thread about it.

In the end I went with custom structures, and that turned out to be a great choice. Of course it depends on your level of complexity; for the function you give above it is probably overkill, but for me it quickly got out of hand with keywords with default values.

Thanks for your advice, I read your thread. This was indeed the alternative for me: a structure to hold all the parameters of the solver, and a constructor for the structure with default values. I prefer the kwargs approach, so I came here to ask.

I prefer the kwargs approach for a couple of reasons. I find that fib() is a better starting point than args = make_args(); fib(args) for the user, although that’s what you’d find in a programming language like FORTRAN, which is a very good programming language.

I also have many such highly parametrized functions that call each other, so I’d have fib3(;a=1, args...) = fib(;args...) .+ a. In the other model, fib3 and fib would each have their own “configuration struct”, and the inner call to fib would have to move data between the FIB3_ARGS struct and the FIB_ARGS struct, or else the structs would have to be nested in some way.

So I wanted to avoid the above, and I came here to ask if there was a problem with that.

I understand. I actually prefer the kwargs approach semantically.

I cannot help with your last paragraph, but I can share something about this:

I find that fib() is a better starting point than args = make_args(); fib(args)

This was the main argument for me for going with the kwargs approach, but there is a way to do this with structs as well. I won’t go into the details, since you are not going with this approach anyway; this is just general info (which was unbeknownst to me at the time).

You can set up fib() in such a way that, if no structure is passed (i.e. the user didn’t write args = make_args()), it creates a struct with the defaults. Only if the users want to change something do they write args = make_args(); fib(args).
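A minimal sketch of that struct-based setup, with hypothetical names (FibArgs, make_args) and Base.@kwdef supplying the defaults:

Base.@kwdef struct FibArgs
    x0::Int = 1
    x1::Int = 1
end

make_args(; kwargs...) = FibArgs(; kwargs...)

function fib(args::FibArgs = FibArgs())   # no struct passed: fall back to the defaults
    x2 = args.x0 + args.x1
    x3 = args.x1 + x2
    x4 = x2 + x3
    x5 = x3 + x4
    return [args.x0, args.x1, x2, x3, x4, x5]
end

fib()                     # default behavior
fib(make_args(x1 = 2))    # customized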

What matters is that you find the approach that is optimal for you.