Best way to document (a reasonable amount of) keywords

I am a bit unhappy with how I currently document keyword arguments in functions that I consider a bit more high-level.
Such a function should also be easy to use for people using a package for the first time.
Similar to plotting functions, for example, my functions often come with several keyword arguments.

In this topic I would like to briefly describe my current way of documenting keywords and ask for feedback and other ideas (see also the questions at the end), to get an overview of other approaches as well.

Background / Artificial Example

Currently I do the following, illustrated on an artificial example in a docstring:

    c = fun_name(a; kwargs... )
    c = fun_name(a, b; kwargs... )
    fun_name!(c, a, b; kwargs...)

This is a function that does this and that. It explains the function in a few – or a few more lines.

# Input

* `a` is the first parameter and explained here
* `b` is the second parameter and optional, defaults to a random call

# Keyword arguments

* `keyword_one`     : (`"default"`) does something
* `another_keyword` : (`200`) does something else

All further keywords are passed to the inner call of `second_function`.

# Output

This function computes and returns `c`.

So I usually try to mention the in-place variant here as well. The in-place variant itself gets a shorter documentation, mainly referring to the one above.

Pros & Cons

What I like about this form is that all function signatures fit in a few lines at the top.

In the REPL the above string even prints relatively nicely: the spaces before the : and the default values align everything and give a good overview.
In the rendered docs, however, the multiple spaces are of course (in HTML) collapsed into one. See the above markdown rendered in the following

  • keyword_one : ("default") does something
  • another_keyword : (200) does something else

This gets a bit crowded for a reasonable amount of keywords. I was thinking about maybe using a 3-column table, but have not yet tried whether this would render nicely in both the REPL and HTML.
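
A minimal sketch of the 3-column table idea, using the two hypothetical keywords from the artificial example above. Julia's Markdown standard library parses pipe tables, so the REPL output should keep its alignment; how the HTML version looks depends on the documentation theme and is not checked here.

```julia
"""
    c = fun_name(a; kwargs...)

This is a function that does this and that.

# Keyword arguments

| Keyword           | Default     | Description         |
|:------------------|:------------|:--------------------|
| `keyword_one`     | `"default"` | does something      |
| `another_keyword` | `200`       | does something else |
"""
function fun_name(a; keyword_one="default", another_keyword=200)
    # Placeholder body: only the docstring matters for this sketch.
    return (a, keyword_one, another_keyword)
end
```

Typing `?fun_name` in the REPL then shows the keywords, defaults, and descriptions as a table.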

Concrete Example

Apart from a missing kwargs..., a concrete example is Trust-Regions Solver · Manopt.jl,
where the doc string itself is Manopt.jl/src/solvers/trust_regions.jl at 342004e69427cfc859fe0ae0d32c4689db0ddee2 · JuliaManifolds/Manopt.jl · GitHub

Questions

  • How do you usually document keyword arguments?
  • How do you usually document default values of the keyword arguments?
  • Is there maybe an even better way to document when passing on the remaining keywords to another function?

How do you usually document keyword arguments?

I would do it like this:

Do this and that (single line).

```julia
c = fun_name(
    a,
    b=rand();  # optional positional argument
    keyword_one="default",
    another_keyword=200,
    kwargs...
)
```

This is a function that does this and that. It explains the function in a few – or a few more lines. Returns `c` as an instance of some type.

# Arguments

* `a` is the first parameter and explained here
* `b` is the second parameter and optional, defaults to a random call

# Keyword arguments

* `keyword_one`: does something
* `another_keyword`: does something else

All further keywords are passed to the inner call of `second_function`.

# See also

* [`fun_name!`](@ref) – the in-place version of this function

That is:

  • I usually have “Arguments” and “Keyword arguments” sections in my docstrings (unless it’s a very small function where I can mention all the arguments in the paragraph describing the function)
  • I document the default values primarily in the code block, not in the “Keyword arguments” section. Although I might still say something like “If true (default), do this…”.

This isn’t necessarily in line with the official recommendations. I’ve carried over the “introductory sentence” before the initial code block from Python, to keep the potential for an “autosummary” at some point in the future. One thing I insist on is that the code block should be “valid” for documenting the return type. I consider the fun_name(args) -> c style with the “fake” -> completely unacceptable. I always use fenced code blocks, not indented.

I don’t use the “summary line” in method docstrings, under the assumption that the methods are shown together with the function docstring, where the summary already is. Actually, I tend to avoid method docstrings as much as possible: better to describe all methods in a single docstring. This means I also tend to avoid Documenter’s @autodocs, because that only renders method docstrings, not function docstrings. Instead, I use @docs blocks (potentially generated by a generate_api.jl script).

I don’t use DocStringExtensions: I don’t really trust the automatic signature extractions, although I agree that docstrings getting out of sync is a potential problem. This might be something to revisit at some point in the future.
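
One way to reduce the risk of a documented default drifting away from the code, without any automatic signature extraction, is to interpolate a constant into the docstring. This is a minimal sketch: `DEFAULT_ANOTHER_KEYWORD` is a hypothetical constant introduced only for illustration, and the interpolation only works in ordinary docstrings, not in `raw"""` ones.

```julia
# Hypothetical constant holding the default, shared by code and documentation.
const DEFAULT_ANOTHER_KEYWORD = 200

"""
Do this and that.

# Keyword arguments

* `another_keyword`: does something else (default: `$(DEFAULT_ANOTHER_KEYWORD)`)
"""
function fun_name(a; another_keyword=DEFAULT_ANOTHER_KEYWORD)
    # Placeholder body for the sketch.
    return a + another_keyword
end
```

The default is then written down once; the docstring picks up whatever value the constant has when the docstring is defined.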

I would also keep fun_name! separate (but potentially mention it in “See also”):

"""
```julia
fun_name!(c, a, b; kwargs...)
```

Like [`fun_name`](@ref), but acting in-place.
"""
function fun_name!(c, a, b; kwargs...)
    _c = fun_name(a, b; kwargs...)  # compute the result out-of-place first
    return copyto!(c, _c)           # then overwrite `c` and return it
end

See QuantumPropagators.init_prop for an “elaborate” example of my docstring style.


Thanks for your valuable feedback.

I do like the idea of documenting the defaults in the signature; it also makes them “copyable” in case someone wants to use nearly the default but change a number a bit. My two concerns with that are

  • for more than – say – 20 keywords the code block gets a bit longish
  • if I have two or three function signatures (see the trust regions example I linked) I would like to mention the keywords only once and would need a nice way to indicate that the other signatures have the same kwargs

I totally agree here. I sometimes write y = f(x) in the code block of signatures if it makes sense to refer to the result y somewhere in the documentation, in the example above for instance to mention f!(y,x).

That is an interesting standpoint. For the cases I have in mind here I agree, since these solvers are one function with a few methods, but I think it depends on the application / scenario: for example, for exp in ManifoldsBase.jl every method is an implementation on another manifold and documents its concrete formula, while the generic function docstring explains the abstract concept.
In both cases, however, I do agree with avoiding autodocs (or, if used, restricting it to source file names).

I do agree that one could also move that part to the see-also section, and that the fun_name! part should then be kept short. For now we do not yet do that in a very unified way, but maybe it would be good to usually move them to see-also – yes.

  • for more than – say – 20 keywords the code block gets a bit longish

I’d be okay with that. I often find code blocks easier to read than non-code text, so a longer code block doesn’t seem like a problem.

  • if I have two or three function signatures (see the trust regions example I linked) I would like to mention the keywords only once and would need a nice way to indicate that the other signatures have the same kwargs

Yeah, I wouldn’t write them out more than once. You can still use kwargs... and describe things in context.

I’d probably write the trust_regions docstring something like this:


@doc raw"""
Run the Riemannian trust-region solver.

```julia
trust_regions(
    M, f, grad_f, hess_f, p=rand(M);
    acceptance_rate,  # mandatory keyword argument
    max_trust_region_radius, # mandatory keyword argument
    preconditioner, # mandatory keyword argument
    sub_stopping_criterion, # mandatory keyword argument
    trust_region_radius,  # mandatory keyword argument
    augmentation_threshold=0.75,
    augmentation_factor=2.0,
    evaluation=AllocatingEvaluation,
    project!=copyto!,
    randomize=false,
    ρ_regularization=1e3,
    reduction_factor=0.25,
    reduction_threshold=0.1,
    retraction=default_retraction_method(M, typeof(p)),
    stopping_criterion=StopAfterIteration,
    sub_kwargs=(),
    sub_problem=DefaultManoptProblem,
    sub_state=QuasiNewtonState,
)
```

Runs the Riemannian trust-regions solver for optimization on manifolds to minimize `f`; see
[AbsilBakerGallivan:2006, ConnGouldToint:2000](@cite).

Calling `trust_regions` as

```julia
trust_regions(M, f, grad_f, p=rand(M); kwargs...)
```

without a Hessian (`hess_f`) approximates the Hessian using finite differences,
see [`ApproxHessianFiniteDifference`](@ref). This accepts the same keyword arguments as the call with `hess_f`.

# Arguments

* `M`:      a manifold ``\mathcal M``
* `f`:      a cost function ``f : \mathcal M → ℝ`` to minimize
* `grad_f`: the gradient ``\operatorname{grad}f : \mathcal M → T \mathcal M`` of ``f``
* `hess_f`: (optional) the Hessian ``\operatorname{Hess}f(p): T_p\mathcal M → T_p\mathcal M``, ``X ↦ \operatorname{Hess}f(p)[X] = ∇_X\operatorname{grad}f(p)``
* `p`:      (optional) an initial value ``p ∈ \mathcal M``. Defaults to `rand(M)`

# Keyword arguments

* `acceptance_rate`: Accept/reject threshold: if ρ (the performance ratio for the iterate)
  is at least the acceptance rate ρ', the candidate is accepted.
  This value should be between ``0`` and ``\frac{1}{4}``
…

There probably aren’t that many “mandatory keyword arguments”, but you get the picture.

Actually, I tend to avoid method docstrings as much as possible:

I might have been overstating that a bit. This is not a hard rule. Whatever makes sense for a particular use case!


I prefer to have keyword arguments in a separate section and I document default values in that section.

Example: LongestPaths.jl/src/longest_path.jl at 8851002ff7e2cb3142cebbef3062443e4c9af5b8 · GunnarFarneback/LongestPaths.jl · GitHub


Interesting, I think I would have kept the first signature with kwargs... and the second (or in general the last) with all keywords mentioned. That way one can still see all signatures in one place, but also all kwargs. One could also write the first signatures ending in kwargs_see_below... or so. Especially when I have 3 or 4 similar signatures, I would not like to put one of them that prominently up front. But besides that, I think I like the idea of putting the default values into the signature by now.
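
A minimal sketch of that layout, with short signatures first and the last one spelling out every keyword with its default. `second_function` and its `scale` keyword are hypothetical stand-ins for the inner call, and the signature lines are indented rather than fenced only to keep the example compact.

```julia
# Hypothetical inner function that receives all remaining keywords.
second_function(a, b; scale=1.0) = scale * (a + b)

"""
    c = fun_name(a; kwargs...)
    c = fun_name(a, b; keyword_one="default", another_keyword=200, kwargs...)

This is a function that does this and that.
If `b` is omitted, it defaults to `one(a)`. Both signatures accept the same keywords.

# Keyword arguments

* `keyword_one`: does something
* `another_keyword`: does something else

All further keywords are passed to the inner call of `second_function`.
"""
function fun_name(a, b; keyword_one="default", another_keyword=200, kwargs...)
    # `keyword_one` and `another_keyword` are consumed here (unused in this sketch);
    # everything else is forwarded unchanged.
    return second_function(a, b; kwargs...)
end

# The short signature forwards every keyword to the full method.
fun_name(a; kwargs...) = fun_name(a, one(a); kwargs...)
```

Calling `fun_name(2; scale=0.5)` goes through the short method, fills in `b`, and hands `scale` down to `second_function`.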

That is not far away from my # Keyword arguments section. But how do you document the default values of the keywords then?
I usually have defaults that come from some book or that are known to experienced users; for example, the default step size rule in gradient descent is Armijo linesearch. But I still want to document these defaults, both for newcomers and for people like me who sometimes forget the magic default numbers.