Best way to document (a reasonable amount of) keywords

I am a bit unhappy with how I currently document keyword arguments in functions that I consider more high-level.
Such functions should be easy to use, even for people using a package for the first time.
As with plotting functions, for example, my functions often come with several keyword arguments.

With this topic I would like to briefly describe my current way of documenting keywords, and ask for feedback and other ideas (see the questions at the end) to get an overview of other approaches as well.

Background / Artificial Example

Currently I do the following, illustrated with an artificial example of a doc string:

    c = fun_name(a; kwargs... )
    c = fun_name(a, b; kwargs... )
    fun_name!(c, a, b; kwargs...)

This is a function that does this and that. It explains the function in a few – or a few more lines.

# Input

* `a` is the first parameter and explained here
* `b` is the second parameter and optional, defaults to a random call

# Keyword arguments

* `keyword_one`     : (`"default"`) does something
* `another_keyword` : (`200`) does something else

all further keywords are passed to the inner call of `second_function`.

# Output

This function computes and returns `c`.

I usually also try to mention the in-place variant here; the in-place function itself then gets a shorter documentation that mainly refers to the one above.
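The pattern this docstring describes (a few documented keywords, everything else forwarded to an inner call) can be sketched as follows. This is only a toy illustration: the body of `second_function`, its `scale` keyword, and the array arithmetic are assumptions made up for the example.

```julia
# Hypothetical inner function; `scale` stands in for "all further keywords".
second_function(a, b; scale=1.0) = scale .* (a .+ b)

function fun_name(a, b=rand(length(a)); keyword_one="default", another_keyword=200, kwargs...)
    # `keyword_one` and `another_keyword` are consumed here (unused in this toy);
    # the remaining keywords are forwarded unchanged.
    return second_function(a, b; kwargs...)
end

# In-place variant: compute with the allocating method, then copy into `c`.
fun_name!(c, a, b; kwargs...) = copyto!(c, fun_name(a, b; kwargs...))

c = fun_name([1.0, 2.0], [3.0, 4.0]; another_keyword=10, scale=2.0)
```

With this layout, a misspelled keyword is only rejected by `second_function`, which is one reason to state in the docstring where the remaining keywords end up.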

Pros & Cons

What I like about this form is that all function signatures fit into a few lines.

In the REPL the above string even prints relatively nicely: the spaces before the `:` and the default values align everything and give a good overview.
In the docs (in HTML), however, the multiple spaces are of course collapsed to one, which usually looks a bit crowded. The above markdown renders as follows:

  • keyword_one : ("default") does something
  • another_keyword : (200) does something else

This gets a bit crowded for a reasonable number of keywords. I was thinking about a 3-column table, but have not yet tried whether this would render nicely in both the REPL and HTML.
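A 3-column table can be written with the table syntax of Julia's `Markdown` standard library. The sketch below uses a hypothetical `table_demo` function and only checks that the docstring text parses into a table; how well it reads in the REPL versus HTML would still need to be tried.

```julia
using Markdown

# Hypothetical docstring text with a keyword / default / description table.
const _doc_table_demo = """
    c = table_demo(a; kwargs...)

Toy function that only illustrates a keyword table.

# Keyword arguments

| Keyword           | Default     | Description         |
|:------------------|:------------|:--------------------|
| `keyword_one`     | `"default"` | does something      |
| `another_keyword` | `200`       | does something else |
"""

table_demo(a; keyword_one="default", another_keyword=200) = (a, keyword_one, another_keyword)

# Attach the text as the docstring of `table_demo`.
@doc "$(_doc_table_demo)" table_demo

# Parse the same text to confirm that the Markdown parser sees a table.
parsed = Markdown.parse(_doc_table_demo)
has_table = any(block -> block isa Markdown.Table, parsed.content)
```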

Concrete Example

Besides a missing kwargs..., an example is Trust-Regions Solver · Manopt.jl
where the doc string itself is Manopt.jl/src/solvers/trust_regions.jl at 342004e69427cfc859fe0ae0d32c4689db0ddee2 · JuliaManifolds/Manopt.jl · GitHub

Questions

  • How do you usually document keyword arguments?
  • How do you usually document default values of the keyword arguments?
  • Is there maybe an even better way to document when passing on the remaining keywords to another function?

How do you usually document keyword arguments?

I would do it like this:

Do this and that (single line).

```julia
c = fun_name(
    a,
    b=random();  # optional positional argument
    keyword_one="default",
    another_keyword=200,
    kwargs...
)
```

This is a function that does this and that. It explains the function in a few – or a few more lines. Returns `c` as an instance of some type.

# Arguments

* `a` is the first parameter and explained here
* `b` is the second parameter and optional, defaults to a random call

# Keyword arguments

* `keyword_one`: does something
* `another_keyword`: does something else

All further keywords are passed to the inner call of `second_function`.

# See also

* [`fun_name!`](@ref) – the in-place version of this function

That is:

  • I usually have “Arguments” and “Keyword arguments” sections in my docstrings (unless it’s a very small function where I can mention all the arguments in the paragraph describing the function)
  • I document the default values primarily in the code block, not in the “Keyword arguments” section. Although I might still say something like “If true (default), do this…”.

This isn’t necessarily in line with the official recommendations. I’ve carried over the “introductory sentence” before the initial code block from Python, to keep the potential for an “autosummary” at some point in the future. One thing I insist on is that the code block should be “valid” for documenting the return type. I consider the fun_name(args) -> c style with the “fake” -> completely unacceptable. I always use fenced code blocks, not indented.

I don’t use the “summary line” in method docstrings, under the assumption that the methods are shown together with the function docstring, which already contains the summary. Actually, I tend to avoid method docstrings as much as possible: better to describe all methods in a single docstring. This means I also tend to avoid Documenter’s @autodocs, because that only renders method docstrings, not function docstrings. Instead, I use @docs blocks (potentially generated by a generate_api.jl script).

I don’t use DocStringExtensions: I don’t really trust the automatic signature extractions, although I agree that docstrings getting out of sync is a potential problem. This might be something to revisit at some point in the future.

I would also keep fun_name! separate (but potentially mention it in “See also”):

"""
```julia
fun_name!(c, a, b; kwargs...)
```

Like [`fun_name`](@ref), but acting in-place.
"""
function fun_name!(c, a, b; kwargs...)
    _c = fun_name(a, b; kwargs...)  # compute with the allocating variant
    return copyto!(c, _c)           # copy the result into `c` and return it
end

See QuantumPropagators.init_prop for an “elaborate” example of my docstring style.


Thanks for your valuable feedback.

I do like the idea of documenting the defaults in the signature; it also makes them “copyable” in case someone wants to use nearly the default but change the number a bit. My two concerns are

  • for more than – say – 20 keywords the code block gets a bit longish
  • if I have two or three function signatures (see the trust regions example I linked) I would like to mention the keywords only once and would need a nice way to indicate that the others have the same kwargs

I totally agree here. I sometimes write y = f(x) in the code block of signatures if it makes sense to refer to the result y somewhere in the documentation, in the example above for instance to mention f!(y,x).

That is an interesting standpoint. For the cases I have in mind here, I agree, since the solvers are one function with a few methods, but I think it depends a bit on the application / scenario: for example, for exp in ManifoldsBase.jl, every method is an implementation on another manifold and documents its concrete formula, while the doc string of the generic function explains the abstract concept.
In both cases, however, I do agree with avoiding autodocs (or, if used, restricting it to source file names).

I do agree that one could also move that part to the see-also section, and to keep the fun_name! part short then. For now we do not yet do that in a very unified way, but maybe it would be good to usually move them to see-also – yes.

  • for more than – say – 20 keywords the code block gets a bit longish

I’d be okay with that. I often find code blocks easier to read than non-code text, so a longer code block doesn’t seem like a problem.

  • if I have two or three function signatures (see the trust regions example I linked) I would like to mention the keywords only once and would need a nice way to indicate that the others have the same kwargs

Yeah, I wouldn’t write them out more than once. You can still use kwargs... and describe things in context.

I’d probably write the trust_regions docstring something like this:


@doc raw"""
Run the Riemannian trust-region solver.

```julia
trust_regions(
    M, f, grad_f, hess_f, p=rand(M);
    acceptance_rate,  # mandatory keyword argument
    max_trust_region_radius, # mandatory keyword argument
    preconditioner, # mandatory keyword argument
    sub_stopping_criterion, # mandatory keyword argument
    trust_region_radius,  # mandatory keyword argument
    augmentation_threshold=0.75,
    augmentation_factor=2.0,
    evaluation=AllocatingEvaluation,
    project!=copyto!,
    randomize=false,
    ρ_regularization=1e3,
    reduction_factor=0.25,
    reduction_threshold=0.1,
    retraction=default_retraction_method(M, typeof(p)),
    stopping_criterion=StopAfterIteration,
    sub_kwargs=(),
    sub_problem=DefaultManoptProblem,
    sub_state=QuasiNewtonState,
)
```

Runs the Riemannian trust-regions solver for optimization on manifolds to minimize `f`, see
[AbsilBakerGallivan:2006, ConnGouldToint:2000](@cite).

Calling `trust_regions` as

```julia
trust_regions(M, f, grad_f, p=rand(M); kwargs...)
```

without a Hessian (`hess_f`) approximates the Hessian using finite differences,
see [`ApproxHessianFiniteDifference`](@ref). This call accepts the same keyword arguments as the one with `hess_f`.

# Arguments

* `M`:      a manifold ``\mathcal M``
* `f`:      a cost function ``f : \mathcal M → ℝ`` to minimize
* `grad_f`: the gradient ``\operatorname{grad}f : \mathcal M → T \mathcal M`` of ``f``
* `hess_f`: (optional) the Hessian ``\operatorname{Hess}f(p): T_p\mathcal M → T_p\mathcal M``, ``X ↦ \operatorname{Hess}f(p)[X] = ∇_X\operatorname{grad}f(p)``
* `p`:      (optional) an initial value ``p ∈ \mathcal M``. Defaults to `rand(M)`

# Keyword arguments

* `acceptance_rate`: Accept/reject threshold: if ρ (the performance ratio for the iterate)
  is at least the acceptance rate ρ', the candidate is accepted.
  This value should be between ``0`` and ``\frac{1}{4}``
…

There probably aren’t that many “mandatory keyword arguments”, but you get the picture.

Actually, I tend to avoid method docstrings as much as possible:

I might have been overstating that a bit. This is not a hard rule. Whatever makes sense for a particular use case!


I prefer to have keyword arguments in a separate section and I document default values in that section.

Example: LongestPaths.jl/src/longest_path.jl at 8851002ff7e2cb3142cebbef3062443e4c9af5b8 · GunnarFarneback/LongestPaths.jl · GitHub


Interesting, I think I would have kept the first signature with kwargs... and written out all keywords in the second (or, in general, the last) one. That way one can still see all signatures in one place, but also all kwargs. One could also write the first signatures ending in kwargs_see_below... or so. Especially when I have 3 or 4 similar signatures, I would not like to put one of them that prominently upfront. But besides that, I think I like this way of putting the default values by now.

That is not far away from my # Keyword arguments section. But how do you document the default values of the keywords then?
I usually have defaults that are taken from books or that are known to experienced users; for example, the default step size rule in gradient descent is Armijo linesearch. But I still want to document these defaults, both for newcomers and for people like me who sometimes forget the magic default numbers.


I gave this approach of documenting all keyword arguments in the signature a try.

Basically we have the list twice: once in the signature to list the defaults (but the user does not yet know what these single keywords do) and once in the # Keyword arguments section with their description / explanation.

I think my main challenge here is that, in order to describe / explain all keyword arguments, I list them again further down in the doc string, which made the one I tried super long and not so structured.
So maybe my main criticism is that the user first reads the signature, has to scroll through all of it to get to the explanations, and then has to scroll up again to learn about the default.

I think what I might try next is something similar to the keyword arguments in a separate section, where each explanation ends with a line stating something like “the default value is …”.
Or I use a symbol / emoji to indicate the default at the end (instead of upfront in brackets, where I feel it might be confusing if you want to learn what the keyword does).

I experimented a bit at

even adopting some of the string ideas Gunnar mentioned. In the REPL this looks quite nice; in the HTML I would like to have just a newline, not a new paragraph, before the default, since now it is a bit too close to the next keyword.
And sure, the symbol upfront is also just an experiment; one could write “default” as well, or write the keyword again (maybe a bit clumsy), or come up with a more reasonable symbol (ideas welcome).

I do it in a section, like you. I write the default value immediately after the keyword name, in the markdown itemized list. These are the guidelines I provide for DynamicalSystems.jl: Contributor Guide · DynamicalSystems.jl . De-syncing is a possibility, yes, but I don’t see any solution within the standard Julia language syntax. Perhaps the Makie team can tell us how they automatically extract keyword arguments of functions like lines in the docstrings? (cc @jules ?)

If I can give my two cents, this is something I would never do when writing a docstring:

[image: screenshot of a docstring; not preserved]

A dedicated section is better and cleaner, and the most important information (the simplest call signature) should be what occupies the top of the docstring.


I also don’t really like writing out the entire function method as in @Datseris’s reply above for long signatures.

And sure, the symbol upfront is also just an experiment; one could write “default” as well, or write the keyword again (maybe a bit clumsy), or come up with a more reasonable symbol (ideas welcome).

Before reading this I couldn’t actually tell what that symbol was. At first it looked like a bug: either a mistyped additional bullet point, or something meant to be a comment about the argument. I usually like to just write

kwarg = nothing: Description

as in e.g. Triangulations · DelaunayTriangulation.jl (aside: man the warnings on that docstring really clutter it… I should put those at the end or something probably). This matches what you’d see in the REPL for example which is nice.


In Makie, these functions are defined using macros which populate functions storing the docstrings and default expressions that are then used to populate the docs via Documenter extension blocks. The examples are also dynamically spliced into the docs via those Documenter extensions. It’s a whole thing, not something you can easily copy for another package.


Yes, I tried writing out all keywords exactly once and already while writing I felt lost in that code block.
I think that might be fine for up to 5 keyword arguments.

Thanks for the guidelines, that is exactly what I am looking for. My way of documenting has more or less “grown” over the last years and is also not 100% unified, so I am looking for a good style before unifying it (and then hopefully sticking to that style).

Thanks, the kw = default is maybe even best, since it is close to what the user would use anyways. For longer texts I might also experiment with a new line afterwards. The emoji was really just a first idea where to maybe put the default.
While your approach is Julia syntax, this emoji was an idea to put more focus on the description and provide the default only after that, but well – I am not that happy with what I got out of that anyways.

While the Makie approach sounds great, it indeed also sounds like quite some work to get the whole thing started. While I do have quite a few keywords every now and then, I think I do not have as many as Makie has (and also needs) in its functions.

I think I like the form

* `keyword = defaultvalue`:
  description

With maybe two line breaks (a new paragraph in that item) if the line is more like a longer text.
Concerning defaults I will just define some strings that I collect in a central place if the description can be used for several keyword = occurrences throughout the docs. That way the description of a specific keyword would always be the same text (without having to copy it anywhere).

The only “disadvantage” is that, in order to interpolate such defaults into a doc string, the doc string can no longer be a raw string. That just means I also have to define formulae as (raw) strings beforehand (so they are easier to type) and interpolate them as well.
That might still be nice, since one can then also reuse common formulae.
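A minimal sketch of this idea, assuming made-up names throughout (`_math_manifold`, `_kw_max_iter`, `frag_demo` are hypothetical and not taken from any package): formulae live in raw strings, keyword descriptions live in shared fragments, and the doc string is a regular string that interpolates both.

```julia
# Hypothetical shared fragments, collected in one central place.
# Formulae are raw strings, so backslashes need no escaping.
const _math_manifold = raw"``\mathcal M``"
const _kw_max_iter = "* `max_iter = 100`:\n  the maximal number of iterations"

# The doc string itself is a regular (non-raw) string, so `$(...)` interpolates.
const _doc_frag_demo = """
    frag_demo(M, p; kwargs...)

Toy solver on a manifold $(_math_manifold), used only to illustrate interpolation.

# Keyword arguments

$(_kw_max_iter)
"""

frag_demo(M, p; max_iter=100) = p  # placeholder body

# Attach the assembled text as the docstring of `frag_demo`.
@doc "$(_doc_frag_demo)" frag_demo
```

Any other doc string that interpolates `_kw_max_iter` gets exactly the same wording, so the description only has to be maintained in one place.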

Thanks for all the examples and discussions :+1:

I went with an approach mixing a few things from here

  • the nice idea that defaults can basically be copied
  • that the doc string is a string that is added to (for me usually 2) different functions of a solver
  • reusing a few variable name explanations

an example (only of the forthcoming version) is

since this is a forthcoming version, linking the rendered string is a bit complicated, but I like the result and would like to thank everyone helping me to find that style!
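As a rough sketch of the second point, one doc string can be stored in a constant and attached to both the allocating and the in-place function. The names `shared_demo`, `shared_demo!` and `_doc_shared_demo` are hypothetical and only illustrate the mechanism, not the actual doc strings of the package.

```julia
# Hypothetical doc string shared by both variants of a toy "solver".
const _doc_shared_demo = """
    q = shared_demo(p; kwargs...)
    shared_demo!(q, p; kwargs...)

Toy solver that scales `p`; the second form works in place on `q`.

# Keyword arguments

* `factor = 2.0`:
  the scaling factor
"""

shared_demo(p; factor=2.0) = factor .* p
shared_demo!(q, p; kwargs...) = copyto!(q, shared_demo(p; kwargs...))

# Attach the same text to both functions.
@doc "$(_doc_shared_demo)" shared_demo
@doc "$(_doc_shared_demo)" shared_demo!
```

In the REPL, `?shared_demo` and `?shared_demo!` should then show the same text, including the copyable `factor = 2.0` default.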