Non-friendly documentation

Did you read the Divio link that Tim posted already? If not, it’s really worth the time: https://documentation.divio.com/

It certainly convinced me that the goal of a “single documentation for everyone” is actually counterproductive because there are several audiences with widely diverging needs.

I did look at it, and I understand that for some things it is a good approach, but I am not convinced it is the right one for packages in Julia, which is outstanding in its support for abstraction, orthogonality, and composability.

Most Julia packages have very small and clean APIs. When not, the marginal payoff from improving this is often larger than writing docs about the existing API (of course writing docs often leads to insights about improving the API).

Your point is worth considering, but I’d also say that Julia’s composability sometimes makes it hard to understand where functionality comes from. People might look at the reference pages and say “that’s it? this doesn’t do anything!” without realizing that the package may only need to provide a few capabilities to interface with a larger ecosystem. In that case tutorial & demonstration material can be extremely helpful. But doing that well takes time. And it’s in quite a different category from reference material, which might be all that an experienced Julia developer might want.

The speaker is very persuasive. I like those ideas, and I will certainly try to learn from this talk.

One lesson from the talk that jumps out: it may be that a lot of Julia packages lack good tutorials.

I agree. I find I keep switching between the Reference and the Tutorial mindset when I do READMEs. Admittedly I probably need to get my packages into more stable modes first.

The documentation seems to be set more towards the Reference mindset though, while I believe Jupyter notebooks (or maybe Weave.jl?) would be needed for the latter? How would that work though: would we have to create separate repos for tutorials/examples, or should they go into the packages themselves?

While we’re having this discussion, I do like the model Hadley Wickham has been implementing in his “Tidyverse” in R. These libraries usually have the regular R documentation, which is a pain to read as a newcomer but very useful once you learn what it means and how to use it, and they also usually have a couple of vignettes. The vignettes are basically tutorials on very basic library use and are geared towards getting you up and running with the library. They are not exhaustive and there can be a few per library. I’m sure Hadley has a small army of people at RStudio doing this, so this might be too much for a regular Julia developer, but it’s something that is both technical and beginner-friendly.

As a biologist who likes to code, I do suffer from the lack of documentation, but at the same time it taught me to read code more often. But I approached Julia with the willingness to learn, so I put in the extra effort. I can see many users with just a passing interest being completely turned off by that.

And I’m sure it’s not a trivial problem to solve and it will take time and more ideas.

Making it into a competition with scores etc. feels very American to me, and that so often has unintended consequences. It fits some people that like competitions and getting high scores, but there are just as many people who put in an effort and later see that they somehow got a low score for their effort and get very discouraged from that.

The same goes for KPIs; I’ve yet to see a useful set of KPIs anywhere. I find that KPIs are often terrible optimization objectives (even if they’re perhaps not supposed to be optimization objectives, they inevitably end up being ones). KPIs also discourage making “actually beneficial things” in favor of doing something that moves the KPI. In the open-source context, they may also discourage users’ first efforts if those produce a low KPI.

I agree. It is easy to measure quantity, whereas it’s impossible to measure quality (which is also very subjective). It also invites gaming the system, for example by having docstrings for the sake of having docstrings:

"""
    Foo

The Foo struct.
"""
struct Foo end

An anecdote: I was recently following along with a Golang tutorial and the linter in VS Code was constantly complaining about missing docstrings for exports. As a result, I added nonsensical docstrings like the above. Then the linter complained that one of the types had a “stuttering name” at which point I was horribly annoyed and just stopped. Moral of the story: Shaming people into writing docs can have unintended side effects and feels like a bad idea.

I am not saying that tutorials (which border on being demos) aren’t useful, but I think that the main difficulty for newbies is that Julia requires a different mindset than most languages, and that is learned gradually and is hard to tie to a single package.

Many newcomers find Julia frustrating because they want to solve a particular problem, and zero in on a package, not realizing that they need to invest a bit more into the language and the ecosystem because a particular package just contributes a very specific set of building blocks, useful in a wider context.

For example, a lot of packages just have a few types that implement <: AbstractArray. For Julia users that have some experience with the language, that already clarifies a lot of things, and there is no need to repeat anything that is in the interface docs. A few examples and an explanation of the performance model are frequently sufficient (e.g. FillArrays.jl). But for newbies, a lot of the context is missing or implicit: someone created this particular type because they realized it would be useful for some applications while still being a drop-in replacement for anything implementing the same interface.
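To make the “few types that implement <: AbstractArray” point concrete, here is a minimal sketch. The type name ConstantVector and its fields are hypothetical and only loosely in the spirit of FillArrays.jl; this is not that package's actual implementation. It shows how two method definitions are enough for generic Base code to work on a new type.

```julia
# Hypothetical type for illustration only (not the FillArrays.jl implementation).
# It stores a single value and a length instead of a full vector.
struct ConstantVector{T} <: AbstractVector{T}
    value::T
    len::Int
end

# The two methods the read-only AbstractArray interface requires.
Base.size(v::ConstantVector) = (v.len,)

function Base.getindex(v::ConstantVector, i::Int)
    @boundscheck checkbounds(v, i)   # throws BoundsError when i is out of range
    return v.value
end

v = ConstantVector(2.5, 4)

# Everything below is generic code that was never written for ConstantVector.
total   = sum(v)       # reduction through the iteration/indexing fallbacks
shifted = v .+ 1       # broadcasting produces an ordinary Vector
dense   = collect(v)   # materialize into a Vector{Float64}
```

For an experienced user the subtype declaration already says all of this; for a newcomer it is exactly the implicit context that tends to be missing from the reference docs.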

I am particularly skeptical of the applicability of the “cookbook” (howto) documentation model in Julia. This model works very well for a lot of languages (e.g. R) because generally there are only a few recommended ways to do something, the package APIs are mostly stable, and most of them consist of very high-level functions that encapsulate complex functionality and hide the details from the user.

In contrast, what most well-designed Julia packages provide are composable building blocks. There is usually a way of using them that the authors intended, but usually it is perfectly fine to use them in other ways, maybe not even foreseen by the people who wrote the code (a lot of issues are related to this, making the package even more composable as they are resolved).

Anyhow, I don’t have a silver bullet to recommend here. I just believe that best practices for documenting Julia packages will be evolving for some time because it is a novel problem. We can benefit from best practices of other communities, but may not be able to adapt them directly.

I can of course only speak for myself, but I think a common reason for lack of documentation is that a lot of packages are still in an early stage of their life cycle. The main user of the package during this stage is the developer himself/herself, and he/she does not need documentation.

Open source authors are often more motivated by getting publicity and users for their code than they are by other rewards, such as money. Therefore, something that I think would provide a huge incentive for writing at least a couple of pages of good documentation would be if the core Julia team created an “Introduction to Julia Packages” document. This would be prominently linked to from the home page, next to the manual, and contain introductions to a large (but curated) set of packages.

The text would be written by package authors themselves (or volunteers), and stored in a specified location in the package repository (e.g. /MyPackage/docs/src/introduction.md), where a bot would fetch it regularly and where it could also be part of the regular documentation of that package.

The only work done by the Julia team would be to decide which packages are included. The criteria for inclusion in the “Introduction to Julia Packages” would include:

  • A relatively stable API. (Not necessarily release 1.0, but also not weekly changes to the API.)
  • Relatively complete documentation.
  • A well-written introduction, which contains:
    • A comparison to other packages in the same area, explaining strengths and weaknesses of this package in comparison. (This is important, since the main point of the document is to help people decide which packages to give a closer look.)
    • Examples of common usage patterns
    • Examples of interaction with other packages (when applicable).

(The examples would be written in a way that allows automatic testing, and JuliaComputing could run a bot to file an issue with the package whenever an example stops working.)

It would be useful to have some meta-direction on the desired contents of the documentation.

There are at least 2 types of documentation:

  • short and concise for expert reference
  • Examples and tutorials for non-experts who are just learning it

The base documentation is along the lines of “short and concise”. Other examples would be “quick reference guides” and “cheat sheets”.

The Julia Manual has more exposition and examples (but no tutorials). It could be improved with more examples and some tutorials – but, then it can be annoying to scroll through hundreds of lines of examples and tutorials.

Yet … this is the web. This is where hyperlinks could shine. A link for “More Examples” or “Tutorial” for those who want to view it. When viewing the base documentation, when you hover your mouse over a function definition, a little blue button “source” appears.

Why not have a little blue button appear in the Manual for “More Examples” or “Tutorial”? I think you might get more users contributing to those areas. After I figure something out, I would likely contribute it as an example.

Often the developers are too skilled and knowledgeable to know what questions beginners will have.

Let me give two examples I came across recently:

The documentation for DataFrames.jl has lots of examples of how you can view and subset the dataframe by rows, columns, or rows and columns. Really useful, exciting functionality. But … how do I reference a single cell?

Second example, CartesianIndex and CartesianIndices.
:scream:

For all that is good and holy, it’s confusing. The cherry on top is the blog post by Tim Holy – and I don’t bring this up to pick on Tim, who I believe is commenting in this thread, and who has actually written documentation – but it has this statement:

These iterators are deceptively simple, so much so that I’ve never been entirely convinced that this blog post is necessary: once you learn a few principles, there’s almost nothing to it.

My reaction:
:scream:

Then later:

You may already know that there are two recommended ways to iterate over the elements in an AbstractArray

Well, I don’t already know the two recommended ways, and what the hell is an AbstractArray?

A hyperlink could be handy for that last question.

Again, this is not to pick on Tim. I’m sure his documentation is fine for a more advanced user. (Plus, Tim actually wrote documentation, which is what many of us want.)

There just isn’t a “CartesianIndex For Dummies” document out there. Once I figure it out, I could write it (hmm… just what makes me qualified to write the “For Dummies” guide?).
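For readers with the same questions, a minimal sketch of what the quoted sentence appears to refer to: iterating over values with `for a in A`, iterating over indices with `eachindex(A)`, and then the CartesianIndices variant. The function names below are made up for illustration; only `eachindex`, `CartesianIndices`, and `CartesianIndex` come from Base.

```julia
A = [10 20 30;
     40 50 60]          # a 2×3 Matrix{Int}, which is one kind of AbstractArray

# Way 1: iterate over the values directly.
function sum_by_value(A::AbstractArray)
    s = zero(eltype(A))
    for a in A
        s += a
    end
    return s
end

# Way 2: iterate over the indices and look each element up.
function sum_by_index(A::AbstractArray)
    s = zero(eltype(A))
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

# CartesianIndices yields one CartesianIndex per element; I[1] is the row
# and I[2] the column, so multidimensional conditions are easy to write.
function first_row_sum(A::AbstractMatrix)
    s = zero(eltype(A))
    for I in CartesianIndices(A)
        if I[1] == 1
            s += A[I]
        end
    end
    return s
end

# A CartesianIndex can also be built by hand and used like A[2, 3].
cell = A[CartesianIndex(2, 3)]
```

The point of the index-based forms is that the same loop works for arrays of any dimension and any AbstractArray subtype, without hard-coding `for i in 1:size(A, 1), j in 1:size(A, 2)`.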

I think it is reasonable to expect that users read the Julia manual. If not, at least they should be willing to search it.

I understand that some people don’t. This is fine, as long as they understand that this is their problem. Asking documentation writers to link back very basic building blocks of Julia to the manual is unreasonable.

IMO the right mindset is: don’t be a dummy. Invest in understanding things. It pays off.

Treat some parts of Julia like math. There is no shame in reading something five times to get an initial understanding, then coming back to it later to nail down everything.

This is a good example of how hard it is to write to a wide audience. Thankfully, we no longer need to write manuals like they did 20 years ago. We can take that quote and turn it into this, with “AbstractArray” as a hyperlink to the manual:

You may already know that there are two recommended ways to iterate over the elements in an AbstractArray.

It does take some extra effort, but if people read the documentation and tell us what they don’t understand, then we can do that. I particularly like links and footnotes because they don’t get in the way of the core message for those that are comfortable with the topic at hand.

Maybe there could be a “necessary knowledge” list at the top of an article so people know what’s expected in order to follow it.

That is an interesting idea. I have seen this well executed in the documentation of SPICE. The docs for the individual functions, e.g. here, link back to more in-depth “required reading” articles, e.g. here.

Concerning the Julia manual, I think it shows that it has grown organically over the years (which is not a bad thing in itself). In my opinion, it is also much too dense (again not a bad thing in itself) to serve as an entry point for a beginner. What’s missing is something like the Swift Tour which gives a good feel for the language and is a lot less scary.

…although it could be automated…

If you want to understand things (and have a certain, finite amount of time available), then reading a “for dummies” type book is probably the best use of your time. If you know a lot, then it takes no time to skip over these little info-boxes that explain the basic concepts, but if there’s something that you’re unfamiliar with (or know by a different name) then those will save you a ton of time and effort.

That may be, but note that most “for dummies” books are comparable in length to the Julia manual (eg Cooking Basics for Dummies is 464 pages long, Guitar for Dummies 416 pages). No matter what you want to do, some initial investment is required.

Which gives me an idea… we should make a website that rebrands the current manual (with unchanged content, just the title) as “Julia for Dummies”. People who wish to take the royal road would just read it instead.

I definitely agree that an initial investment is required, and I believe that people are prepared to make that investment. The problem is how to guide them to the right resource for their needs. When writing the documentation for a package, it’s impossible to know if the reader is an advanced Julia user, or somebody who never heard of Julia before. I also agree that it’s impossible for package authors to cater to the latter category. But maybe one can build automated tools to help.

The Julia manual is very good, but it is not quite on the “for dummies” level of explaining every new concept that is introduced.

The book Think Julia is much more in the “for dummies” direction (but couldn’t be marketed using that phrase due to trademark protection).

Maybe Documenter.jl could have an option to add a box at the bottom of every page with a glossary and links to relevant sections in the manual (or the docs of other packages, etc.), based on keywords that it finds on the page (or that the author provides).
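A rough sketch of how such a glossary box could be generated. This is purely hypothetical: it is not an existing Documenter.jl feature or API, the function name `glossary_box` is made up, and the URLs are placeholders rather than real manual links.

```julia
# Hypothetical keyword => link table. The URLs are placeholders.
const GLOSSARY = Dict(
    "AbstractArray"  => "https://example.org/manual/abstractarray",
    "CartesianIndex" => "https://example.org/manual/cartesianindex",
    "broadcasting"   => "https://example.org/manual/broadcasting",
)

# Scan the page text for known keywords and build a plain-text box that
# could be appended to the bottom of the page. The matching is a naive
# substring search; a real tool would want word boundaries at the least.
function glossary_box(page_text::AbstractString, glossary::AbstractDict)
    found = sort([k for k in keys(glossary) if occursin(k, page_text)])
    isempty(found) && return ""
    lines = ["Glossary:"]
    for k in found
        push!(lines, "  $k: $(glossary[k])")
    end
    return join(lines, "\n")
end

page = "There are two recommended ways to iterate over an AbstractArray."
box = glossary_box(page, GLOSSARY)
```

Keeping the keyword table separate from the page text is what would let an author supply their own entries, as suggested above, without touching the page itself.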

@baggepinnen, I am still of the opinion that there must be a system for improving quality. And I do indeed find KPIs or peer review or curating of projects helpful. And if

It fits some people that like competitions and getting high scores,…

increases the quality of the developments/packages, that’s not bad.

If packages were developed to solve your own problems or are still WIP, they just need to be clearly marked. And I also know that such opinions are not popular. :wink:

Fair enough. That blog post was written in an era when many of us learned Julia primarily by reading base/ code; by the time it was posted, the new capabilities had been invading base/ for a while so there were a lot of examples to look at. But that comment definitely isn’t useful today. Deleted.

…and then just to be sure it shows both of them to you just a few words later. But it is indeed irrelevant whether people already know it or not, so I deleted it.

Good idea, done. Thanks for the feedback!
