Being a relative newbie to Julia, I started looking into Documenter.jl with the idea of adding or annexing a KWIC (keyword in context) index builder. I think I’m going to do some more research on KWIC and some trial and error on processing an .md file into a SQLite database for building and maintaining a KWIC index.
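To make the KWIC idea concrete, here is a minimal sketch of what such an index builder could look like. It is not part of Documenter.jl; the function name `kwic`, the stop-word list, and the "/" marker for the wrap point are all assumptions made for illustration, and the entries are kept in memory rather than written to a database.

```julia
# Hypothetical KWIC builder: returns (keyword, line number, context) tuples.
# The stop-word list is a tiny placeholder; a real index would need a fuller one.
function kwic(text::AbstractString;
              stopwords = Set(["a", "an", "the", "of", "to", "in", "and", "is", "for"]))
    entries = Tuple{String,Int,String}[]
    for (lineno, line) in enumerate(split(text, '\n'))
        words = String.(split(line))          # whitespace-separated tokens
        for (i, w) in enumerate(words)
            # Normalize the token: drop surrounding punctuation, fold case.
            key = lowercase(strip(w, ['.', ',', ';', ':', '!', '?', '(', ')', '#', '*']))
            (isempty(key) || key in stopwords) && continue
            # Rotate the line so the keyword comes first; "/" marks the wrap point.
            context = join(words[i:end], " ") * " / " * join(words[1:i-1], " ")
            push!(entries, (key, lineno, context))
        end
    end
    # Sort alphabetically by keyword, then by line number.
    return sort!(entries; by = e -> (e[1], e[2]))
end

for (key, lineno, context) in kwic("The quick fox.\nA lazy fox")
    println(rpad(key, 8), lpad(lineno, 3), "  ", context)
end
```

Each tuple maps naturally onto one row of a table, so storing the result in a database would be a separate, later step.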
A documentation style guide is a good idea I think. A simple set of ground rules to follow. Not everyone (myself included) really knows how to write good documentation. That link @tim.holy posted is great.
I don’t know if this is an easier task, but a JOSS article may give some package authors the extra motivation they need, for example if you work at a university and have personal KPIs for scientific publications. (Unfortunately, the journal classification system in Finland puts JOSS papers in the lowest quality group at the moment.)
I am not sure I understand how this would help get better documentation. Instead of writing docs because JOSS requires it, one might as well just write docs and be done with it. It’s much less effort (consider the fixed cost of going through peer review, however benign) for a marginally lower payoff (in a lot of fields, it is a much better strategy to use a new algorithm in a paper, then ask users to cite that paper).
The bottom line is pretty much that docs get written when the authors feel that the project is mature enough to be worth the investment.
All the Julia community can do is lower the barrier to entry by making docs generation and deployment as simple as possible. Thanks to the dedicated effort of the maintainers of Documenter.jl and related libraries, it is pretty low already (I would say 15–30 min the first time you set it up, then a couple of minutes per package after that, especially when using package template generators).
After more digging and investigating, I would have to say that the Julia documentation is lacking in complete definitions and examples.
Specifically, I tripped across a statement in the split function that read “if r != 0:-1 …”. I have spent about 3 hours trying to find a reference that explains the “0:-1”. I have tried testing it in the REPL, but it always seems to come back true.
Looking in the Punctuation section, it defines “a:b” as a binary infix operator to construct a range from a to b, which makes sense to me.
But where I get tripped up is the 0:-1! What does this mean?
If it means a range of 0 to -1, how does -1 play into this? Otherwise what does it mean?
I am not looking for an answer here, just pointing out a newbie’s learning frustration at the missing documentation. (BTW: I am not a newbie to software: 50+ years, 30+ languages, 8 OS’es, systems, applications, communications programming, etc.)
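For anyone who lands here with the same question, a short sketch of how `0:-1` behaves may save some time. It only exercises Base functions in the REPL; how the `split` source actually uses the value has not been checked here, so reading it as a "nothing found" sentinel is an assumption.

```julia
r = 0:-1                      # a range whose stop is below its start

isempty(r)                    # true: the range contains no elements
length(r)                     # 0
collect(r)                    # an empty Vector{Int}

# Any non-empty range differs from the empty one, so a comparison like
# `r != 0:-1` is true whenever `r` actually holds indices.
(2:3) != (0:-1)               # true

# In Julia 1.x a failed string search returns `nothing`,
# while a successful one returns the range of matching indices.
findfirst("z", "abc") === nothing   # true
findfirst("b", "abc")               # 2:2
```

So `a:b` with `b < a` is simply the empty range, and a test against `0:-1` most likely distinguishes "no match" from a real index range.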
IMO, good, thorough documentation makes everyone’s job easier.
And I, like many others, don’t like to do documentation.
While this discussion is still fresh in our minds I’d like to turn it into something productive. There are a lot of good ideas, but I think most would agree an additional style guide would be helpful (perhaps not for themselves, but at least for people new to documenting code).
Should this be a new style guide, separate from all others, or would the Blue style guide be appropriate? What does the “Blue Crew” (@oxinabox @nickrobinson) think of this?
I believe the question of “who is your audience?” is the most fundamental difficulty with open source documentation because the answer is “not the author of the software” (except perhaps for reference API docs). This is in sharp contrast to the actual software itself: the author usually writes software because they personally want to use it.
So I think it’s all about incentives: the average open source developer is strongly incentivized to write the software so they can use it. However, there’s no such incentive for documentation. The incentive for writing documentation comes from an entirely different desire for interaction and communication. (This can take various forms; a desire to help people, desire to be recognized for building a cool thing, the intellectual stimulation that comes from interacting with peers, a desire to make a difference. Probably many other things, it’s hard to summarize!)
All isn’t lost, because we wouldn’t be sharing our software with the world if we didn’t have some desire for interaction and communication. But I’d say that we’re most likely to write the docs we want to read (reference docs and explanation of the design). These are not the docs that many users really need (tutorials and how-to’s).
So I guess I’d like to ask: what is a strong incentive to write high quality tutorials and how-tos? Who will do it and how will they become involved? How will they know their audience?
I am not sure I fully agree with this. I don’t have hard data to talk about the average FOSS developer, but in scientific computing there are many incentives to making your packages nicely polished, which includes reasonable docs. Most importantly, you get more contributors that way, who discover and fix bugs, and add new features, frequently making your own work easier.
Also, I find it nice to document things for my own use: six months or a year later I may not remember all the details, so it is best to just write it down while it is fresh in my mind, at least in docstrings.
At the risk of being repetitive, I would like to suggest again that the incentives for documentation depend strongly on the life cycle of software. It’s not that authors do not want to document things in general, just that they have not come up with documentation for a particular package yet, for various reasons (the API is not stable, or the whole thing is an experiment and they will document it if it works out).
When users discover packages which are out in the open, maybe even registered, but lack documentation at this stage, it can simply mean that they are WIP. Some users are confused by this, especially if they come from languages where the central registry is curated to some extent (e.g. CRAN for R). It is better to think of these packages as a sneak preview; standards we usually apply to polished packages are not (yet) applicable.
My comment about “no such incentive” was meant to be fairly restrictive: I mean to say there is no direct incentive for writing tutorials or how-tos because the core developers are not the audience for these things. Docs which help users to become contributors (or help developers with their old projects) are mostly the “quickstart how-to”, general devdocs (design explanation and comments) and reference documentation. These are largely not tutorials or how-tos either.
Yes I agree with this.
Mature projects need to value their new users and provide useful documentation for them if they want to successfully reach a wide audience. But I think it takes a shift in mindset to write documentation that is not what you need yourself. Personally I wish I was better at it.
I would suggest that most, if not all, mature packages in Julia have some documentation. Some of them have very extensive docs, complete with tutorials and examples.
Even a lot of packages with pre-1.0 releases have very polished docs. I did not compile statistics, but it is my impression that most packages that have been around for 2 years and are actively developed have docs. It is one of the first things that get done when the API stabilizes.
I suspect that the complaint about the lack of docs is really a complaint about packages being WIP. But that’s a much harder problem to tackle than just writing up some docs.
You nailed it. To amplify it even further, let me invert @Tamas_Papp’s comment:
there are many incentives to making your packages nicely polished, which includes reasonable docs…but lack documentation at this stage, it can simply mean that they are WIP
I would agree there are some incentives, but I’ve also experienced the converse: that I find myself avoiding or significantly delaying sharing WIP code that I know is already useful, mainly because I just can’t afford the time it would take to document it well or respond to the stream of issue reports I know it will generate.
Back to Chris’ points, even writing the reference docs is sometimes an afterthought, as strictly speaking you rarely require them while immersed in the details of writing the package. (I’m not saying that’s a good thing or that it always happens that way, and I agree with previous posts that writing docstrings almost invariably helps clarify the API design.) But the “you in six months” will likely appreciate the comments and docstrings, so most people learn fairly quickly to add at least a modicum of them during development.
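As a reminder of how small that "modicum" can be, here is a sketch of a docstring attached to a one-line function. The function `clamp_unit` is hypothetical and exists only for this illustration; the example inside the docstring is written as a plain indented block, whereas Documenter.jl doctests would use a fenced `jldoctest` block instead.

```julia
"""
    clamp_unit(x)

Restrict `x` to the interval [0, 1]. Values below 0 become 0 and
values above 1 become 1; the result has the same type as `x`.

# Examples

    julia> clamp_unit(1.7)
    1.0
"""
clamp_unit(x) = clamp(x, zero(x), one(x))

clamp_unit(1.7)    # 1.0
clamp_unit(-3)     # 0
clamp_unit(0.25)   # 0.25
```

In the REPL, typing `?clamp_unit` shows this text, which is usually all the "you in six months" needs.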
In contrast, writing good tutorials is something that pays off under a much rarer set of conditions. For me, it’s the experience of having a lab full of people who are newer to coding (and for whom coding is often just one of a whole suite of skills they need) and watching them struggle, often with software I myself have written.
what is a strong incentive to write high quality tutorials and how-tos?
Docathons were mentioned above, but your observation
desire for interaction and communication
adds an important spin. Maybe the right answer is that rather than prioritizing hackathons, we should focus much more social energy on docathons. The participation, visibility, and social interaction would be at least a measure of reward. We could have signups where people who want to improve documentation for other packages announce their intentions, and then there could be some formal process of inviting the main developer(s) to consult.
I often feel like I’m not the right person to write tutorials for my own packages (being too immersed in the detail to know what new users need). But if not myself, surely I couldn’t ask someone else to do so in their spare time!
On the other hand, there’s another incentive I’ve missed out above: the desire to learn about cool things other people are doing.
What if a doc/hack-athon was structured such that package developers sign up not to document their own package, but to document someone else’s and receive similar documentation of their own packages as a contribution in kind? Ideally it would be a package they only know very little about which forces them into new-user mode. The primary output could be a working tutorial or how-to rather than a piece of code?
One of the things I learned in high school English was that any written document needs to be written to a specific reader.
I’m thinking there are three classes of readers for software documentation:
The newbie who needs simple explanations with simple examples to get started. The documentation probably should take the reader from an introduction of a topic through a better understanding of the topic.
The developer who needs technical reference with more complex examples. This person has a problem or need, knows most of what’s going on and wants to be able to quickly look something up and get an answer.
Then there’s the developer who needs/wants to know exactly how something works internally. This could be for performance reasons, enhancement reasons, personal growth/curiosity, or some other reason.
IMO, each targeted reader needs a different document, written in a different style.
BTW: I am not a big fan of tutorial videos. They usually take too long to get to the point that I am interested in.
From Dr. Vannevar Bush’s article “As We May Think”:
“…He (man) has built a civilization so complex that he needs to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely become bogged down part way by over taxing his limited memory. His excursions may be more enjoyable if he can reacquire the privilege of forgetting the manifold things he does not need to have immediately at hand, with some assurance that he can find them again if they prove important.”
I think the appropriate thought herein is “have immediately at hand, with some assurance that he can find them again”.
A (somewhat) unique feature of Julia (actually, parametric multiple dispatch, but that is only available in Julia at the moment) is being able to define very clean and orthogonal APIs, made up from a small set of functions and types, that are composable within the package, with the API of other packages, and Base.
Rather than writing various kinds of documentation, making the API really nice and idiomatic could have higher payoffs. This is an ideal that is not always attainable, but when it is approached, a single set of documentation may be sufficient for everyone.
I am not sure the internals need to be documented beyond comments/docstrings. Maybe for very complex packages that do something very tricky.
I did look at it, and I understand that for some things it is a good approach, but I am not convinced it is the right one for packages in Julia, which is outstanding in its support for abstraction, orthogonality, and composability.
Most Julia packages have very small and clean APIs. When they do not, the marginal payoff from improving the API is often larger than that from writing docs about the existing one (of course, writing docs often leads to insights about improving the API).
Your point is worth considering, but I’d also say that Julia’s composability sometimes makes it hard to understand where functionality comes from. People might look at the reference pages and say “that’s it? this doesn’t do anything!” without realizing that the package may only need to provide a few capabilities to interface with a larger ecosystem. In that case tutorial & demonstration material can be extremely helpful. But doing that well takes time. And it’s in quite a different category from reference material, which might be all that an experienced Julia developer might want.
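A small sketch of the "that’s it?" effect, assuming a toy package whose whole API is one type and two methods. The type `ConstantVector` is hypothetical; the two methods it defines are the documented minimum for a read-only `AbstractArray`, and everything used afterwards comes from Base rather than from the "package".

```julia
# Hypothetical type: n copies of one value, stored as a single number.
struct ConstantVector{T} <: AbstractVector{T}
    value::T
    n::Int
end

# The entire reference documentation would cover just these two methods.
Base.size(c::ConstantVector) = (c.n,)
function Base.getindex(c::ConstantVector, i::Int)
    @boundscheck checkbounds(c, i)   # reject indices outside 1:n
    return c.value
end

c = ConstantVector(2.5, 4)

# None of the following is defined above; it all comes from Base's
# generic code written against the AbstractArray interface.
length(c)      # 4
sum(c)         # 10.0
c .+ 1         # [3.5, 3.5, 3.5, 3.5]
reverse(c)     # [2.5, 2.5, 2.5, 2.5]
```

A reference page listing `size` and `getindex` is complete here, yet it says nothing about `sum` or broadcasting working out of the box. That gap is what tutorial and demonstration material has to fill.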