Write documentation PRs not blog posts

I will start by saying I have read many useful blog posts and course notes that have been linked on this forum, and I greatly appreciate the effort that goes into making them. The problem is they are not discoverable. It is my opinion that these guides would be much better served as a “Tutorial” or “Usage Guide” section in the official documentation for each package. I would like to have a discussion about that stance and what we can do to improve discoverability of fundamental knowledge for new users to enjoy the Julia experience faster.

It is well known (based on questions that come up on this forum) that documentation for even some of our most fundamental packages is not sufficient to get a new user comfortable using Julia efficiently: Pkg, VSCode, JuliaUp, Revise, Debugger, workflow, reading errors, etc. To me, writing Julia is easy, but running and engineering Julia is hard. Even after 3 years of casual Julia use, I don’t really understand these fundamentals.

Subscribing to your favorite Julia blogger is a good way to get niche tips once you are already in, but it is not a good way to start learning from scratch. I am really excited to see Modern Julia Workflows come together, mostly because it will point new Julia users to the best starting tools, but is anyone really going to find it a year from now buried here:

For me at least, if a sufficiently thorough and high-quality “User Guide” is not within two clicks of the GitHub README, I start to get frustrated digging around the internet trying to get something working. I think more of the great writing coming out of the community should be added directly as pages or links in the documentation. Then it is much easier to find and point people to the answers they are looking for. I just want the great content I see being linked on discourse more broadly available to new users who aren’t necessarily plugged in to the Julia community, and I am hoping to start a movement and discussion in that direction.

Some Obvious Retorts:

  • Documentation is for reference not long-form instruction!
    Both can coexist (on the same website).

  • Blogs are opinionated. Documentation can’t be: requires public consensus!
    Who cares if a method different from the one you personally use is the officially documented one? You could add an “alternative workflow” page or make the documentation a list of links to several differing external resources. Just give people an easy and popular option to get them started. An opinionated guide is better than no guide!

  • Blogs can discuss a broader ecosystem. Documentation must have a focused scope!
    I will admit that the broadest cases (something like “Machine Learning with Statistical Datasets”) are better as a separate blog or course, but common and simple use cases could benefit from having a dedicated page or even separate documentation (linked to from both READMEs) on “Using SomePackage with OtherPackage”.

  • If you are so unhappy with the documentation, then fix it yourself!
    – I try here and there, but I am only one man. There is a lot of documentation to improve.
    – There are hurdles that sometimes prevent a novice like me from contributing.
    (I’m not sure that the remedy for the first three topics in that link ever did find a good home.)
    – The end result would be much better if the whole community was committed to generating great content in these few focused, accessible locations together.

24 Likes

In fact, I often observe that some of the most popular frameworks, style guides, etc. are highly opinionated!

5 Likes

Like the Black formatter for Python. It is minimally configurable and very popular.

I think it gets its name from a quote attributed to Henry Ford: “the customer can have the car in any color they want, as long as that color is black.”

1 Like

I’m a fan of “The Grand Unified Theory of Documentation” described on Divio’s website:

8 Likes

I would just say “Write documentation PRs”. If you see a blog post and want that content in the documentation, just create a pull request to modify the documentation. For example,

1 Like

Note that if you copy/modify someone else’s blog post and create a documentation PR from it, it is likely that you are violating copyright law — please ask permission first!

14 Likes

That is a great breakdown of why different types of documentation are needed! Ideally, we would have all four covered and accessible for the most popular packages, maybe even broken into those groupings.

Taking plotting packages as an example, usually there is a

  • Tutorial → Tutorials
  • Gallery → How-To Guides
  • Manual → Reference
  • API → Reference

That is decent coverage except:

  • Little to no Explanation
  • References not comprehensive - not all methods and possible kwargs/values listed
  • Tutorials not comprehensive - usually just a handful covering the basics

On the other hand, the Pkg, VSCode, JuliaUp, Debugger, and other environment-type documentation is almost entirely Reference, which is why I think they need the most work.

1 Like

This is a good reason why I think it makes more sense for the authors to interact with the core documentation directly, rather than involving a middle man.

Related:

Two points on this.

  1. If the core developers of the package are also the authors of the blog post, a pull request can be a formal way to seek permission to reuse the text or content. If the blog author is a third party, they can also be tagged in the pull request and explicitly asked for permission.

I would be extremely transparent about where content originates from and who wrote it. It’s also possible to restate or paraphrase the content, although this must be done carefully.

  2. While the core developers are likely the most qualified to write the documentation, they probably have the least amount of time to write it. They are likely also the most qualified to write new code.

If there is something I can do to provide a developer with a one-click merge button to advance the state of the project, I figure out how to do it. If I cannot, I create an issue.

To adapt John F. Kennedy’s phrasing: “Ask not what open source core developers can do for you – ask what you can do for your open source core developers.”

Yes, I acknowledge the hurdles above, but I think there are ways to lower them. One tactic I have used is to solicit documentation help here on Discourse.

For example, see

2 Likes

Yes, I’m on board. I just wish the large number of users with a skill level between core_dev and me, that already understand the concepts and are already writing teaching material, would also help.

Maybe simplified GitHub navigation is the best thing to start documenting then.

The only thing I can contribute here is that I couldn’t possibly agree more with the OP!!!

3 Likes

and that’s coming from someone whose package has (some of) the best documentation in the entire Julia ecosystem.

2 Likes

For real, for real! @Datseris got real good documentation skills! I was just watching his Good Scientific Code Workshop this morning to learn to code better: Good Scientific Code Workshop - YouTube

2 Likes

I would say the documentation section of the Good Scientific Code Workshop tries to teach a concept similar to what the OP asks for, and also to what @GregVernon cited. In the workshop I talk about this in terms of “several layers of exposition depth”.

That is, good documentation is a mix of everything: summarizing concepts, teaching via tutorials, real-world examples, and of course, the actual reference.

I can see now that an alternative and attractive way to say the same thing is via the “Grand Unified Theory of Documentation”. So thanks, @GregVernon, for sharing this. I have just added one more slide to the workshop about this concept!

8 Likes

As one of the people behind ModernJuliaWorkflows, I can only agree with @Nathan_Boyer that the ideal format would be additional documentation.
I will outline some of the reasons (more or less valid) that made me choose a blog post instead. Note that I am open to discussion and that nothing is set in stone yet.

Scale

With this project, we’re simultaneously trying to hit a very narrow target (only workflow tools, no domain-specific advice) and a very broad one (all the workflow tools ever created). The corresponding content in the Julia documentation is scattered across many different pages: getting started, documentation, performance tips, workflow tips, FAQ and a whole lot more. Updating all of this in a coherent and synchronous manner, without duplicates or contradictions, would result in a humongous PR that no one would want to review, and it would take literal months.

Centralizing the content in a small blog allows a few writers (for now including @jacobusmmsmit, Adrian Hill and myself) to get it up and running much more easily, until we reach a coherent result. After that, nothing would make me happier than seeing people turn the content of the blog into documentation pages, perhaps in smaller chunks. Consider my permission granted to copy everything from there into the official docs (I’ll also add it to the blog front page).

Modernity

The claim “Modern Julia Workflows” can only hold true for a limited period of time. New packages emerge monthly, and keeping such a list up-to-date is, again, a huge endeavor that I don’t want to commit to.
Many parts in the Julia documentation are outdated but not marked as such. The upside of a blog post is that there is a timestamp: “this was the state of the art in the summer of 2023, but maybe things have changed since then”.

Recognition

I spend lots of time contributing to the Julia ecosystem, and as most of us know, it’s a thankless job. Especially in academia, where open source development is not regarded as worthy of our time or effort. So I must admit, having my name on a blog post (instead of an obscure docs PR) was a sweet and motivating prospect. That is definitely a questionable motive, but 🤷

14 Likes

I also want to add that documentation should be a MANAGED process. Julia is rapidly changing and there are many examples of Julia code, installation instructions, etc. that are outdated. This is not out of the ordinary. It would be really good if those contributors that create examples maintain and update those examples as time goes on. Eventually this will take less and less time as the language and various libraries become more mature.

The Julia manual (like most manuals for programming languages) is more in the Explanation / Reference part of that square. For tutorials and guides, I think those are better off in a separate place, similar to the split between the Rust book (The Rust Programming Language) and the API documentation (std - Rust).

7 Likes

That is a fair perspective, but I think it is all about how you present your work. I do not agree that open source development is not regarded as worthy in today’s academia. It is not regarded as the only worth, but nowadays it is clear that having open source contributions matters positively in academia. Still not as positively as papers, but definitely positively.

I can offer a counter-argument that may make you reconsider:

It’s true that contributing to an existing piece of work means you are not contributing to your own name. However, by contributing to an existing piece, you become part of this piece. Let’s say that 10,000 users (which is thankfully by now a measurable number) use packages A, B, C and you have become a significant contributor to these three packages by doing lots of pull requests. You cannot claim that these packages are your work, but you can claim that you have made them more accessible and improved them in a meaningful way. You can also demonstrate collaboration skills this way: if you contribute often, people get to know you and can vouch for you.

The alternative way is that you do not become part of these packages but you have your own blogpost. First, how do you establish the impact of this blogpost? I guess you could measure the readership, but it is not as concrete as package usage. But in any case, it is unlikely that the readership of a blogpost about a package would be higher than the users of the package itself, so in the numbers game you will be at a loss. In this path, people also get to know you, but it would be hard to vouch for you, because here the people that get to know you are front-end users, while in the previous scenario they are the developers, and therefore their “vouching” counts more when it comes to evaluation.

So, the question is, what would you value more in an imaginary application of yourself to yourself:

  • I have written my own blog posts
  • I am part of something larger

To put it in an academic allegory, which one would you value more:

  • I’ve written my own papers without co-authors but with few citations
  • I am part of papers with a handful of co-authors but with a lot of citations

You can argue for the advantages of either, so I really think it is up to how you present your work. If I were presenting documentation improvements, I would say “I had a lasting impact on the quality and accessibility of a piece of software by directly improving its documentation (which is the starting point for most users). This way I also learned to work with the core dev team.” For me personally, that is a stronger statement than “I have a clear knowledge of a piece of software and I can share this knowledge with others that may come across my blog”.

I think it is a good idea to add the same timestamps to the examples in documentation, if you can foresee that they may break in the future. I haven’t done it so far, but now that you mention it, I definitely should. Or perhaps I could use Documenter.jl/Literate.jl to put an automatic stamp of when the code was last run.

5 Likes

That is interesting. I would also agree that tutorials should be separated from the reference/API; however, I am not sure I understand the argument for why they have to be on completely separate websites. The tutorials can be one page, the examples another page, the API a third page, all part of a single documentation site and all cross-linked with each other. @kristoffer.carlsson what’s your reasoning for preferring the separate places?

6 Likes

I don’t think they need to be on separate websites, but I do think the Rust community really benefits from having a place to point newbies that is solely focused on on-ramping. There is no tab at the top that links to technical API stuff, so there are fewer places for someone to accidentally go out of their depth and feel overwhelmed.

Doesn’t have to be a different URL per se, but the UI does matter.

2 Likes