Write documentation PRs not blog posts

Like the black formatter for Python. It is minimally configurable and very popular.

I think it gets its name from a quote attributed to Henry Ford: “the customer can have the car in any color they want, as long as that color is black.”


I’m a fan of “The Grand Unified Theory of Documentation” described on Divio’s website:


I would just say “Write documentation PRs”. If you see a blog post and want that content in the documentation, just create a pull request to modify the documentation. For example,


Note that if you copy/modify someone else’s blog post and create a documentation PR from it, it is likely that you are violating copyright law — please ask permission first!


That is a great breakdown of why different types of documentation are needed! Ideally, we would have all four covered and accessible for the most popular packages, maybe even broken into those groupings.

Taking plotting packages as an example, usually there is a

  • Tutorial → Tutorials
  • Gallery → How-To Guides
  • Manual → Reference
  • API → Reference

That is decent coverage except:

  • Little to no Explanation
  • References not comprehensive - not all methods and possible kwargs/values listed
  • Tutorials not comprehensive - usually just a handful covering the basics

On the other hand, the Pkg, VSCode, JuliaUp, Debugger, and other environment-type documentation is almost entirely Reference, which is why I think they need the most work.


This is a good reason why I think it makes more sense for the authors to interact with the core documentation directly, rather than involving a middle man.

Related:

Two points on this.

  1. If the core developers of the package are also the authors of the blog post, a pull request can be a formal way to seek permission to reuse the text or content. If the blog author is a third party, they can also be tagged in the pull request and explicitly asked for permission.

I would be extremely transparent about where content originates from and who wrote it. It’s also possible to restate or paraphrase the content, although this must be done carefully.

  2. While the core developers are likely the most qualified to write the documentation, they probably have the least amount of time to write it. They are likely also the most qualified to write new code.

If there is something I can do to provide a developer with a one-click merge button to advance the state of the project, I figure out how to do it. If I cannot, I create an issue.

To adapt John F. Kennedy’s phrasing: “Ask not what open source core developers can do for you – ask what you can do for your open source core developers.”

Yes, I acknowledge the hurdles above, but I think there are ways to lower them. One tactic I have used is to solicit documentation help here on Discourse.

For example, see


Yes, I’m on board. I just wish the large number of users with a skill level between core_dev and me, who already understand the concepts and are already writing teaching material, would also help.

Maybe simplified GitHub navigation is the best thing to start documenting then.

The only thing I can contribute here is that I couldn’t possibly agree more with the OP!!!


and that’s coming from someone whose package has (some of) the best documentation in the entire Julia ecosystem.


For real, for real! @Datseris has really good documentation skills! I was just watching his Good Scientific Code Workshop this morning to learn to code better: Good Scientific Code Workshop - YouTube


I would say the documentation section of the Good Scientific Code Workshop tries to teach a concept similar to what the OP asks for, and also to what @GregVernon cited. In the workshop I talk about this in terms of “several layers of exposition depth”.

That is, good documentation is a mix of everything: summarizing concepts, teaching via tutorials, real-world examples, and of course, the actual reference.

I can see now that an alternative and attractive way to say the same thing is via the “Grand Unified Theory of Documentation”. So thanks @GregVernon for sharing this, I have just added one more slide to the workshop about this concept!


As one of the people behind ModernJuliaWorkflows, I can only agree with @Nathan_Boyer that the ideal format would be additional documentation.
I will outline some of the reasons (more or less valid) that made me choose a blog post instead. Note that I am open to discussion and that nothing is set in stone yet.

Scale

With this project, we’re simultaneously trying to hit a very narrow target (only workflow tools, no domain-specific advice) and a very broad one (all the workflow tools ever created). The corresponding content in the Julia documentation is scattered across many different pages: getting started, documentation, performance tips, workflow tips, FAQ and a whole lot more. Updating all of this in a coherent and synchronous manner, without duplicates or contradictions, would result in a humongous PR that no one would want to review, and it would take literal months.

Centralizing the content in a small blog allows a few writers (for now including @jacobusmmsmit, Adrian Hill and myself) to get it up and running much more easily, until we reach a coherent result. After that, nothing would make me happier than seeing people turn the content of the blog into documentation pages, perhaps in smaller chunks. Consider my permission granted to copy everything from there into the official docs (I’ll also add it to the blog front page).

Modernity

The claim “Modern Julia Workflows” can only hold true for a limited period of time. New packages emerge monthly, and keeping such a list up-to-date is, again, a huge endeavor that I don’t want to commit to.
Many parts in the Julia documentation are outdated but not marked as such. The upside of a blog post is that there is a timestamp: “this was the state of the art in the summer of 2023, but maybe things have changed since then”.

Recognition

I spend lots of time contributing to the Julia ecosystem, and as most of us know, it’s a thankless job. Especially in academia, where open source development is not regarded as worthy of our time or effort. So I must admit, having my name on a blog post (instead of an obscure docs PR) was a sweet and motivating prospect. That is definitely a questionable motive, but :person_shrugging:


I also want to add that documentation should be a MANAGED process. Julia is rapidly changing and there are many examples of Julia code, installation instructions, etc. that are outdated. This is not out of the ordinary. It would be really good if those contributors that create examples maintain and update those examples as time goes on. Eventually this will take less and less time as the language and various libraries become more mature.

The Julia manual (like most manuals for programming languages) is more in the Explanation / Reference part of that square. For tutorials and guides, I think those are better off in a separate place, similar to the split between the Rust book (The Rust Programming Language) and the API documentation (std - Rust).


That is a fair perspective, but I think it is all about how you present your work. I do not agree that open source development is not regarded as worthy in today’s academia. It is not the only thing that is valued, but nowadays it is clear that having open source contributions matters positively in academia. Still not as positively as papers, but definitely positively.

I can offer a counter-argument that may make you reconsider:

It’s true that contributing to an existing piece of work means you are not contributing to your own name. However, by contributing to an existing piece, you become part of this piece. Let’s say that 10,000 users (which is thankfully by now a measurable number) use packages A, B, C and you have become a significant contributor to these three packages through many pull requests. You cannot claim that these packages are your work, but you can claim that you have made them more accessible and improved them in a meaningful way. You can also demonstrate collaboration skills this way: if you contribute often, people get to know you and can vouch for you.

The alternative way is that you do not become part of these packages but you have your own blogpost. First, how do you establish the impact of this blogpost? I guess you could measure the readership, but it is not as concrete as package usage. But in any case, it is unlikely that the readership of a blogpost about a package would be higher than the users of the package itself, so in the numbers game you will be at a loss. In this path, people also get to know you, but it would be hard to vouch for you, because here the people that get to know you are front-end users, while in the previous scenario they are the developers, and therefore their “vouching” counts more when it comes to evaluation.

So, the question is, what would you value more in an imaginary application of yourself to yourself:

  • I have written my own blog posts
  • I am part of something larger

To put it in an academic allegory, which one would you value more:

  • I’ve written my own papers without co-authors but with few citations
  • I am part of papers with a handful of co-authors but with a lot of citations

You can argue for the advantages of either, so I really think it is up to how you present your work. If I were to present documentation improvements, I would say “I had a lasting impact on the quality and accessibility of a piece of software by directly improving its documentation (which is the starting point for most users). This way I also learned to work with the core dev team.” For me personally, that is a stronger statement than “I have a clear knowledge of a piece of software and I can share this knowledge with others that may come across my blog”.

I think it is a good idea to add the same kind of timestamps to the examples in documentation, if you can foresee that they may break in the future. I haven’t done it so far, but now that you mention it, I definitely should. Or perhaps I could use Documenter.jl/Literate.jl to put an automatic stamp of when the code was last run.


That is interesting. I would also agree that tutorials should be separated from the reference/API. However, I am not sure I understand the argument for why they have to be on completely separate websites. The tutorials can be one page, the examples another page, the API a third page, all part of a single documentation and all cross-linked with each other. @kristoffer.carlsson what’s your reasoning for preferring the separate places?


I don’t think they need to be on separate websites, but I do think the Rust community really benefits from having a place to point newbies that is solely focused on on-ramping. There is no tab at the top that links to technical API stuff, so there are fewer places for someone to accidentally go out of their depth and feel overwhelmed.

It doesn’t have to be a different URL per se, but the UI does matter.


You can access all the documentation from the manual or registered packages via JuliaHub.com:

https://juliahub.com/ui/Search?q=&type=docs

For on-ramping newbies, we have Julia Academy:


You make a strong case @Datseris, and you did convince me. From now on, my end goal will be to add this to the official documentation.

However, I think there are two separate challenges to distinguish here:

  1. Short term: writing the content
  2. Medium term: sharing the content

Designing a blog post is not necessarily the ultimate answer for 2, only the motivation I found for 1. Beyond the name recognition (which I agree is pretty shallow), there is also the feeling of seeing the results of your efforts in real time. It’s very pleasant to be able to write a few lines, commit and then see the website a few minutes later.
If I set out to contribute it all to the official docs from the start, this would translate into 20 issues and as many PRs open on different pages, with review delays anywhere between a week and a year. Knowing myself, I would probably lose interest after a while, and far less would get done in the short term.

I don’t know how other people think, but I believe this instant reward and quick iteration ability is among the main drivers of the ecosystem fragmentation. Except in this case it may not be so detrimental. After all, duplicating docs only makes them easier to find. And the task of copy-pasting and adapting an existing blog into the docs is much less daunting than writing it all from scratch.

To sum up, I propose to keep drafting the posts in a separate repo, so that they can take shape quickly. Then I would like to add them to the Julia blog, to make them easily discoverable in the short term. And in the medium term, I could coordinate efforts to transfer them to the docs, but importantly they would already be available to guide beginners in the meantime. How does that sound?
