Suggestion to slightly improve Julia development

That may indeed be the best way to handle documentation from a quality standpoint. I have no real gripes. I just wanted to support the initial premise that novice contributors may find the slow process discouraging.

I am not quite sure what you are talking about. The first PR was reviewed the same day and merged in 6 days. Also, you got a warm welcome and quite a few people expressed satisfaction with your PR with :heart: emojis.

In what sense was this slow or discouraging?

1 Like

Don’t get me wrong; the initial response was wonderful and very encouraging. (The second one not so much, since it wasn’t linked in a Discourse thread like the first.) I am talking about the time it takes before the general audience gets to see your contributions. It will be more than half a year before other newbies looking for the difference between a single and double arrow have a quick way to find that. This is in contrast to the wiki model brought up earlier by @Oscar_Smith, where others have access to your changes immediately. I wonder if one of @Ronis_BR’s suggestions would strike a balance between speed and quality.

3 Likes

And the second has been open for more than 2 months, for a documentation change.

You may think that everything is normal. However, at least 6 people said that the Julia community has problems when a newcomer wants to start contributing. The #my-first-pr channel in Slack seems completely empty. I know a very good developer who just stopped contributing to Julia for similar reasons.

Eternal denial does not help. I am pretty sure you do not think that the current model is perfect, do you? We are trying to discuss minor improvements to make things better, not an entirely new development model. I proposed a workflow that you said would not work, but apparently based on things I did not say.

That’s what I hope. Let’s wait for some feedback about that proposal.

3 Likes

I am under the impression that open PRs like this are revisited on a regular basis, especially before releases. If it gets merged before the release, I fail to see what the problem is.

I am sure that certain workflow changes, especially the solution adopted by Rust pointed out above, could improve contributor experience, but by and large I think that response times are reasonable for a project of this complexity. As an occasional contributor to Julia (Base/standard libs), I find it well managed in the context of comparable FOSS projects and so far my experience has been fairly positive.

Incidentally, I think that Eternal Denial would be a great name for a rock band.

4 Likes

Docs, maybe, but there are cases of minor things that took almost a year. There are other cases in which the user just closed the PR and forgot about it.

Given my contributions to other open-source projects, I sometimes find that it can be better. Nobody can ask anything from core devs. They are extremely busy because Julia has amazing features to be implemented. What I am proposing is a safe and good way for the community to help them. What’s wrong with that?

Indeed! I play guitar. Maybe we can start a rock band using video calls? :smiley:

4 Likes

This is a curse, not a blessing though. With packages, you can use the lovely package manager to declare compatibility, you can use Manifests to lock package versions in place, you can yank bad releases, etc. With Julia itself, you can’t, and unfortunately, for now, this also applies to the standard libraries.

I also think that moving things out of Julia will be a big help for the claimed issue here because it will move more development out of the Julia repo into repos with higher velocity and would allow for more focused PRs in the Julia repo itself.

12 Likes

We don’t backport documentation PRs because backporting changes is a lot of work and adding doc PRs to that workload would make the work situation that much worse. If there could be a community effort to backport doc PRs to stable branches as appropriate then perhaps that could be considered. However, my experience is that backporting and testing release branches is tedious, painstaking work that doesn’t seem super well suited to crowdsourcing. Perhaps there is some approach that would make it easier.

3 Likes

Cxx is a very special case because it’s so deeply entangled in Julia’s internals, which is frankly a major design problem with the library from the start. When the compiler internals change, Cxx inevitably needs to be updated too. Since Keno doesn’t have time to update Cxx and no one else is willing/able to do so, it doesn’t get updated. The solution would be to introduce stable compiler APIs on top of which Cxx could be built and then test that API and not break it. However, that again, is something that requires someone roughly Keno-equivalent to do. I’m not sure why Cxx was chosen as an example here — most packages don’t need to use compiler internals and can be implemented using Julia’s public API, which does not break.

8 Likes

I am sorry for intervening, but this is actually an interesting question: is waiting 1 year an outlier or not? I took the liberty of doing some data exploration. All scripts and data can be found in https://github.com/Arkoniak/JuliaRepoReport.jl; the report itself is located at ⚡ Pluto.jl ⚡. The raw data is in jlgh.csv if someone wants to explore it and dig out something else.

As a sanity check, here are some overall statistics for the Julia repo (as of 2020-11-28):

  • Number of PRs: 19603
  • Number of merged PRs: 15610
  • Number of closed (non-merged) PRs: 3137
  • Number of mergers (those who can accept and merge PRs): 65
  • Number of PR authors: 1400

The maximum number of PRs was created in 2017: 3416 (~285 PRs per month). Currently it is 2466 (~224 per month), so activity is somewhat lower than it was at the peak, but it is actually higher than it was in 2019.

The top author is Jeff Bezanson with 1569 PRs (~8% of all PRs), and the top merger (I do not know the correct name, maybe repo maintainer?) is also Jeff Bezanson, with 3343 merged PRs (~21% of all merged PRs).

Now for the sad part. After a low in 2019, the number of PRs is growing again, and so, more or less, is the number of PR authors. On the other hand, the number of maintainers is basically not growing: only 5 new maintainers (out of 65) were added during the last 3 years. In 2020 we have 30 active maintainers with 82 PRs per maintainer per year. That’s ~1.5 PRs per week, but of course PRs are distributed very non-uniformly between maintainers: some handle a few PRs per year and others do the rest.
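For anyone who wants to re-derive the rounded figures above, here is a minimal sketch in Python using only the counts quoted in this post. One assumption is mine and not stated in the post: the 2020 monthly rate is averaged over about 11 months, since the snapshot is from late November. The "still open" count is simply what remains after subtracting merged and closed PRs from the total.

```python
# Counts quoted in the post (snapshot of 2020-11-28).
total_prs = 19603
merged_prs = 15610
closed_unmerged_prs = 3137
prs_2017 = 3416
prs_2020 = 2466
top_author_prs = 1569
top_merger_prs = 3343
active_maintainers_2020 = 30

# Assumption (not stated in the post): 2020 is averaged over ~11 months.
months_2020 = 11

still_open = total_prs - merged_prs - closed_unmerged_prs
per_month_2017 = prs_2017 / 12
per_month_2020 = prs_2020 / months_2020
top_author_share = top_author_prs / total_prs
top_merger_share = top_merger_prs / merged_prs
per_maintainer_year = prs_2020 / active_maintainers_2020
per_maintainer_week = per_maintainer_year / 52

print(f"PRs neither merged nor closed: {still_open}")
print(f"PRs per month, 2017: {per_month_2017:.1f}")
print(f"PRs per month, 2020: {per_month_2020:.1f}")
print(f"top author share of all PRs: {top_author_share:.1%}")
print(f"top merger share of merged PRs: {top_merger_share:.1%}")
print(f"PRs per active maintainer per year: {per_maintainer_year:.1f}")
print(f"PRs per active maintainer per week: {per_maintainer_week:.2f}")
```

The rounded results line up with the ~285, ~224, ~8%, ~21%, 82, and ~1.5 quoted above.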

New authors rarely return: during the last three years, more than half of the authors made only one PR.

Waiting times are growing steadily. In 2014 it took less than 22 hours for half of PRs to be closed; now this number is 70 hours.

And for the original question itself: in 2019 more than 12% of PRs were closed after 1 year of waiting or were not closed at all. I think that this number is too big to be considered an outlier.
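To make the definition behind that 12% concrete, here is a minimal Python sketch of how such a share could be computed. The rows are invented for illustration and are not taken from jlgh.csv; the tuple layout (created, closed-or-None) and the helper name `waited_over_a_year` are hypothetical and need not match the actual scripts in the repo.

```python
from datetime import datetime, timedelta

# Hypothetical rows: (created_at, closed_at); closed_at is None for PRs still open.
# These values are made up for illustration and are NOT taken from jlgh.csv.
prs = [
    (datetime(2019, 1, 10), datetime(2019, 1, 11)),
    (datetime(2019, 2, 1), datetime(2019, 2, 20)),
    (datetime(2019, 3, 5), datetime(2020, 6, 1)),   # closed after more than a year
    (datetime(2019, 4, 7), None),                   # never closed
    (datetime(2019, 5, 9), datetime(2019, 5, 9, 12)),
]

ONE_YEAR = timedelta(days=365)


def waited_over_a_year(created_at, closed_at):
    """True if the PR was never closed, or was closed more than a year after creation."""
    return closed_at is None or (closed_at - created_at) > ONE_YEAR


slow = sum(waited_over_a_year(created, closed) for created, closed in prs)
share = slow / len(prs)
print(f"{slow} of {len(prs)} PRs ({share:.0%}) waited over a year or were never closed")
```

Note that this definition lumps "closed late" and "still open" together, which is how the figure above is phrased.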

Maybe I’ve made a mistake somewhere, and of course this is a rather general overview: I didn’t take into account the quality of PRs, the fact that they can be reopened or duplicated, and so on. Still, I think these numbers show something. If I were a project manager, I would say that the situation is not critical yet, but these are not very good signs. And this is one of those nasty situations that can be postponed indefinitely until it’s too late. Maybe some peripheral PRs could be moved to volunteers, as was proposed, or maybe new maintainers could be trained; I do not know. But currently it looks like in a few years the core team (which can shrink due to natural causes) will be overwhelmed by the flow of new PRs, or they’ll have to restrict PR access to a chosen group of contributors.

On the other hand, the situation is not absolutely grim: 2020 is better than 2019, at least by some parameters, but I think it is not enough.

43 Likes

That’s really awesome! Thanks for doing that analysis @skoffer! I’ll pull out the graph I was most interested in seeing from your Pluto vis:

[image: time-to-merge in hours, by quartile]

That’s the time-to-merge (in hours on the y axis) by quartile. In other words, 50% of PRs were merged within 70 hours in 2020. 75% of PRs were merged within ~2 weeks. On the flip side, 1 in 5 PRs are open for a month or longer. The y axis here is the fraction of PRs:

image
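For readers who want to reproduce this kind of summary on their own data, here is a small Python sketch of the quartile calculation. The list of hours is invented for illustration (it is not the Julia data), and `statistics.quantiles` with `method="inclusive"` is just one reasonable convention; the Pluto notebook may compute its quantiles differently.

```python
import statistics

# Hypothetical time-to-merge values in hours; made up for illustration only.
hours_to_merge = [2, 5, 20, 70, 100, 300, 900]

# Three cut points splitting the data into four equal-sized groups.
q1, q2, q3 = statistics.quantiles(hours_to_merge, n=4, method="inclusive")

# Share of PRs that took a month (30 days) or longer.
month_in_hours = 30 * 24
share_over_month = sum(h >= month_in_hours for h in hours_to_merge) / len(hours_to_merge)

print(f"25% merged within {q1} h, 50% within {q2} h, 75% within {q3} h")
print(f"share open for a month or longer: {share_over_month:.0%}")
```

The middle cut point is the median, i.e. the "50% of PRs were merged within N hours" figure.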

10 Likes

While the numbers are interesting, I’m not sure looking at time to close is super useful. PRs aren’t actively closed even if they have been more or less decided against, so if more PRs are opened that are unlikely to get merged (due to the content of the PR), this measurement will increase, even though these PRs might have gotten properly reviewed or commented on.

As a side note, those numbers presented look better than I personally expected.

4 Likes

I’ll just also note for any of you feeling frustrated by a stalled PR:

don’t be afraid to bump it!

Seriously. If it’s been a week with no action, a friendly bump is often all it takes to kickstart it. Speaking personally, I look at issues like this: julia/issues?q=sort:updated-desc. PRs easily fall back and get lost in the slog… we’re not quite at an inbox zero here. Also now that the CI woes are improving, closing and re-opening it will re-run CI.

9 Likes

I would definitely second this. While too much bumping can be annoying, some is definitely necessary to get stuff merged. That doesn’t mean we don’t care, just that we’re busy.

3 Likes

Yes, one should treat this sort of statistics very carefully, because it’s easy to deceive yourself. Usually it’s better to look at all the numbers together and see them as one big picture. Then different metrics will complement each other.

But I should add that “half of closed PRs” means exactly that: closed PRs. PRs that have been decided against but, for whatever reason, not closed usually fall into the other half and mostly affect the 75%-95% quantiles.
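A toy illustration of that point, as a Python sketch with invented numbers (not the Julia data): adding a handful of very long-lived PRs barely moves the median but shifts the upper quantiles a lot. The 90th percentile here uses `statistics.quantiles` with the "inclusive" method, which is an assumption about convention rather than what the report uses.

```python
import statistics

# Hypothetical waiting times in hours for PRs that were handled promptly.
prompt = list(range(10, 210, 10))   # 10, 20, ..., 200

# Hypothetical stale PRs: decided against but left open for a very long time.
stale = [5000, 6000, 8000]


def median_and_p90(values):
    """Return (median, 90th percentile) using the 'inclusive' quantile method."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return statistics.median(values), deciles[8]


median_before, p90_before = median_and_p90(prompt)
median_after, p90_after = median_and_p90(prompt + stale)

print(f"median: {median_before} -> {median_after}")
print(f"90th percentile: {p90_before} -> {p90_after}")
```

So a growing backlog of effectively dead PRs would show up mainly in the upper quantiles, not in the median.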

3 Likes

What an amazing statistical analysis @Skoffer! Thank you very much!

1 Like

Would this conversation be in scope for a triage meeting? At this point, it seems like there are a few concrete ideas that have been thrown around as possible improvements to our current developer processes, and it would be a shame to let this drift away without making improvements.

5 Likes

This three year old blog post of mine has some data analysis of closed/open prs and issues. I should update it up to today’s date.

Ah, the good old days of only 1.8k open issues…

18 Likes

Very good analysis @kristoffer.carlsson!

By the way, this GitHub package is amazing!

1 Like

After reading the “Julia has an issue…” blog post, I’ve added a few updates to the initial report.

  1. “State of the first PR of new authors”: what usually happens with the first PR? It is a good thing that the acceptance rate has been more or less constant over the years, though in recent years PRs are more often left open instead of rejected (I am not sure this is the correct word; it means the PR was closed without being merged). So, yes, combined with the fact that more than half of new authors produce only a single PR, it looks like, from their perspective, their PRs are left in limbo.

  2. A more interesting section is “Core and non-core developers”. It shows the number of PRs produced by core developers (they were also called mergers and maintainers above; sorry for the shifting terminology). One can see that over the years the number of PRs from the non-core team has been growing almost linearly, which is really good, and currently 40% of all PRs are from non-core contributors. On the other hand, waiting times for the core and non-core teams are significantly different, which shows especially in 2019. Only half of PRs in 2019 were closed within 215 hours (~9 days); the other half waited longer than that.

Now, these numbers should be considered carefully, because there are multiple reasons why non-core PRs wait longer: they may lack proper documentation, tests, or other important things. So this difference is natural, but of course it would be better if waiting times could be made shorter.

  3. I’ve added a couple of other repositories for comparison. There was no particular strategy; the choice is more or less random. The full list is

    Julia: ⚡ Pluto.jl ⚡
    Rust: ⚡ Pluto.jl ⚡
    Chapel: ⚡ Pluto.jl ⚡
    Nim: ⚡ Pluto.jl ⚡

The Rust report is partially broken because of their maintenance bot.

7 Likes