Suggestion to slightly improve Julia development

(This is a discussion I started on Slack but I think this is a better place)

Hi!

First of all, the Julia community is awesome! Amazing! People are very receptive and really help when the users need. However, I have a small critic (a constructive one, I hope) about the development of Julia itself. I thought it was something isolated, but I have heard people complaining about the same thing at least four times. Sometimes, when someone new tries to do their first contribution (PR, issue, etc.), it can take very, very long time to be replied or approved. See, for example, this minor modification I did:

https://github.com/JuliaLang/julia/pull/33755.

It took roughly 1 year. Unfortunately, for many people, this is discouraging, and they maybe never try to help again. Of course, there is a perfect explanation: core devs are very busy because Julia is very new and they have many exciting features to release, which are very difficult to implement. Hence, wouldn’t it the time to consider increasing the number of people who can review / approve those simple changes? Another good example is this one:

https://github.com/JuliaLang/julia/pull/38447.

It is just a documentation change, yet it has been opened for 13 days without any feedback at all.

If devs decide that they can increase the number of people who can help with those minor issues, then I am available to help, if, of course, the devs think I can to do this.

34 Likes

Hey Ronis, thank you for your contributions! Hearing feedback like this is critical so thank you again for posting.

I think that rather that looking at isolated incidents, it would make sense to do some sort of analysis broadly on all of the PR’s made by non-core contributions. This way we get a holistic and more complete understanding of if this is a widespread issue or not.

We could also perhaps setup a welcome bot to say thanks snd hello to first time contributors (this seems like less work than doing that analysis).

It’s also worth mentioning that the process right now to become a contributor/maintainer is organic. In the case of Python (given it’s been around longer and has more contributors) there is a formal nomination and approval process to become a “maintainer”. I have no doubt that at some point we will get there but it doesn’t seem necessary right now (in my view). Have you seen folks making regular and quality contributors who are bot being given elevated privileges (I don’t watch the PR’s on this closely so asking genuinely)?

3 Likes

Hi @logankilpatrick,

Thanks for the feedback!

Yes, this seems reasonable. I really cannot state for sure whether it is a widespread issue or not. Just four people comment about something related to me, and I found some very old PRs without any feedback whatsoever. PRs that are minor, like improvements in English, small additions, etc. I also experienced this kind of radio silence at least twice (however I am used to reviewers taking 1 year to return my papers, so I was fine with this :sweat_smile:).

Makes sense. Anyway, I am still a volunteer to help you if you need :slight_smile:, and I am sure that there is a lot of people that would happily divide the workload on those minor issues, leaving the core devs to implement those great new features in the future versions!

I think I did not understand. Can you please clarify?

4 Likes

What I was eluding to is that there may not be enough people who have hit that organic level of trust to be promoted to the role of a maintainer. See here: https://github.com/JuliaLang/julia/graphs/contributors?from=2019-12-25&to=2020-11-28&type=c and based on my quick glance thought, most of the folks listed here already have commit access or have not been making contributions consistently.

2 Likes

Oh, I see! But let’s assume that I am right: there are some problems that make someone new to truly get involved into Julia development. In this case, those metrics would not show the problem because the problem itself causes those metrics.

What I am trying to propose is to create some kind of “Team B” or whatever one wants to call it, to handle only minor PRs. Something like this:

When a PR is opened with a minor thing (like a documentation update, a small feature, etc.), it is marked with a specific label. In this case, if two reviewers from this “Team B” approves it, then someone with write access accepts without questioning or reviewing.

If I am right, then it will alleviate the work of the core devs, and Julia will get slightly better :slight_smile:

You might question: do we have people that fall in this Team B category? Of course! I do not have the expertise to review something related to the compiler, or an entirely new library to the stdlib, but I am confident that I can review things like improvements in printing, a minor improvement in computing some mathematical function, etc.

Now, let’s suppose that I am wrong, then a slight development change like this will not harm at all since Team B does not have write access. In fact, my proposal would be way better if Github has a method to allow someone without write access to approve only specified PRs, but I think this is not available.

6 Likes

Anyone with a GitHub account can review PRs, you don’t need any elevated permissions for that. Reviews from non-committers are often helpful, but quite rare. Pressing the actual button is not the difficult and/or time consuming part here. If you want to help out core devs feel free to review PRs where you think your input would be helpful.

16 Likes

I think one easy improvement would be to officially document that reviews from non-maintainers are a productive way of contributing.

10 Likes

Hi @fredrikekre,

I think you completely missed my point. I am not talking about reviews as in Github reviewer system, but reviews as a trusted group of users. Let’s say user @alsdf12710293 logs into the Github account and approves a PR. Will you, with write access, blindly accept it? No. You will open, check the modifications, look at the test results, if those fails, verify if it is related to the PR, etc. This is the time-consuming part.

IMHO, if you have a set of trusted reviewers, you will look at the PR, see the approval of two trusted reviewers, and then just blindly accept the PR. In this case, I agree that the press of a button is not time-consuming. Julia community is full of people who can do this for minor things, but if they are not doing, as you said so, something may be wrong.

This is precisely my point. Only to me, four people commented that they tried to help Julia in some ways and just got ignored. If those newcomers to the community think that their time is being wasted, they will just stop…

Can those four cases be outliers? Sure! But it can be something else.

9 Likes

IMHO, the problem is not the documentation, is the action. Let’s say someone goes now into Github and approves (correctly) 20 PRs. How many of those will get merge “fast”? It can be really frustrating for some newcomers to take time, improve something (which can be a typo), and submit to just see months go by without feedback for a trivial modification.

6 Likes

No of course not. But if @alsdf12710293 reviews and says “Hey, you made a typo here” or “Maybe it would be better do implement it this way?” and the PR author changes something, that is helpful. When a core developer gets time to look at the PR it is in a better shape, and it makes it more likely to get merged.

I would never blindly merge a PR, even if it was approved by Stefan or Jeff.

Again, it is not about seeing “approved” from random users, but if random user spots a mistake and comments on it, that means less work for core developers.

10 Likes

Those cases I described took months for the core developer to even look at it. Fun fact, one of them was approved with just a rebase. IMHO, what is missing now is manpower to look, review, and approve, not manpower to make “better” PRs. Better PRs will come with time, as more people are attracted to the community.

But that’s precisely the problem. I think we are running in circles here.

It seems people with write access are overwhelmed, and some trivial things are taking too long (see the two I listed above). Again, my idea is to increase the number of trusted people who can approve a PR to be merged. I am not advocating to increase the number of people with write access to the repository for obvious reasons. Just to create a group of people with the right (and responsibility) to say “yes, this PR is good to be merged”. In this case, someone can just merge; no more reviews will be necessary (theoretically, a bot could do this).

I am just trying to make a proposal to improve development. My suggestion is based on what I have always seen in the open-source projects I worked on. For example, when a young student wants to work with me, I will not expect a full thesis on orbital mechanics. They need to learn Newton’s laws first. But they do need positive feedback, so the interest is not lost. You need to agree with me that one year of no feedback for a minor modification does not seem good feedback, right?

Anyway, if you think that everything is fine as it is, OK, no problem.

8 Likes

Hi @logankilpatrick. For those us who are part of the the community for such a long time, this issue is real and widespread. I am 100% with @Ronis_BR here.

Contributing to the language (by any means) has not been a pleasant experience. I have colleagues, who like me, stopped submitting issues and PRs because either (1) the core devs were not friendly or (2) the PRs got forgotten as we are not part of the “core team”.

Anyways, I am just showing my support to @Ronis_BR because I’ve been there before, and it is quite annoying to fight this battle alone when the core team doesn’t acknowledge the issue.

19 Likes

Sure, but anyone can already contribute to this?

Okay, but as I said:

So consider that core developer X has time to look at your PR once a week. If the PR has a typo and X comments on that, then there might be another week until that PR is looked at. If helpful random user had already commented about this typo, and the author had fixed it, perhaps it could have been merged the first time the PR was looked at.

But if someone has the power to approve a PR and that will be merged blindly, that is essentially write access. And as I said before, for me personally, I would always review it myself anyway.

I didn’t say that, I said:

5 Likes

I think it’s pretty understood that Base reviews are pretty slow in comparison to other areas of the ecosystem. There are trade-offs to this, so it’s not necessarily bad per se, but it is one of the reasons why a lot of development prefers to take place in packages. Even core developers will create things in packages before opening a PR to add it to Base in a lot of cases, so it’s a pretty widely acknowledged phenomena.

But it’s not necessarily bad. It used to be quicker too, but Julia is fairly well-established now so any changes need a lot more scrutiny to not be breaking. And there’s a whole process to ensure that the changes in Base don’t break packages, with the whole PackageEval system. That stability of the language is one of the key things that makes its modern development work so well, and I think it’s something to simultaneously appreciate.

I think one thing that leads to this feeling though is that @Ronis_BR has had good instances with SciML, where in SciML you can pretty much expect your PR to get reviewed in a day or two and likely merged. But that’s because SciML has a fairly active BDFL (me) interfacing with the “outside” community on a very regular basis. I can tell you from this that it takes a ton of time and it won’t scale to the size of the Julia language, which is why in a lot of SciML I am training officers to really take ownership of specific areas, like ODEs and parameter estimation, so that it’s possible to keep up with the influx of PRs (which during the summer is hitting about ~20 reviews a days due to the GSoCs!).

I think from my experience in the package world, that’s what really matters: ownership. And you see it in this discussion as well: people need to feel like they have ownership in order to merge. Giving more people that feeling makes things move faster, but if ownership is too relaxed then you might have people merging who might not know all of the consequences of the code.

I think that one way to do this right is to have more officers and designations of ownership in certain areas. If you look around enough you can figure out who “owns”/maintains LinearAlgebra, but it’s not very accessible. It would be nice to have a file that gives a little bit more structure to this. That way newcomers could more easily figure out who to ping and get help getting things merged. Of course that can be abused, but something like, “hey it’s been 2 weeks, @kslimes master of tests, can you help me get this unit test merged?”. I think it would 2 things:

  1. help people who are core maintainers triage what PRs to look at
  2. figure out where we are lacking maintainers

Who’s the person to ping if it’s just a documentation change? I wouldn’t actually know right now. If there is no one, then maybe this could be something that a big of NumFOCUS funds could help out on? (Especially since docs are probably more accessible than other areas. This wouldn’t necessarily be possible in all aspects of the project)

So there’s a reason for this, but it can probably also improve.

34 Likes

Thank you for your feedback! Just for the record, @juliohm was not one of those people I talked to. The feedback @juliohm provided here seems very similar to what I heard.

Yes, I understood your point. Sorry if I am not very clear to what I am trying to say. People are helping, but the feedback is not being good for the newcomers. We are loosing potential contributors for a very small reason.

Yes, but this will happen only for minor changes. That’s why I suggest to use a label or something. I really think that Julia community has grown large enough so that there are people out there that can handle some of the minor modifications without problem.

4 Likes

I think there’s several issues:

  1. These days API additions need to be carefully considered, since we don’t have the luxury to undo things in the next minor release anymore
  2. CI has been a mess for the past couple of months, causing “minor” PRs to build up, because people are not comfortable merging PRs that fail CI
  3. People are busy (November has been particularly bad, with several core folks distracted on other issues)
  4. Things just fall through the cracks
  5. We’ve been holding a lot of things for 1.6, but 1.6 release process has turned out to stretch on quite a bit

Overall, I do agree that we have a bit of a velocity problem, particularly these past couple months, but I don’t think people with commit access is really the bottle neck here. In particular, I’m pushing hard on getting CI back in shape, which should hopefully ease some burden. Also, as @fredrikekre said, review comments by non-committers are perfectly reasonable and also a good way for folks to prove that they are to be trusted with commit access.

22 Likes

Thank you for your comments @ChrisRackauckas! I fully agree with your points.

Furthermore, people that are in charge of some parts can recruit others to help. This approach is very similar to what we have in openSUSE.

5 Likes

Hi @Keno

Thanks for your feedback. Just to clarify two points:

  1. I did not mean to add more people with commit access, but to add more trusted reviewers for minor issues. It is like “commit access for minor changes”. Of course, API additions are far away from what I am calling “minor”.
  2. I understand that this November is being bad (this entire year, actually), but I am afraid this perception from newcomers is not restricted to this year only.
6 Likes

My sense is that minor PRs without issues will start getting merged again once CI is in better shape. They suffer particularly from CI instability, because the time required to validate that they don’t accidentally suffer from unexpected issues found on CI exceeds the time required to actually review the PR.

10 Likes

Thanks @Keno, this really makes sense!

Just one addition to think about: If the core devs are very busy, isn’t it one more argument to my idea that it is the right time to begin decentralizing the development of the peripheral parts?

I contributed to Gentoo and openSUSE, besides a lot more much smaller projects, and they seem to follow this kind of development approach. People are selected to work in small, peripheral areas, so that they can understand how everything works and starts to make more substancial contributions in the future. This is a good way to attract more people to the project!

4 Likes