Fall in GitHub code frequency


I work extensively on deep learning projects and have known Julia for quite a while. Today, I was skimming through the code frequency graph, and the fall really concerned me. Is this something I should worry about when judging whether Julia is going in the right direction to achieve what it was really meant to achieve?


Why does this matter? This plot doesn’t include the package ecosystem, which is where the vast majority of the work is.


Can you really measure commit frequency by eye in that graph, or did you get the data and analyse it? What I see by eye is that large code refactorings (high spikes in both additions and deletions) became less common after mid-2018. That simply reflects the fact that version 1.0 of Julia was released in August 2018; after that, development of Julia itself became less hectic and large refactorings became much less likely.


Heh, a lot of the big spikes are actually people accidentally committing something and then having that immediately deleted (plus a few actually big changes - e.g. the manual format conversion). A lot of that doesn’t happen anymore.


Ah, good that I asked! I am quite new to open source and I misinterpreted the graph. I thought there might be at least some correlation between the spikes and the progress made in Julia, and I am happy that there isn’t any. Thank you. But I cannot deny that for someone (such as me) who is relatively new to open source, the graph can be misleading.

The other thing to note is that lines of code correlate only loosely with development effort. If I do a refactor that moves 1000 LOC from one file to another, that is way easier and often less impactful than writing 10 lines of new code.
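
To make that concrete, here is a rough sketch (not how GitHub actually computes the graph, as far as I know) that sums added and deleted lines per commit from `git log --numstat` text. The function name `churn_per_commit`, the `@` marker in the format string, and the sample data are all made up for illustration. The sample imitates what you would get with rename detection turned off (e.g. `--no-renames`), so a file move shows up as a full delete plus a full add.

```python
from typing import Dict, Tuple


def churn_per_commit(numstat_log: str) -> Dict[str, Tuple[int, int]]:
    """Sum (added, deleted) lines per commit.

    Expects text shaped like the output of
    `git log --numstat --format=@%h`; the leading "@" is just a
    marker chosen here so commit lines are easy to tell apart.
    """
    totals: Dict[str, Tuple[int, int]] = {}
    current = None
    for line in numstat_log.splitlines():
        if line.startswith("@"):
            # Start of a new commit.
            current = line[1:].strip()
            totals[current] = (0, 0)
        elif line.strip() and current is not None:
            # numstat lines are "added<TAB>deleted<TAB>path".
            added, deleted, _path = line.split("\t", 2)
            if added == "-" or deleted == "-":
                continue  # binary files report "-" instead of counts
            a, d = totals[current]
            totals[current] = (a + int(added), d + int(deleted))
    return totals


# Hypothetical sample: one commit moves 1000 lines between files,
# another adds a 10-line feature.
SAMPLE = (
    "@aaa1111\n"
    "\n"
    "0\t1000\tsrc/old_module.jl\n"
    "1000\t0\tsrc/new_module.jl\n"
    "@bbb2222\n"
    "\n"
    "10\t0\tsrc/feature.jl\n"
)

if __name__ == "__main__":
    for commit, (added, deleted) in churn_per_commit(SAMPLE).items():
        print(commit, f"+{added}", f"-{deleted}")
```

In this made-up sample the move registers as 1000 additions and 1000 deletions, while the new feature registers as only 10 additions, even though the second commit is likely the one that took real thought.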


This is something I’ve thought about to some extent…

I think at some point, when a programming language/project or an ensemble of those begins to reach some level of maturity, the size of the changes and even the frequency of those changes should decrease over time, barring complete rewrites (eek). That holds even with new package creation. I haven’t sat down and worked out any models for it, but maybe it would be something fun to do.

As a surrogate, we do see some decay in the number of OSS contributions over the past few years. That is what should be happening! More complex behaviours can be added with 2-3 lines of code (import/dependency/compat/etc). So seeing smaller contributions is not a bad thing; it can actually be a sign that things are working as expected.

Note that the spikes are huge anomalies, like the major external libraries we used to check into the repo in the very early days, which were then obviously refactored out into separate packages. They are large enough C codebases that they make everything else look small in comparison.