I don’t understand why the comments in the PR @Mikhail_Kagalenko posted need to be so confrontational; I really see no reason for it. The PR seems to have been made mainly to prove a point, and it does that well in my opinion. At the same time, I see some (weaker) evidence for the position the author of the package is defending. Obviously this is not the best possible anecdotal test, since the package is very simple in its interface and other aspects.
It is clear that opinions differ sharply here, and in my view both sides are still somewhat wrong in some respects, but not because models are technically unable to contribute to software, which still seems to be the main point of the conversation. We have plenty of evidence that this is untrue. Still, I understand why some people are reluctant to use these tools for non-technical reasons. That is not to say there are no technical failures, quite the opposite, but I still think they are a net positive in many cases when used wisely. There are also many philosophical, societal, and organizational considerations that can genuinely be part of the conversation. Some of them are a bit out of scope here, since the discussion is fairly narrow, but others, such as the responsibility the author takes for the code, certainly are.
That was definitely not my point. My point was that none of these topics (curation, LLM tool use, even GMO use) are binary and clear-cut… and that would certainly be true of any labeling effort, too.
This is just not true. We have higher standards than that for registered packages. To the extent that checks can’t be automated, there are no guarantees that we can always identify issues at the time of registration. That doesn’t mean that we don’t strive to have a cohesive ecosystem of high-quality packages in the General registry.
These false dichotomies between two extremes, perfect curation or no review at all, simply miss the mark.
I think your assessment of how much is imposed would change if you spent time watching the issues in JuliaRegistries/General on GitHub or browsing the summaries in #pkg-registration on Slack. While the requirements are indeed mostly undemanding and the registry only lightly curated, as has been mentioned several times it is NOT a complete free-for-all, and package registrations regularly get rejected for all manner of reasons.
It seems the registry may have to evolve its policies with the times. And it seems to me they must be pragmatic and reserve the right to make unilateral changes to those policies. After all, it’s possible to install packages from outside the registry.
I think AI is making it much easier for bad actors to obfuscate their ill-intent, and irresponsible ones to produce unusable slop that only shits the bed and takes up space and resources.
I think this is particularly bad for Julia for two reasons:
It’s relatively new, so models aren’t trained on as much good-quality code as for other languages. The more AI-generated code there is, the more AI models will eat where they shit and poison themselves into model collapse faster than in other languages.
The lack of static analysis, plus the inherent composability and extensibility of multiple dispatch, means lots of bugs only get caught at runtime. And many of these can slip past bad tests that nonetheless have good coverage.
I think there are some concrete things the registry has to seriously consider, which to be fair are things that naturally arise as communities grow, but the need has accelerated due to the rise of AI.
Institute a process by which a package can be removed from the registry.
This includes packages that have been abandoned and remain broken with no willingness to be transferred to new maintainers.
Institute a process by which repeat offenders (ip, username, whatever is less privacy breaking) can be outright banned from registering packages.
Institute the ability to quarantine packages suspected of malicious activity (back-doors, shell-popping, remote code execution).
Perhaps with a quarantine hierarchy, for example providing users with a warning and asking the user to confirm their choice to install a package under investigation (defaulting to no), with a link to an issue raised in the registry where information regarding the offenses can be accessed.
Outright preventing people from installing the package until the issues have been investigated/resolved.
Eventually removing the package from the registry entirely.
Banning the offender if they keep infringing the rules.
A way for names to be reclaimed.
A way for people to report or flag suspicious packages, even after registration. If enough reports pile in, the process can be done automatically. They could even fight fire with fire and use AI to do an initial cursory check. After which, the quarantine/removal/banning procedure(s) can kick in.
I think this is reasonable and flexible, but also guards against malicious or negligent actors.
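The quarantine hierarchy described above could be modeled as escalating levels. Here is a purely illustrative sketch; none of these names or mechanisms exist in Pkg or the registry tooling today:

```julia
# Purely illustrative model of the escalating quarantine levels described
# above; nothing like this currently exists in Pkg or the registry tooling.
@enum QuarantineLevel begin
    Clear          # no known issues
    WarnOnInstall  # warn and ask the user to confirm (defaulting to no)
    Blocked        # installation disabled pending investigation
    Removed        # removed from the registry entirely
end

# Hypothetical status record, pointing at the registry issue with details
# about the suspected offense.
struct PackageStatus
    level::QuarantineLevel
    issue_url::Union{Nothing,String}
end

# Decide whether `Pkg.add` should proceed for a package with this status.
function install_allowed(status::PackageStatus; user_confirmed::Bool = false)
    status.level == Clear         && return true
    status.level == WarnOnInstall && return user_confirmed
    return false  # Blocked or Removed
end
```

The point of the levels is that a warning with an explicit opt-in is much less disruptive than outright removal, so the response can be proportionate to the evidence.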
In my mind, AI is nothing but a tool. If that tool is used to hammer in a screw, then that’s an inappropriate use of said tool. The problem is those tools might be used to hammer your custom, precision-machined, non-conductive, anti-static circuit board screws into a reinforced concrete bunker wall with a pneumatic hammer. They can do this at speed, and with little regard for nuance, or respect for people’s time and effort.
As for the use of AI, I don’t think the genie can go back in the bottle. But we can have processes that stop it from trampling over our carefully curated gardens. (Please, someone, pick up on my reference to Persian poetry lol.)
It does, and there’s a working group of registry maintainers formulating and reviewing the policies on an ongoing basis (as, for example, in the case of adopting an LLM policy, not too long ago).
No, this is against the stated goal of General to guarantee reproducibility (with the one possible exception of malicious code, see below). This is a matter of policy; registries other than General can handle this with more flexibility.
This has come up, and we’ve started work on a system like that. It hasn’t been deployed, partly because there have only been a few instances over the years where this has felt necessary.
That would probably be the one exception to guaranteeing reproducibility (not removing packages). This hasn’t happened yet, though. If it were to happen, there would be an appropriate response from the registry maintainers.
We have a different way of handling this: There are processes in place for handing existing (abandoned) packages to new maintainers. There should be some continuity in packages and their associated communities.
There are various channels for raising such flags… probably mostly on Slack. Unless the issue is malicious code or abuse, the first line of defense is probably to just raise the issue with the repository in question, though.
Please, let’s be honest. I can find dozens of packages right now that no longer work (pre-compilation issues, breaking changes from dependencies, etc.). Actually, one of my students’ complaints about Julia is that many packages just do not work, so you need to actively figure out which ones are the “good” ones. You get no warning when installing unmaintained packages, for example.
After a package is registered, there are zero checks that semantic versioning is being followed beyond the version increment itself. I could rename the function pretty_table to prettytable right now and the new version would still get registered.
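To illustrate the kind of check that is currently missing, here is a toy sketch (the function and the export lists are hypothetical, not an existing registry tool) that flags a release as breaking when a previously exported name disappears:

```julia
# Toy sketch of a missing registry check: flag a release as breaking when
# a previously exported name disappears. The export lists are hypothetical;
# a real check would extract them from the two registered versions.
old_exports = Set([:pretty_table, :pretty_table!])
new_exports = Set([:prettytable, :pretty_table!])

"""
    requires_major_bump(old, new) -> Bool

Return `true` if any name exported by the old release is gone in the new
one, i.e. the change is breaking under semantic versioning.
"""
requires_major_bump(old, new) = !isempty(setdiff(old, new))

removed = setdiff(old_exports, new_exports)
if requires_major_bump(old_exports, new_exports)
    println("Breaking change: removed exports ", collect(removed))
end
```

Such a check would of course miss behavioral changes; it only catches the most blatant case, such as the rename mentioned above.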
That’s okay, it is not a problem. We are devoting only our free time to create this. But we need to make things clear. Maybe “curated” has different interpretations in Portuguese and English, but I think we all know the current state of Registry, and there is no need to continue this discussion.
Did I say complete free-for-all? If you follow the discussion, it is clear that we are trying to impose something like: do not submit 100% AI-vibe-coded packages that you did not verify by hand. Is that a rightful demand when almost nobody checks for algorithmic correctness when something is registered?
I do follow it sometimes. I saw problems with names, documentation, and testing (although mostly whether tests are present or not), but we have thousands of packages and I am pretty sure nobody has the time to verify the algorithmic correctness that, at the end of the day, matters most. So, if this is not something we verify, why should we pick one target and focus on it alone?
I agree with everything on that list. If those actions are implemented to every package, then I think we will have a better ecosystem. In this case, we can start discussing how we can avoid bad packages (not only AI-generated packages) in the registry.
IMHO, unmaintained packages that barely work in the current version of the ecosystem must be removed from the Registry. They appear when someone presses TAB in the REPL. They can do much more harm to the Julia community than trying to spot whether package A was vibe-coded or not.
I understand this. However, if the package is not working and does not support the current version of the ecosystem, is reproducibility that important? I mean, there are packages where the last registered version is from 7 years ago. Can’t we make an effort to clean them up?
If not, I think we will only have a good moment to clean up the registry when Julia v2 is released.
This pretty much addresses all the points I raised. Though what qualifies as an extreme case is vague and open to interpretation.
I still think it would be a good move to allow for package removal. There are quite a few packages that have been abandoned and don’t work, their creators long gone from GitHub. Doesn’t PkgEval check which packages compile on various releases? It might even make sense to automatically remove packages that have failed to compile since the (i-N)-th LTS release, where i is the current LTS and N is some arbitrary number of previous LTS releases.
But it’s a good sign these things are being discussed. Who knows, maybe there comes a point where the registry maintainers come round to the view that scrubbing packages that contribute nothing but bloat is not such a heretical idea.
Indeed, this makes me think it would also be possible to add some sort of automatic check to Aqua.jl or similar packages that tests whether a package depends on a deprecated or broken package.
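As a rough illustration of what such a check could look like (a hypothetical sketch, not an existing Aqua.jl feature; the deny-list and the function name are my inventions, and in practice the list would come from PkgEval results or registry metadata):

```julia
using TOML  # stdlib; parses Project.toml

# Hypothetical deny-list of packages known to be deprecated or broken.
# In practice this would be sourced from PkgEval results or registry metadata.
const BROKEN_OR_DEPRECATED = Set(["SomeDeprecatedPkg", "LongAbandonedPkg"])

"""
    flag_bad_dependencies(project_path) -> Vector{String}

Return the names of direct dependencies listed in `Project.toml` that
appear on the deny-list, so a test suite can fail if any are found.
"""
function flag_bad_dependencies(project_path::AbstractString)
    project = TOML.parsefile(project_path)
    deps = get(project, "deps", Dict{String,Any}())
    sort!(collect(intersect(keys(deps), BROKEN_OR_DEPRECATED)))
end

# In a package's test suite one could then write something like:
# @test isempty(flag_bad_dependencies(joinpath(pkgdir(MyPkg), "Project.toml")))
```

This only inspects direct dependencies; catching transitive ones would require walking the Manifest instead, but the idea is the same.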
Yes, it is. There are folks who still work on Julia v1.0 or v1.3. Just this week I re-ran all the Yggdrasil build scripts (ever, in its history) and reproduced their build metadata going back to Julia v1.3. And (for all but a few months where we didn’t record the exact Julia version/manifest), it always worked! Heck, that work was itself an effort to gain a better understanding of what’s in our ecosystem. You can follow along and help out! You can also explore some of those “old” packages yourself. I just randomly looked at three that haven’t been touched since Julia’s v1.0 release: two are clearly deprecated (with functionality that was subsumed by v1.0), and the third’s tests actually still work!
julia> meta = GeneralMetadata.metadata();
julia> pkgs = sort(collect(meta), by=((k,v),)->maximum(((k2,v2),)->v2["registered"], v));
julia> for (pkg, info) in Iterators.take(pkgs, 3)
           println(pkg, " last updated at ", maximum(((k,v),)->v["registered"], info))
       end
RequiredKeywords last updated at 2018-08-09T10:04:18
UnalignedVectors last updated at 2018-08-09T10:04:18
TropicalSemiring last updated at 2018-08-09T10:04:18
(they were actually last updated before that timestamp, but I don’t care to go back before Julia v1.0, so that’s why they’re all the same)
Let’s assume the case where I write a script that loops forever and tries to register packages with random names to General every minute. Clearly, this would be unacceptable behavior: it makes the registry download bigger, it takes up the shared namespace, it creates noise for maintainers, etc., and such use should be prohibited. Trying to argue things like “the General registry is uncurated, so I should be able to register all these empty packages” is clearly nonsense.
Now, with new technologies, it is effectively possible to do the same as described above, just that the content of the packages is slightly more believable of being useful instead of being empty. The human input per package is roughly on the same level, though.
So, the deciding factor is not really about whether the package source code comes from an LLM or not, it is about the time investment made by the author and how much “skin in the game” the author has. A package registered in the General registry is something an author should be proud of, want to maintain, and to improve upon. It should not be a fire-and-forget to pad a GitHub profile.
In my opinion, putting a limit on the number of packages a user can automatically get registered (say, one per week) is reasonable. For cases where that has to be overridden, it can be done manually.
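A minimal sketch of what such a limit could look like on the registry side (the names and the one-per-week window are my assumptions, not an existing Registrator feature):

```julia
using Dates

# Hypothetical ledger: user identifier => timestamps of their new-package
# registrations. A real implementation would persist this server-side.
const REGISTRATION_LOG = Dict{String,Vector{DateTime}}()

"""
    allow_new_package(user, now; window = Week(1)) -> Bool

Allow at most one automatic new-package registration per `window`;
anything more frequent would need manual approval by a maintainer.
"""
function allow_new_package(user::String, now::DateTime; window = Week(1))
    log = get!(REGISTRATION_LOG, user, DateTime[])
    recent = count(t -> t > now - window, log)
    recent >= 1 && return false   # over the limit: require manual review
    push!(log, now)
    return true
end

t0 = DateTime(2025, 1, 1)
allow_new_package("alice", t0)           # true: first package this week
allow_new_package("alice", t0 + Day(2))  # false: still within the window
allow_new_package("alice", t0 + Day(8))  # true: the window has passed
```

New-version releases of already-registered packages would bypass this entirely, which matches the point below about not limiting patch, minor, and major tags.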
I have never said that! What I said is that, since the General registry is not 100% curated (or any better word to describe the current situation), it does not seem right to ban a package just because it was 100% vibe-coded. There will be vibe-coded packages far better than some registered packages that contain serious bugs. The other way around is also true.
My point, in a very simple sentence, is: it is not fair to ban packages just because the author chose AI as a tool, even if 100% of the code was generated by the LLM.
I fully agree with this since it can block those very extreme cases. But please do not put limits on releases (patch, minor, and major). When I update something in SatelliteToolbox.jl, I usually tag a lot of things, and everything is from my GitHub account.
@Ronis_BR, you keep talking about these absolute positions that simply aren’t the case. Yes, some packages have been prevented from being registered on these grounds. Other packages have been allowed. The distinctions between these two cases might get a little fuzzy when it comes to just the code itself. Where there is a clearer line, however, is in the human interaction that happens in the registration PRs and here on discourse.
Be human and respect the humanity of others (that is, don’t inundate others with drivel you’ve not even cared to look over yourself) and you’re in good shape.