Should General have a guideline or rule preventing registration of vibe-coded packages?

Wouldn’t it be of help to have a list of domain experts that could be asked to assess the quality of a package in the registration process? So that it is not only up to a team of one or two people…

Sure, but the easiest way to do that is just for more people to keep an eye on the Slack feed (or the equivalent Zulip feed) and have a look at packages within their domain, or any package in general.

As a “triage member”, I have slightly elevated permissions to, e.g., apply “name similarity overrides” and close PRs in favor of superseding PRs. But that’s pretty minimal. I don’t have any authority to “police” the General registry or any expectation that my opinions hold special weight. Anybody is more than welcome to keep an eye on new package registrations and to “triage” them: see if the README makes sense, etc. But of course, it’s also perfectly fine to take a deeper look (a proper code review) and raise any concerns. Just keep it welcoming and constructive.

It would be good if there were more people doing that sort of thing.

Well, what I had in mind was more along these lines: treat it as peer review. Invite the domain experts to make a quick assessment of quality. The “triage member” then acts more like an editor. Not everyone is glued to the Slack thread…

I think “peer review” is much too high of a bar. I’m also not really willing to take on that kind of commitment of acting as an “editor”. When there is something problematic, it will probably get posted in the #pkg-registration channel on Slack, or sometimes in more specialized channels (“Can the Statistics-community take a look at this submission”). That’s basically the “call to reviewers” you might be talking about. I don’t think we need any fundamental changes to that system at this point.

Hm, from what I’ve heard so far, we do.

I don’t think the basic General registry should be the place for thorough reviews. It’s the de facto way to hook a package into all the great package manager functionality and version control. Restricting access to it seems odd, besides prohibiting name-squatting and malicious packages.

Users should really be in charge of deciding the bar for including a package in their project. We’ve encountered this several times, and what people consider “low quality” varies dramatically across domains and individuals. Arbitrarily making a “low quality” package harder to install would be a strange step.

I think there’s a place for a curated registry of high-quality packages. Going with the journal analogy: we should have a free-for-all arXiv that offers basic infrastructure for the package manager to do its thing (which I think the current setup mostly is), and we could have high-quality registries or curated lists that essentially give packages a badge of honor/quality, only for discovery and for getting newcomers started.

I recently registered PixelMatch.jl, which is a Claude translation of a JavaScript library. This was an interesting case for me because this is probably the lowest-quality code I have registered. For example, the bot did some useless, weird stuff with computing hashes before doing comparisons (PixelMatch.jl/src/PixelMatch.jl at 522e8472871712cd83cae5de41a4d7e79ff4351e · jkrumbiegel/PixelMatch.jl · GitHub), and it does a mostly useless conversion into float RGBAs (PixelMatch.jl/src/PixelMatch.jl at 522e8472871712cd83cae5de41a4d7e79ff4351e · jkrumbiegel/PixelMatch.jl · GitHub), which costs performance.

I noticed those problems but didn’t really have time or motivation to go and fix them once the tests of the original package were passing. The package was at that point solving a real need of mine and I thought even with a bit of shoddy coding and inefficiency, it could still be useful to others. So I went ahead and registered it. I think the usefulness is the important part here, that’s the true measure of any good package.

So I support the idea that was presented further above to have a second registry where anything goes, but the General is treated as more “serious”, i.e. packages have to demonstrate usefulness before getting there. And that would mean being able to demonstrate that others have used it already for meaningful things even though it was in the second registry. I acknowledge though that this would make some things much harder or more burdensome on the maintainers.

On the two-registry idea, I think it would probably make more sense for General to be the uncurated one, and to add a second curated subset, rather than General being the curated one. General is not really curated right now, so if we start imposing a high standard and direct packages to an uncurated registry, General would become a mix of old uncurated packages and new curated packages, which doesn’t seem that useful. We also can’t remove packages from General.

Packages can be in two registries without an issue though. So a curated registry that’s a subset of General could be possible, where some packages get registered into both if they meet a bar, and only General if not. The tooling would need some work, but I could see a workflow where all registrations go through General, but some automation automatically mirrors them into a curated registry when some label is applied or they are new versions of a package already in the curated registry. This would allow you to remove General from your depot if you wanted to only depend on curated packages.
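
To make the mirroring rule concrete, here is a minimal sketch in Python (used purely for illustration; real tooling would presumably live in the registry's CI). The `Registration` type, the `"curated"` label name, and the `process` function are all hypothetical and do not correspond to any existing tool. It only shows the decision described above: everything lands in General, and a registration is also mirrored when it carries the label or belongs to a package that is already in the curated registry.

```python
from dataclasses import dataclass, field

CURATED_LABEL = "curated"  # hypothetical label name, not an existing convention


@dataclass
class Registration:
    """One registration request (hypothetical shape)."""
    name: str
    version: str
    labels: set = field(default_factory=set)


def should_mirror(reg, curated_names):
    """Mirror if the label was applied or the package is already curated."""
    return CURATED_LABEL in reg.labels or reg.name in curated_names


def process(registrations, curated_names):
    """Return (general, curated) lists of (name, version) pairs."""
    names = set(curated_names)
    general, curated = [], []
    for reg in registrations:
        # Every registration goes through General.
        general.append((reg.name, reg.version))
        if should_mirror(reg, names):
            curated.append((reg.name, reg.version))
            # Later versions of this package are mirrored automatically.
            names.add(reg.name)
    return general, curated
```

In this sketch the curated registry is always a subset of General by construction, which is the property that would let a user drop General and depend on curated packages only.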

I agree, and I would also note that arXiv isn’t completely “free-for-all” (like PyPI/npm are): you need an endorsement to publish on arXiv, and they do some filtering (“crank-checking”, plagiarism scans, and minimum length). I think it’s a good analogy. I also think that the current barriers to entry to the General registry are roughly where they should be, and that we don’t need any dramatic changes there. That’s not to say that we couldn’t increase QA requirements a little bit. I’d definitely be very open to making the existence of tests with some minimum coverage percentage a requirement for registration. That should actually be relatively easy to set up for the bot (“must pass Pkg.test”), and it’s also something that would go a long way in dealing with vibe-coded packages (which usually don’t have comprehensive tests).
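
As a rough illustration of what such a gate could compute, here is a small Python sketch. Everything in it is an assumption: the per-line hit-count input format, the function names, and the 50% default threshold are hypothetical and are not what the registration bot does today.

```python
def coverage_percent(line_hits):
    """Percentage of executable lines that were hit at least once.

    line_hits holds one entry per source line: a hit count for
    executable lines, or None for lines that contain no code.
    """
    executable = [h for h in line_hits if h is not None]
    if not executable:
        return 0.0  # no executable lines counts as no coverage
    covered = sum(1 for h in executable if h > 0)
    return 100.0 * covered / len(executable)


def passes_gate(tests_passed, line_hits, min_coverage=50.0):
    """Hypothetical gate: tests must pass and coverage must meet the bar."""
    return tests_passed and coverage_percent(line_hits) >= min_coverage
```

A package with no tests fails this gate in either of two ways: the test run is reported as failed or missing, or no line is ever hit. The difficult part in practice is choosing a threshold that is meaningful across very different kinds of packages.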

I think we could also culturally move towards something where it becomes normal to register packages only once they have gotten a little more mature. The improved tooling in Julia 1.11 and 1.12 to work with unregistered packages (and of course the always-great LocalRegistry package) makes that much easier. Some people have mused about whether the registry should require (or at least encourage) a 1.0 version number. I still think that goes too far as an explicit requirement, but the idea that packages in the General registry should be “kinda stable”, and that there is a lot of room for the initial development of a package outside of the General registry, is far from crazy.

I think asking for packages to be “kinda stable” before registration would be rather problematic, because at this point they will have an established name, and asking to change it to fit General’s guidelines would cause drama and disruption.

Yes, that’s a very big sticking point, and I’ve been wondering what to do about that. It might be nice if the tools people used to generate packages would already include the same checks that the General registry applies and show a big warning. Or maybe just a web-form “check your name here”, with some very strong and public encouragements to use that website.
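
A “check your name here” tool could be quite small. The following Python sketch is only an illustration: the specific rules (minimum length, leading capital, ASCII alphanumerics) and the distance threshold are assumptions chosen for the example, not a copy of the checks General actually runs, and `check_name` is a hypothetical function.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def check_name(name, existing, min_length=5, min_distance=3):
    """Return a list of warnings; an empty list means no objections."""
    problems = []
    if len(name) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not name[:1].isupper():
        problems.append("does not start with a capital letter")
    if not (name.isascii() and name.isalnum()):
        problems.append("contains non-ASCII or non-alphanumeric characters")
    for other in existing:
        if other == name:
            problems.append(f"already registered: {other}")
        elif edit_distance(name.lower(), other.lower()) < min_distance:
            problems.append(f"too similar to existing package {other}")
    return problems
```

For example, `check_name("Plots2", ["Plots", "Makie"])` reports a similarity warning against `Plots`. A real implementation would need the current list of registered names and whatever rules the registry enforces at that time.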

Maybe it could be added as a check in BestieTemplate.jl during package creation?

Yes, that and any similar package, and ideally even Pkg.generate. Probably best encapsulated in Pkg or another dedicated package that all the other “package template packages” could tie into. Obviously, there’ll be concerns about keeping Pkg.generate small and self-contained. It’s a can of worms, and not the first time the idea has come up. We probably shouldn’t get too far into that here :wink:

Is this really true though, now? With the latest Pkg version, you can have all the good stuff and versioning you want by essentially replacing “version” with “git commit” (which, as far as I understand, were always one and the same modulo some relabelling). I don’t want to take this off topic, but I think “accessing Pkg infrastructure” is no longer a strong enough argument to allow registering anything. We can now do it without registering.

In this discussion, the comment that resonated with me the most was “we need a split in the registry to a General all things go in no questions asked and a more curated one”, which has been repeated by several members. But this statement isn’t even directly related to the current topic if you think about it; it is something that has been in discussion for years and that I’ve always been a huge fan of, along with many others.

On the topic at hand however, I should also point out my confusion regarding what we are even debating about, with respect to the opening thread. E.g., the comment from Keno:

To me this sounds like you engaged with the code, tested it in a “real world case”, and potentially will improve it (or already have) over time. This doesn’t sound to me like the definition of “vibe coding” given in the very first post here, that is, “building software with an LLM without reviewing the code it writes”. It sounds like you have reviewed it, if not extensively, then at least by using it. When I started reading this thread, the opening definition sounded to me more like “AI slop”, while Keno’s comment sounds more like “AI-assisted package development”, and these are different things! So I am not sure if we are all on the same page. I’d definitely agree on limiting “registration of AI slop”. As for the case that Keno or Chris talk about, I would agree with them that I see no real reason to limit it, provided the usage of AI is sufficiently disclosed in the package docs/README?

Perhaps GitHub will soon formalize an “AI statement” for repos, which is prominently highlighted just like the license is, so it is easier to see at a glance what sort of AI involvement has taken place. This could help a lot with some automations for registering in General as well, e.g., we could say that if there is some particularly high level of AI usage, it is flagged in the same way as a problematic name. That is, it doesn’t condemn the package or anything, but it asks the author for some sort of justification or comment.

I agree with that, but it has to be a brand new registry. Otherwise you’d have to go and retroactively apply the same filtering to all packages currently registered, as at the moment I don’t think all packages fulfill such a property. (If we don’t do such retrofiltering, it would be unfair to gatekeep new packages without any effort to apply the same principles to existing ones.)

I would propose an alternative: instead of gatekeeping what gets into General, we should have a policy for removing packages that do not belong there.

The discussion above implicitly assumes that once packages are in General, they stay there for good. This is problematic for packages that are substandard (for whatever reason), were once useful but are abandoned, superseded, etc. So focusing on “what gets into General” is understandable, because the cost of mistakes is high: good names are used up for good.

We should complement the entry requirements with a simple process for removing packages from General, so that we can fix mistakes. It should allow the original maintainers to respond and get their act together, with a generous time window if they demonstrate that they are available. The process should be flexible and involve judgement, but generally packages that have no activity, major unfixed issues, started out as an experiment but got abandoned, and have no dependents would be good candidates.

The packages should automatically be moved to another registry so that they remain installable for those who need them. Potentially they would need to be renamed if there are name clashes in that registry (that will happen in the long run).

This breaks reproducibility guarantees: we have no way to make old Manifests on old Julia versions continue to work. So it seems like a non-starter.

Are these reproducibility guarantees everlasting? It seems unlikely that this is even possible in the long run.

For example, if we ever reach Julia 2.0, how important would it be to continue to guarantee reproducibility for code written under Julia 0.x, or even for that code to run at all?

There’s a decent argument for reproducibility here. Even if we reach Julia v5, there’s no inherent reason that a registry of v1 packages shouldn’t keep working for v1 code. That registry just won’t be the default one by v5 anymore, even if something like Pkg still exists by then. I just don’t think it’s reasonable for someone to activate a properly designed environment from 5 years ago just to find that some obsolete dependency was removed, or worse, silently replaced by a totally different package’s overlapping version numbers. So maybe repurposed package names start at a higher major version?

Note that packages are already deregistered for higher Julia versions without compromising reproducibility. That’s necessary for obsolete packages that specified unreasonable upper bounds of dependencies yet used unstable internal names that eventually break. I’d think that stricter upper bounds in the project are only proper for packages that use internals, but that doesn’t always happen and action needs to be taken on the registry side. I’m not even sure if it’s possible to fix a mistaken upper bound on the repo side; wouldn’t a patch just be ignored by Pkg for the unpatched upper bounds?

On the other hand, there may be reasons for complete removal at the cost of reproducibility. I don’t really know the details of copyright, but keeping a package that violates licenses in a registry would poison projects of further users and might not be legal. Maybe LLM-assisted coding doesn’t need to be a direct factor in judgment, but it’s worth recognizing the heightened risks.

Actually, Pkg and registries can already handle name clashes, even if the user experience won’t be that great. Nevertheless, I can’t really see moving packages around between registries being a viable solution.

There have been removals from General, e.g. a recent package that was never installable for anyone. But unless there is a major shift in the policies for General, I don’t see any package removals happening for non-technical reasons for anything less than the package being malicious or having legal issues, and even then only on a case-by-case basis.
