Policies around registering AI-created packages in General?

No. But I have seen enough AI slop and wasted enough time arguing with people who call me stupid because I still use a keyboard to produce code, while their code is some long-winded AI slop.
I do try AI tools every now and then myself, but usually they slow me down by a factor of 2-3 compared to writing the code with my keyboard.
Yes, maybe I am stupid. But I am also tired of being told so.

I do not mind if someone uses AI as a help – to avoid typos, to structure code. I do have a problem with PRs where the tests are sloppy, the documentation is long-winded and hard to read, and everything is low-effort – leaving me, as the one who has to review the code, with more work than the author put in.
(Maybe similar to an AI chatbot hotline – they outsourced support to some low-effort bot, so I spend an hour convincing the AI that it is correct to pass me to a human.)

So for me it is about the code and the effort put into it. As the maintainer of a package, who later has to work with such code and provide support or further development, I want a code base that is manageable and not a large pile of letter-trash.

While in general the A–E scale is certainly not a quality measure, I have not yet seen code in category E that reaches a level I would accept to work with afterwards (which does not mean such code cannot be produced in category A as well, for sure :wink: ).

Stopped harvesting my own beans a while back :wink:

Sorry, that was a bit sarcastic. I have had too many discussions recently telling me that I should do my research, my teaching, my coding, … with AI, and that if I don't I am definitely losing my job next week. I am very sure I won't. Quite the opposite: currently students are very lost, especially in how to acquire knowledge themselves.
But the “AI people” (the ones I had to talk to) do not yet understand this, namely that one should only use tools if one knows what they are for and can reflect on the outcomes and whether they are correct.
(Best example: references and literature work in theses – those you cannot generate with AI; that is always a huge pile of trash.)

I am a guilty party – though not of fire-and-forget AI-assisted coding; I eat my own dogfood first. My packages are rather niche and unlikely to generate more user requests to put right than I can handle. That's one mitigating factor. Another is that while AI may introduce failures, I own them. Finally, I have the choice of writing dead-simple packages within the experience of a 15-month user of Julia, or of writing packages with more functionality with AI. I choose the latter.

I’m reminded of my first experience with high priesthoods, in 1965.

Real programmers don’t need the printer ribbon on the cardpunch. They can read the code from the punchcard.

I never attained that degree of mastery. (In fact, I fled from programming and didn’t come back until the advent of the CLI.)

stdlib and the stalwarts of Julia utility, such as DataFrames, should have the highest eyeball quotient. It's a matter of QC. It really shouldn't matter whether a package passes with or without AI involvement; the point is that it's solid. Maybe packages should come with or without a QC Seal of Approval. Someone – not the developer or anyone in a backscratching relationship with the developer – could grant that, or there could be some standardized testing framework. The problem with the latter is that any rules-based system can be gamed. The best results come when a human with judgment, brains, and maturity makes the final call.

Who that will be, however, is a possible sticking point. People in open-source communities put in long hours of patient, unremunerated toil for the love of the game. But there is a limit, and cleaning up other people's messes is probably a bright line. What might work is a QC fee that goes into a pool, divided up among reviewers periodically according to some rough measure of participation.

Opposing viewpoints ready to engage.

In long experience, I've found that “matters of principle” fail to engage properly with the facts of the case. Few principles can be expansive enough to cover all instances. Facts and circumstances matter.

Sloppiness with the tool is the offense, not the choice of tool.

I do agree with this very much. For other people's packages I do not care how much AI they use; it is fully their responsibility.
I would still prefer to know that level to some extent, because for me a “fully vibe-coded package” is still very much at a level where I would not trust it to work properly.
That does not mean they should not do it. Just that I would like to know.

For my own packages I had the note above, on people using AI when contributing.

“Co-authorship” may have some benefit in market-disciplining agents, but in our context it is doubtful that the effect will be great. As a percentage of AI-involved keystrokes, we are probably the rounding error to the asterisk of the footnote.

Hey, if you can't take it, don't dish it out, I always say :slight_smile: (which is to say, no apologies needed). I have made my living writing code by hand for longer than I might want to admit. I first came to Julia around 2017(-ish?) when I was looking for a language to use for my thesis research. As a result I consider myself a bit of an expert. You know, like the 'don't try this at home, kids!', 'professional driver on a closed course' fine print.

I would like anyone to have whatever opinion they want about the quality of the framework that inspired this discussion. If you like it, great; if not … ok, that's fine.

But umm … it's really good, and I was able to make it because I used these tools. It would never have come into being without them. Not that I think I would have been incapable or anything … I just would never have been able to set the time aside.

If you’re a 10x programmer, these tools help you achieve 100x. You probably don’t want to know how many lines of code there have been in the last few months… :grimacing:

If you think that makes me 1) lazy, 2) foolish, 3) brilliant, or any combination thereof, you might be right at some level.

I really don't know where the line can be drawn. What makes it 'fully'? Whether I push a physical button down, at some ratio to the bytes in the repo? Hrmmm… it just sounds like a distinction on the verge of collapse if the trend continues.

But what really confuses me is why you would trust code written by a human, either :wink: … The only reason I trust any package is because of the robots doing the work. The compiler, assembler, unit and integration tests, etc.

You cannot always evade that, when someone starts a 30-minute ramble after I merely mention that I teach at a university.

I also hate the “10x programmer” term. I teach, I research, and I code a bit. I am probably not that good a programmer – otherwise I might have become one.

That is also a social agreement we have to come to.

As an example: if I give the students an intro-level exercise (maybe second week of the semester-ish), the solution I have is about a concise half-page of math. If I then get a solution that is half-correct, fully AI-generated, and takes 17 pages – and one can see it was a single question to an AI – should that student get any points? I do not think so. Did the student get angry and try to argue? Yes! Should they? I do not think so.

For me personally: If the Readme has the same style as these 17 pages, I do not trust the package.

That I consider borderline offensive. A human with good tooling (which still might include errors) has presumably applied reason and intelligence to write the code, the docs, and the tests. Sure, that includes setting up CI. Those are not robots; that is scripted, automated testing.
But the test design and the thought behind it are human. The reasoning about what to test – e.g. the mathematical correctness of some functions – still requires human thought.
Machines are not intelligent. AI is good at faking it every now and then.
Using AI as a tool along the way, to some extent, is of course fine, if you do not stop thinking yourself. But many do.

I stop here, because I have reached this point too often: it is exactly where I think all the “AI evangelists” try to convince everyone to join the pyramid scheme of AI and to stop thinking for themselves, since AI can supposedly think better anyway. And that is where I disagree most strongly. AI cannot think.
And yes, that is one of the reasons I left a few social platforms that by now are just full of AI slop preaching that one should use AI for everything and the world would be better. Maybe it is then time to leave at least this thread, if not the forum here as well.

That’s the key insight for me. There are many projects I have sufficient mastery to pretty much know how the entire thing should work, but never had the time to put them together. For these types of projects AI is an insane force multiplier. The actual code I produce ends up being higher quality with better tests, documentation, and build infrastructure. On the other hand, for the applied research type tasks I’m working on, it’s no where near as helpful because both what I need to do and how to do it is often not clear without tons of experimentation and digging. Claude can help you get unstuck, find obscure papers and established workhorse methodologies you may have overlooked, problems in your implementation, but it’s not the 10x I feel when I’m working on these types of projects.

R&D: Time, quality, cost. Choose, at most, two.

One has to cut oneself with Occam's razor a few times to see the wisdom of your point. Right off the bat, a 17-page README signals something too complicated for its own good.

What do you think of my READMEs?

The benefit I see is that the force multiplication you’re referencing ideally provides more time to spend on those aspects, which are perhaps more enjoyable as well.

However, I personally believe that those boundaries are being pushed as well. I work in a research setting and while it’s thankfully not the era of research autopilot today, the number of ideas I can form and iterate on is being multiplied. One’s results seem dependent on the quality of the ideas, of the thinker, at that point.

Whether acceleration of this kind devolves into the cliche below or not is really on the shoulders of the individual. It is both freeing and overwhelming when you realize that the only thing holding you back is yourself…

I like this line very much.
Before: implementing this idea is too complicated and tedious. I will not do this.
Now: I need more time and energy so that I can guide the AI to implement all those in my mind.

I dunno, it seems like the complexity or cost of doing something often provides valuable information.

Also, this idea that AI somehow frees you from tedious tasks so you can do “real” thinking is just a rehash of the promise of every previous technological advance, and yet tedium stubbornly remains.

AI reduces the cost of coding labour, tremendously. Many projects that would normally be considered too much effort, suddenly become feasible. Especially in open source, which is mostly a thankless and unpaid effort. Tachikoma.jl wouldn’t have existed without AI.

That's another issue with LLM coding tools. They have been trained without paying any heed to license terms. So you may be publishing under the MIT license something that can reasonably be described as a “derived work” of GPL software, and so be breaking the license.

Exactly! It’s true that we’re an open-source community built by people giving their time to create something amazing. AI is here, just like email was when we used to send letters, or computers when we wrote text on typewriters. The question is: how can we use this to make open-source tools even better?

As we discuss whether an AI-generated package can be submitted to the General registry, it's worth noting that much of the commercial software from large companies that you use today has plenty of vibe-coded features.

That’s debatable! If I learned C++ by diving into GPL tutorials and then used that knowledge to build a commercial product, would I need to share the source code?

I would like to draw your attention to the fact that a growing number of open-source projects ban LLM-generated code altogether. My proposal of disclosure is far less radical than that.