Reliability of AI coding tools

You mean, it’s useless to ban AI users, because you’ll cheat?

(Splitting this into a separate thread, since the commentary is not really specific to Swirl.jl or even specific to Julia. I’ve also set a topic timer since this thread seems impossible to resolve on its own terms. This is a topic that people have strong opinions about, understandably, but please try to tone it down to a friendly discussion.)

I think it’s important to make a distinction between vibe coding (trusting the AI output without really verifying it) and using AI “properly” (treating its outputs as PRs that need thorough review and iteration).

You should absolutely be suspicious of vibe coded code.

But when used properly, in the hands of an experienced developer, there’s really no substantial difference between the final result of code written by an AI versus what you would have written yourself or had written by a dev in your team.

It’s best to think of AI as just a tool. You don’t complain about the hammer (AI) when the shonky table (code) you made is wobbly (insecure).

Completely agree. As long as you read, review and modify the generated code where necessary I think it should be acceptable.

I mean, the use of AI tools has become so widespread that banning them is practically impossible. Even if you could, you could never prove that someone is using those tools anyway. I even have colleagues in their 50s, with 40 years of coding experience, who use these tools on a daily basis. And they use C!

“Inherently unreliable” depends on a lot of factors:

This was discussed extensively recently in Should General have a guideline or rule preventing registration of vibe-coded packages? (see also the actual guidelines that resulted from this).

Claude and similar tools definitely can be used to positive effect, see The use of Claude Code in SciML repos - #8 by ChrisRackauckas

I see this as a non-issue. If a package doesn’t work or violates copyright, don’t use it and warn people against it, especially in any ANNs. If it made it into a registry, get it out. This isn’t any different from how badly written packages are treated in any ecosystem, even back when people were copying StackOverflow answers instead of throwing LLMs at a problem, and now most of them won’t even disclose the use of LLMs like these developers did.

If there’s no criticism other than how the package was made, you can still choose to not use it. It’s just not a very realistic stance now that major tech companies are openly bragging that they’re using LLMs to replace junior developers (or, as they put it, to boost productivity). I don’t like that LLMs in general are being used to take opportunities away from people working to become real thinking experts, but did the Swirl package really take anyone’s job?

Use of LLM in programming has many advantages, but all those advantages coincide with the advantages of theft over honest labour.

As someone who has interacted a bit with the Swirl.jl codebase, I’d like to share some of my personal experience.

First, as with most topics in life: it’s not all black and white. Not all generated code is bad, not all uses of this technique are worthwhile either.

To the case at hand: I think we can agree that the generated codebase is different from one that would be developed without code generation. It is more verbose, has lots of hardcoded repetition, and lacks the abstractions that one typically finds in human-written code. To me as a developer, that makes it unpleasant to work with, up to the point where I think you can’t work with the code efficiently without using some kind of code assistant. Hence I opened more issues than PRs, since I don’t use such assistants at the moment.
However, this simpler code structure might very well make it easier for people who have less experience in coding (like my students) to work with the codebase, and maybe it’s also a structure that LLMs can handle better; I don’t know.

So would I prefer a human generated codebase? Sure.
But it’s also easy to underestimate the amount of work it takes to develop a project like this. And if you don’t do it and I don’t do it, then a machine-generated codebase that does the thing is better than the ideal codebase that nobody writes.
And as with every other open source project the code will improve over time no matter how it started.
In the end, I am not completely sold on the idea that turning a machine-generated codebase into one that is fun to work with takes much less work than writing it from scratch, but I think everyone has their own workflow.

TLDR: I don’t like it, I don’t hate it. IMO this package is one of the better uses of coding agents that I have seen.

P.S.: The thing I dislike most about the current AI hype is that it’s hard to have a genuine conversation, because most people seem to be on either the “humans will be superfluous soon” train or the “kill it with fire” train, and it is hard to talk about actual gains and losses without people shouting at each other.

I think this is great constructive feedback.

A consequence of letting AI generate full packages may be that you get no contributors. It’s a completely valid choice to not spend time on such packages. I would be curious to hear why you decided to contribute anyway, but the topic is about to close.

This is indeed sad. I think outright banning AI for an entire project is wrong. I hope that as AI tools improve, we become more comfortable using them in ways that make sense.

This topic was automatically closed after 18 hours. New replies are no longer allowed.

Before moving on to package-related updates, and since the timing of the parallel thread didn’t allow me to respond earlier, I wanted to address one constructive piece of feedback from @BeastyBlacksmith. Here is my response:

"This is exactly the kind of constructive feedback I was hoping for, so thank you very much for taking the time to write it.

Let me describe my own profile, which I think overlaps quite a bit with that of many people who use code primarily for research or day-to-day tasks. I’ve been coding for many years, always in the context of data analysis and ML model development, all within computational linguistics. My time is very limited and code is a valuable tool rather than an end in itself. I used to work mostly in Python and R (and still do for certain tasks). I like Julia because of the freedom and expressiveness it gives me. I learned it in small pockets of time a few years ago. But the relatively limited ecosystem, compared to other languages, meant I couldn’t afford to invest the extra time to “bridge the gaps” myself by writing the missing tools and packages I needed.

Then coding agents arrived and last year I started developing my first Julia package, TextAssociations.jl, using Claude Sonnet 3.5. For that package, I used very little agent-generated code. People who browse that codebase will see that the traits you mention—hardcoded repetition, lack of abstractions, etc.—are not really there; there is little hardcoded code and quite a few layers of abstraction. This was my first Julia package and I already see it as something I can introduce to my students so we can build a small team who learn the language and start writing Julia scripts for text analysis.

With Swirl.jl, things had to move much faster. My main motivation was for students to be able to use the REPL to learn Julia at their own pace, so that we could save time in class. I had the idea, I knew roughly the structure I wanted for the package, but I had no time to scaffold a new project from scratch. So you can imagine my surprise when, with a few prompts, the system produced something like 70–80% of what I needed. Even then, I had to intervene many times and spend quite a few hours to make it actually work. I deliberately ignored “ugly” code that worked and focused on what didn’t work, or what needed to change to be practical for my use case. After the announcement, I received many encouraging comments that gave me the motivation to invest more time in making the code a bit more functional (thanks, in any case, for the issues you opened).

The rhetoric you describe around the current AI hype was quite new to me in this context. I never meant to “steal” anyone’s code (whose code would I even be stealing, in this case?) or to show disrespect for developers’ work. On the contrary, I was happy to share Swirl.jl with people whose coding creativity I admire and for whom I have a lot of respect.

That’s my story, and I’m fairly confident it overlaps with the stories of many other people. On the positive side, there has only been one genuinely aggressive reaction so far, which is not too bad given the kind of debates that you say currently surround AI and coding tools.

To sum up, I believe developers can only benefit from seeing domain experts from other fields engage more deeply with their tools and ecosystems, rather than feeling intimidated or ‘robbed’ by that participation. Discouraging people from building their own Julia packages is counterproductive; it risks pushing them away from adopting the X language (Julia in this case) altogether, or leading them to use it silently without ever feeling welcome enough to return to the X Discourse community."
