Today I’ve seen a couple of posts from newcomers that read very much as if they were generated by an LLM (overconfident, generic, with the occasional weird mistake).
Do we have any policy for how to respond to such posts? We certainly don’t want to be drowned in LLM-generated content, but avoiding false positives isn’t entirely easy, and it would be quite rude to accuse a genuine newcomer of not writing their posts themselves.
If they’re spammy, delete and block. If they might be genuine but give off a strong LLM vibe, maybe send a friendly personal message. It could be a challenge to find the right tone for engaging with the author about the kind of contributions we’d like to see, i.e., posts with a more personal character. As you say, accusing someone of posting LLM slop when they didn’t use an LLM could come off as quite rude.
On the other hand, I’m not sure how strict anti-LLM policies need to be. I feel that the author of a post is responsible for the content they post, no matter what; I don’t particularly care what tools they used to write it. If they got help from an LLM, they should at least review the output, and probably edit it, so that it’s something they can personally stand behind. If they post high-volume, low-quality garbage, they should be blocked, whether or not an LLM was used in writing the post.
Basically, instead of trying to guess whether something is LLM-generated, just judge every post on its own merits. But maybe I’m underestimating the magnitude of the problem: LLMs could certainly drown a forum in garbage. If that becomes a serious issue, Discourse will probably have to tweak their “trust level” system a bit to fight it.
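For what it’s worth, Discourse already exposes knobs along these lines as admin site settings, which can also be flipped over its REST API. Here’s a rough sketch of tightening what trust-level-0 users can do — the forum URL and API key are placeholders, and the setting names are from memory and may differ between Discourse versions, so verify against your instance’s /admin/site_settings listing first:

```python
import requests

# Placeholders: substitute your forum's URL and an admin API key.
BASE = "https://forum.example.org"
HEADERS = {
    "Api-Key": "<admin-api-key>",
    "Api-Username": "system",
}

# Settings that gate what brand-new (trust level 0) users can do.
# Names are from memory of Discourse's admin UI and may vary by
# version -- treat them as assumptions and verify before use.
settings = {
    "approve_unless_trust_level": 1,  # hold TL0 posts for review
    "newuser_max_links": 0,           # no links from new users
    "min_trust_to_post_links": 1,
}

for name, value in settings.items():
    resp = requests.put(
        f"{BASE}/admin/site_settings/{name}",
        headers=HEADERS,
        data={name: value},
    )
    resp.raise_for_status()
    print(f"set {name} = {value}")
```

That wouldn’t catch LLM text as such, but it raises the cost of the high-volume throwaway-account pattern without any guessing about authorship.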
It’s odd: it’s the same firstname###lastname spam we had a few weeks ago, except this time they’re not hiding a link to some website inside a punctuation mark or something (as far as I can see).
Some folks here use language models for natural language translation, which can make this a little trickier, but yeah. As mods, we can also see several other signals of inauthentic new users… so please flag away!
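If anyone wanted to automate a first pass over signals like these, it could look something like the sketch below. To be clear, the regex and thresholds are illustrative assumptions based on this thread (the firstname###lastname usernames, brand-new accounts posting a lot), not the actual checks we run — and since any single signal is easy to trip by accident, this should only ever queue posts for human review, never auto-block:

```python
import re

# Crude shape check for "firstname###lastname" usernames, e.g. "john123smith".
# Pattern and thresholds below are illustrative assumptions only.
USERNAME_PATTERN = re.compile(r"^[a-z]+\d{2,}[a-z]+$")

def suspicion_score(username: str, account_age_days: int,
                    posts_on_first_day: int) -> int:
    """Count how many weak signals of an inauthentic account fire."""
    score = 0
    if USERNAME_PATTERN.match(username.lower()):
        score += 1  # username matches the spam-wave shape
    if account_age_days == 0:
        score += 1  # posting on the day of signup
    if posts_on_first_day >= 3:
        score += 1  # unusually prolific newcomer
    return score

# Example: three weak signals fire, so flag for human review.
print(suspicion_score("john123smith", 0, 4))  # -> 3
```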
Why is it even possible for a bot to post? Is there no straightforward way to prevent this? Have all these captchas over the years been for nothing but AI training? Or is the question about what to do when people copy text from an LLM response into their posts?
I don’t think they’re fully autonomous bots; more likely they’re real people driving semi-supervised scripts and automations that let them rapidly create throwaway IPs, email addresses, and accounts.