this post was submitted on 23 Mar 2025
303 points (98.4% liked)
Technology
This is AI poisoning. Blocking a crawler just stops it from learning; feeding it bullshit poisons its knowledge and makes it hallucinate.
I also wonder how AI crawlers know what wasn't already generated by AI, potentially "inbreeding" knowledge (as I call it) with AI hallucinations of the past.
When the whole AI craze began, basically everything online was human-made. Not anymore. It'll only get worse if you ask me.
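As a toy illustration of that "inbreeding" (researchers call it model collapse): if each generation of a model trains only on the previous generation's output, and the generator even slightly favors typical outputs over rare ones, diversity drains away. This sketch uses a Gaussian as a stand-in "model"; the temperature knob is purely illustrative, not any real system's setting:

```python
import random
import statistics

random.seed(0)

def next_generation(data, temperature=0.95, n=500):
    # Fit a toy "model" (just a Gaussian) to the training data, then
    # sample the next generation's training set from that model.
    # temperature < 1 mimics a generator that favors typical outputs
    # over rare ones, as most sampling setups do.
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    return [random.gauss(mu, sigma * temperature) for _ in range(n)]

# Generation 0: diverse "human-made" data
data = [random.gauss(0.0, 1.0) for _ in range(500)]
start = statistics.stdev(data)

# Every later generation trains only on the previous generation's output
for _ in range(20):
    data = next_generation(data)
end = statistics.stdev(data)

print(f"diversity (std) at gen 0: {start:.2f}, at gen 20: {end:.2f}")
```

After twenty generations the spread of the data has collapsed to a fraction of the original, even though no single generation looked dramatically worse than the last.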
The scary part is that even humans don't really have a proper escape mechanism for this kind of misinformation. Sure, we can spot AI a lot of the time, but there are also situations where we can't. That kind of leaves us only trusting people we already knew before AI, and growing more and more distrustful of information in general.
Holy shit, this.
I’m constantly worried that what I’m seeing or hearing is fake. It’s going to get harder and harder to find older information on the internet, too.
Shit, it’s crept outside of the internet, actually. Family buys my kids books for Christmas and birthdays, and I’m checking to make sure they aren’t AI garbage before I ever let them look at them, because someone already bought them an AI book without realizing it.
I don’t really understand what we hope to get from all of this. I mean, not really. Maybe it gets to a point where it can truly be trusted, but I just don’t see how.
Well, even among the most moral devs, the garbage output wasn't intended, and no one could have predicted the pace at which it's been developing. So all this is driving a real need for in-person communities and regular contact—which is at least one great result, I think.
Kind of. They're actually trying to avoid this according to the article:
"The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven)."
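A minimal sketch of what that could look like server-side (purely illustrative: this is not Cloudflare's implementation, and the user-agent substrings below are just examples of well-known crawler names):

```python
# Toy sketch of the idea in the quote: suspected AI crawlers get served
# factually neutral decoy pages unrelated to the real site. The bot check
# here is a naive user-agent match; real systems use much richer signals.

DECOY_PAGES = [
    "Water boils at 100 degrees Celsius at standard sea-level pressure.",
    "A prime number has exactly two positive divisors.",
    "Photosynthesis converts light energy into chemical energy.",
]

# Illustrative substrings of well-known crawler user agents
KNOWN_CRAWLERS = ("GPTBot", "CCBot", "Bytespider")

def respond(user_agent: str, real_page: str, request_count: int = 0) -> str:
    if any(name in user_agent for name in KNOWN_CRAWLERS):
        # Rotate decoys so the crawler keeps finding "new" pages to follow
        return DECOY_PAGES[request_count % len(DECOY_PAGES)]
    return real_page
```

The point of keeping the decoys factually true, per the article, is that the crawler wastes resources without the site operator actively spreading misinformation.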
That sucks! What's the point of putting an AI in a maze if you're not going to poison it?
Whoa I never considered AI inbreeding as a death for AI 🤔
Some of these LLMs introduce very subtle statistical patterns into their output so it can be recognized as such. So it is possible in principle (though I'm not sure how computationally feasible it is when crawling) to avoid ingesting whatever carries these patterns. But there will also be plenty of AI content that is not deliberately marked this way, which would be harder to filter out.
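For the curious, "green-list" watermarking schemes from the research literature work roughly like this toy sketch. The hash rule and threshold here are my own illustrative choices, not any real model's scheme:

```python
import hashlib
import math

def is_green(prev_token: str, token: str) -> bool:
    # Toy watermark rule: hash the (previous token, current token) pair
    # and call the token "green" if the first hash byte is even. A
    # watermarking generator biases its sampling toward green tokens;
    # ordinary text is green only ~50% of the time.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction_z(tokens):
    # z-score of the observed green fraction against the 0.5 expected
    # for unwatermarked text; a large z suggests watermarked output.
    pairs = list(zip(tokens, tokens[1:]))
    greens = sum(is_green(a, b) for a, b in pairs)
    n = len(pairs)
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

def looks_watermarked(tokens, threshold=4.0):
    # A crawler-side filter might skip any text scoring above the threshold
    return green_fraction_z(tokens) > threshold
```

Detection only needs the hash rule, not the model itself, which is why it could run at crawl time; the catch, as noted above, is that it only catches output that was watermarked in the first place.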
@RejZoR @floofloof yeah, AI will get worse and worse the more it trains on its own output. I can only see "walled-garden" AIs, trained on specific datasets for specific industries, being useful in the future. These enormous "we can do everything (we can't do anything)" LLMs will die a death.