this post was submitted on 23 Mar 2025
303 points (98.4% liked)
Technology
This is AI poisoning. Blocking a crawler just stops it from learning; feeding it bullshit poisons its knowledge and makes it hallucinate.
I also wonder how AI crawlers know what wasn't already generated by AI, potentially "inbreeding" knowledge (as I call it) with AI hallucinations of the past.
When the whole AI craze began, basically everything online was human-made. Not anymore. It'll only get worse if you ask me.
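As a toy illustration of that "inbreeding" (researchers call it model collapse): if each generation of a model trains only on the previous generation's output, and the generator even slightly favors typical outputs over rare ones, diversity drains away. This sketch uses a Gaussian as a stand-in "model"; the temperature knob is purely illustrative, not any real system's setting:

```python
import random
import statistics

random.seed(0)

def next_generation(data, temperature=0.95, n=500):
    # Fit a toy "model" (just a Gaussian) to the training data, then
    # sample the next generation's training set from that model.
    # temperature < 1 mimics a generator that favors typical outputs
    # over rare ones, as most sampling setups do.
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    return [random.gauss(mu, sigma * temperature) for _ in range(n)]

# Generation 0: diverse "human-made" data
data = [random.gauss(0.0, 1.0) for _ in range(500)]
start = statistics.stdev(data)

# Every later generation trains only on the previous generation's output
for _ in range(20):
    data = next_generation(data)
end = statistics.stdev(data)

print(f"diversity (std) at gen 0: {start:.2f}, at gen 20: {end:.2f}")
```

After twenty generations the spread of the data has collapsed to a fraction of the original, even though no single generation looked dramatically worse than the last.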
The scary part is that even humans don't really have a proper escape mechanism for this kind of misinformation. Sure, we can spot AI a lot of the time, but there are also situations where we can't. That kind of leaves us only trusting people we already knew before AI, and growing more and more distrustful of information in general.
Holy shit, this.
I’m constantly worried that what I’m seeing or hearing is fake. It’s going to get harder and harder to find older information on the internet, too.
Shit, it’s crept outside of the internet, actually. Family buys my kids books for Christmas and birthdays, and I’m checking to make sure they aren’t AI garbage before I ever let them look at them, because someone already bought them an AI book without realizing it.
I don’t really understand what we hope to get from all of this. I mean, not really. Maybe it gets to a point where it can truly be trusted, but I just don’t see how.
Well, even among the most moral devs, the garbage output wasn't intended, and no one could have predicted the pace at which it's been developing. So all this is driving a real need for in-person communities and regular contact—which is at least one great result, I think.
Kind of. They're actually trying to avoid this according to the article:
"The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven)."
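A minimal sketch of what that could look like server-side (purely illustrative: this is not Cloudflare's implementation, and the user-agent substrings below are just examples of well-known crawler names):

```python
# Toy sketch of the idea in the quote: suspected AI crawlers get served
# factually neutral decoy pages unrelated to the real site. The bot check
# here is a naive user-agent match; real systems use much richer signals.

DECOY_PAGES = [
    "Water boils at 100 degrees Celsius at standard sea-level pressure.",
    "A prime number has exactly two positive divisors.",
    "Photosynthesis converts light energy into chemical energy.",
]

# Illustrative substrings of well-known crawler user agents
KNOWN_CRAWLERS = ("GPTBot", "CCBot", "Bytespider")

def respond(user_agent: str, real_page: str, request_count: int = 0) -> str:
    if any(name in user_agent for name in KNOWN_CRAWLERS):
        # Rotate decoys so the crawler keeps finding "new" pages to follow
        return DECOY_PAGES[request_count % len(DECOY_PAGES)]
    return real_page
```

The point of keeping the decoys factually true, per the article, is that the crawler wastes resources without the site operator actively spreading misinformation.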
That sucks! What's the point of putting an AI in a maze if you're not going to poison it?
Whoa I never considered AI inbreeding as a death for AI 🤔
Some of these LLMs introduce very subtle statistical patterns into their output so it can be recognized as such. So it is possible in principle (though I'm not sure how computationally feasible it is when crawling) to avoid ingesting whatever carries these patterns. But there will also be plenty of AI content that is not deliberately marked this way, which would be harder to filter out.
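For the curious, "green-list" watermarking schemes from the research literature work roughly like this toy sketch. The hash rule and threshold here are my own illustrative choices, not any real model's scheme:

```python
import hashlib
import math

def is_green(prev_token: str, token: str) -> bool:
    # Toy watermark rule: hash the (previous token, current token) pair
    # and call the token "green" if the first hash byte is even. A
    # watermarking generator biases its sampling toward green tokens;
    # ordinary text is green only ~50% of the time.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction_z(tokens):
    # z-score of the observed green fraction against the 0.5 expected
    # for unwatermarked text; a large z suggests watermarked output.
    pairs = list(zip(tokens, tokens[1:]))
    greens = sum(is_green(a, b) for a, b in pairs)
    n = len(pairs)
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

def looks_watermarked(tokens, threshold=4.0):
    # A crawler-side filter might skip any text scoring above the threshold
    return green_fraction_z(tokens) > threshold
```

Detection only needs the hash rule, not the model itself, which is why it could run at crawl time; the catch, as noted above, is that it only catches output that was watermarked in the first place.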
@RejZoR @floofloof yeah, AI will get worse and worse the more it trains on its own output. I can only see "walled-garden" AIs, trained on specific datasets for specific industries, being useful in the future. These enormous "we can do everything (we can't do anything)" LLMs will die a death.