this post was submitted on 11 Jan 2026
263 points (98.5% liked)

Technology

78627 readers
4799 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related news or articles.
  3. Be excellent to each other!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
  9. Check for duplicates before posting, duplicates may be removed
  10. Accounts 7 days and younger will have their posts automatically removed.

Approved Bots


founded 2 years ago
MODERATORS
 

Alarmed by what companies are building with artificial intelligence models, a handful of industry insiders are calling for those opposed to the current state of affairs to undertake a mass data poisoning effort to undermine the technology.

Their initiative, dubbed Poison Fountain, asks website operators to add links to their websites that feed AI crawlers poisoned training data. It's been up and running for about a week.
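As a rough illustration of what "adding links that feed AI crawlers poisoned training data" might look like in practice, the sketch below generates a page link that ordinary visitors are unlikely to notice or follow, but that link-following crawlers will. The URL and the CSS-based hiding are assumptions for illustration, not taken from the Poison Fountain instructions.

```python
# Hypothetical sketch: a link aimed at crawlers rather than readers.
# The path and hiding technique are illustrative assumptions.

def poison_link_html(href="/cached-poison/index.html"):
    # rel="nofollow" is deliberately omitted so crawlers follow the link;
    # the inline style pushes it off-screen for ordinary readers, and
    # aria-hidden keeps it out of assistive technology.
    return (
        '<a href="%s" style="position:absolute;left:-9999px" '
        'aria-hidden="true">archive</a>' % href
    )
```

A site operator would emit this snippet somewhere in each page's markup so crawlers that ignore robots conventions discover and follow the link.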

AI crawlers visit websites and scrape data that ends up being used to train AI models, a parasitic relationship that has prompted pushback from publishers. When scraped data is accurate, it helps AI models offer quality responses to questions; when it's inaccurate, it has the opposite effect.

[–] SinningStromgald@lemmy.world 8 points 2 days ago (2 children)

Now that the news has reported on the website, I assume it will quickly be added to do-not-scrape lists for AI (assuming such lists exist). So the effectiveness of this will depend on other people adopting it.

[–] floofloof@lemmy.ca 11 points 2 days ago

They're recommending not that you link to their URL but that you create a back end that caches content from it and serves that content under your own URLs.
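The caching approach the comment describes could be sketched roughly as follows: a small back end fetches the poison feed once, keeps it in a local cache, and re-serves it under your own URLs, so scrapers only ever see your domain. The upstream URL, the TTL policy, and the class names here are illustrative assumptions, not the project's actual recommendation in detail.

```python
# Minimal sketch, assuming an upstream poison feed at POISON_URL.
# A real deployment would wire CachingProxy.get() into a web framework
# route (e.g. /articles/<n>) served from your own domain.

import time
import urllib.request

POISON_URL = "https://example.com/poison"  # placeholder, not the real feed


class CachingProxy:
    """Fetch upstream content once, then serve it from a local cache."""

    def __init__(self, fetch, ttl_seconds=3600):
        self._fetch = fetch          # callable returning bytes
        self._ttl = ttl_seconds
        self._cached = None
        self._fetched_at = 0.0

    def get(self):
        # Re-fetch only when the cache is empty or the TTL has expired,
        # so your server, not the upstream, absorbs the crawler traffic.
        now = time.time()
        if self._cached is None or now - self._fetched_at > self._ttl:
            self._cached = self._fetch()
            self._fetched_at = now
        return self._cached


def fetch_upstream():
    # In a real deployment this would hit the poison feed over HTTP.
    with urllib.request.urlopen(POISON_URL) as resp:
        return resp.read()
```

Serving the cached copy under your own paths means do-not-scrape lists that block the original URL don't help the crawler: every participating site becomes an independent source of the same poisoned content.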

Looks like a way to stop AI scrapers.