this post was submitted on 23 Mar 2026
321 points (98.8% liked)

Technology

[–] errer@lemmy.world 1 points 22 hours ago (3 children)

Wikipedia probably wants to sell LLM developers access to its content for training. That access is only valuable if Wikipedia remains a high-quality, slop-free source.

I think even AI zealots agree there should be silos of training content that are fully human generated. Training slop on slop makes the slop even worse.

[–] Grimy@lemmy.world 12 points 21 hours ago (1 children)

Sell licenses of what? It's already all under Creative Commons, iirc.

[–] Zagorath@quokk.au 3 points 17 hours ago

The content is CC licensed, but they are trying to block AI scraping because it overloads their servers. They have a paid API that uses a lot less compute for both Wikipedia and the AI, as well as being a revenue source for Wikipedia.

[–] SuspciousCarrot78@lemmy.world 7 points 21 hours ago

AI already trains on Wikipedia.

https://commoncrawl.org/

[–] MountingSuspicion@reddthat.com 5 points 20 hours ago

This was only done because the editors pushed to minimize AI involvement. There's a comment here already mentioning that: https://lemmy.world/comment/22826863