this post was submitted on 12 Aug 2025
32 points (100.0% liked)

technology


protect against ai scraping that they can't monetize even though it uses none of their own server time

[–] P1d40n3@hexbear.net 17 points 4 months ago

But not LLM training 🤔

[–] ThermonuclearEgg@hexbear.net 14 points 4 months ago (1 children)

This could potentially destroy existing archived data

[–] LargeAdultRedBook@hexbear.net 1 points 4 months ago (1 children)

How so? Do archive services not also archive content from linked CDNs?

[–] ThermonuclearEgg@hexbear.net 2 points 4 months ago* (last edited 4 months ago)

Maybe I'm mistaken, but I have heard the Internet Archive applies robots.txt retroactively

[–] FanofOatmeal@hexbear.net 2 points 4 months ago

don't they already?

whenever I looked for old reddit threads on internet archive they never showed up.