this post was submitted on 04 Aug 2025
55 points (100.0% liked)

Technology

Share interesting Technology news and links.

Rules:

  1. No paywalled sites at all.
  2. News articles must be recent, no older than 2 weeks (14 days).
  3. No videos.
  4. Post only direct links.

To encourage original sources and keep this space as free of commercial content as I can, the following websites are blacklisted:

More sites will be added to the blacklist as needed.

Encouraged:

[–] TomasEkeli@programming.dev 5 points 1 day ago

A sure sign that they are a nefarious company.

[–] kayohtie@pawb.social 2 points 22 hours ago

Perplexity's firing back assumes website owners distinguish between automated scraping and on-demand scraping.

I don't think most people make that distinction.

And that falls in line perfectly with the typical "assumption of access" all of these "AI" companies make.

[–] beeng@discuss.tchncs.de 3 points 1 day ago

Perplexity fired back in their blog.

Pretty tasty.

[–] some_guy@lemmy.sdf.org 14 points 1 day ago

In other words, they’re assholes.

[–] CarbonatedPastaSauce@lemmy.world 9 points 1 day ago (1 children)

The only surprising thing to me from this article is that OpenAI actually follows the rules for bot crawlers.

[–] 0_o7@lemmy.dbzer0.com 5 points 1 day ago (1 children)

Or they haven't been caught yet.

The article explains that PerplexityBot respects robots.txt, but that Perplexity then sends a second request from a different IP with a different user-agent. OpenAI could very well be using a different method to get around the rules.

The article explains how they tested for that, and as far as they could tell OpenAI is respecting the rules.
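
For anyone wondering why a changed user-agent gets around robots.txt: the rules are matched against whatever name the client declares, so they only bind crawlers that identify themselves honestly. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the example.com URL, and the browser-style user-agent string are all made up for illustration and are not taken from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the declared crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/page"  # placeholder URL

# The crawler's declared name matches the Disallow rule.
declared = is_allowed(ROBOTS_TXT, "PerplexityBot", URL)

# A generic browser-style string (hypothetical) falls through to the "*" rule.
generic = is_allowed(
    ROBOTS_TXT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)", URL
)

print(declared, generic)
```

The same URL is blocked under one name and allowed under the other, which is why a site owner has to look at things other than the user-agent (IP ranges, request patterns) to notice this kind of behaviour.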