this post was submitted on 29 Jan 2026
59 points (100.0% liked)

Fuck AI


"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

AI, in this case, refers to LLMs, GPT technology, and anything listed as "AI" meant to increase market valuations.

founded 2 years ago

Today I set up a little project website on a new subdomain. It's not a www subdomain or a newly registered domain, which is easy to detect. We're talking about:

randomchars.mydomain.com

Within 20 minutes, the Anthropic ClaudeBot was on it. I could tell because the nginx access log showed a hit to robots.txt and then a handful of pages.
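For anyone who wants to check their own logs for the same thing, here's a minimal sketch. It assumes nginx's default "combined" log format; the sample log lines and the `bot_hits` helper are invented for illustration:

```python
import re

# Matches the request path and User-Agent field of nginx's default
# "combined" log format.
LOG_RE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d+ \d+ "[^"]*" "(?P<ua>[^"]*)"'
)

def bot_hits(lines, needle="ClaudeBot"):
    """Yield (path, user_agent) for every hit whose User-Agent mentions `needle`."""
    for line in lines:
        m = LOG_RE.search(line)
        if m and needle in m.group("ua"):
            yield m.group("path"), m.group("ua")

# Invented sample lines mimicking the scenario in the post:
sample = [
    '1.2.3.4 - - [29/Jan/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 42 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '5.6.7.8 - - [29/Jan/2026:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux)"',
]
for path, ua in bot_hits(sample):
    print(path, ua)
```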

First off, how the hell did they find it? Next, is my DNS provider, Amazon Route 53, selling this kind of data now? Or is there some kind of DNS wildcard query?

top 10 comments
[–] Courantdair@jlai.lu 27 points 1 month ago

SSL certificate could leak subdomains depending on how it's configured. Otherwise I wouldn't be surprised if Amazon sold this info within 20min

[–] psycotica0@lemmy.ca 9 points 1 month ago (1 children)

What's that thing Google is pushing, where the CAs basically push a list of all the certs they issue? Is that live? Maybe Amazon issued you a key, and then published it in a list of "domains I've issued keys for", and they're just watching that list?

Unless that's not a thing, or not a thing yet, or I'm fully misremembering...

[–] cron@feddit.org 6 points 1 month ago

That thing is called "certificate transparency logs"
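This is the most common explanation for the "found within 20 minutes" effect: CT logs are public, and crt.sh exposes a searchable JSON view of them at `https://crt.sh/?q=%.mydomain.com&output=json`. A sketch of pulling subdomains out of such a response; the parsing helper and sample payload are illustrative, not the actual crt.sh schema guarantee:

```python
import json

def subdomains_from_crtsh(payload: str) -> set[str]:
    """Extract unique hostnames from a crt.sh JSON response.
    Each entry's "name_value" may hold several newline-separated names."""
    names = set()
    for entry in json.loads(payload):
        for name in entry["name_value"].splitlines():
            names.add(name.strip().lstrip("*."))
    return names

# Invented payload mimicking what a CT log search would return
# for the scenario in the post:
sample = json.dumps([
    {"name_value": "randomchars.mydomain.com"},
    {"name_value": "mydomain.com\nwww.mydomain.com"},
])
print(sorted(subdomains_from_crtsh(sample)))
```

Anyone (scrapers included) can watch these logs in near real time, so a fresh certificate for a new subdomain is effectively a public announcement.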

[–] thisbenzingring@lemmy.today 6 points 1 month ago

If you are using any Amazon web services, those IP ranges are all well known. When an IP that was idle starts serving traffic, it gets noticed, and if it isn't already in a collected database, it gets added.

[–] Taldan@lemmy.world 5 points 1 month ago* (last edited 1 month ago) (1 children)

I'm a bit confused about your DNS config. DNS is generally public, that's the point of it

AI scrapers, like most scrapers, just crawl every new DNS entry that gets created

[–] cron@feddit.org 4 points 1 month ago (1 children)

How exactly can you find all subdomains of a given domain?

Sure, it is possible with misconfigured DNSSEC (zone walking), but otherwise I'd say it is not possible.
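For anyone unfamiliar with zone walking: with old-style DNSSEC (NSEC rather than NSEC3), each negative answer names the *next* record in the zone, so you can enumerate every name by following the chain. A toy simulation of that idea; the zone below is invented, and real walking would issue actual NSEC queries (e.g. with a DNS library):

```python
# name -> next name, as a chain of NSEC records would reveal it
nsec_chain = {
    "mydomain.com": "api.mydomain.com",
    "api.mydomain.com": "randomchars.mydomain.com",
    "randomchars.mydomain.com": "www.mydomain.com",
    "www.mydomain.com": "mydomain.com",  # chain wraps back to the apex
}

def walk_zone(chain, apex):
    """Follow the NSEC chain from the apex until it wraps around,
    collecting every name in the zone along the way."""
    names, current = [], apex
    while True:
        names.append(current)
        current = chain[current]
        if current == apex:  # back at the start: whole zone enumerated
            return names

print(walk_zone(nsec_chain, "mydomain.com"))
```

NSEC3 replaces the plain names with hashes precisely to make this enumeration harder.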

[–] ramble81@lemmy.zip 5 points 1 month ago (1 children)

I could see accidentally having AXFR (zone transfer) enabled (not even DNSSEC related) and they transfer your entire zone.
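A zone transfer starts with an ordinary DNS query of type AXFR (252) over TCP; a nameserver that answers it with records instead of REFUSED hands over every name in the zone, subdomains included. A sketch of what that query looks like on the wire (RFC 1035 format; the zone name and transaction ID are placeholders):

```python
import struct

def build_axfr_query(zone: str, txn_id: int = 0x1234) -> bytes:
    """Build the DNS wire-format query that opens a zone transfer (AXFR)."""
    # Header: id, flags=0 (standard query), 1 question, 0 answer/authority/additional
    header = struct.pack(">HHHHHH", txn_id, 0, 1, 0, 0, 0)
    # QNAME: each label length-prefixed, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack(">HH", 252, 1)  # QTYPE=AXFR(252), QCLASS=IN(1)
    return header + question

pkt = build_axfr_query("mydomain.com")
print(len(pkt))
```

In practice you'd just run `dig axfr mydomain.com @your-nameserver` and check that the server refuses it.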

[–] cron@feddit.org 2 points 1 month ago

That's possible, though probably not the most likely explanation (compared to a misconfigured web server or certificate transparency logs).

[–] techconsulnerd@programming.dev 2 points 1 month ago

Perhaps it was crawling a list of IP addresses and your web server also serves the site when accessed by bare IP address (no domain/subdomain needed). You can configure the web server to show a blank page or a 403 error when it is accessed by IP address.
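Since the post mentions nginx, here's one way to do that: a catch-all default server, so requests that arrive by bare IP (or with an unknown Host header) never reach the real site. A sketch, not a drop-in config; `ssl_reject_handshake` needs nginx 1.19.4 or later:

```nginx
# Catch-all: any request whose Host doesn't match a configured
# server_name lands here instead of the real site.
server {
    listen 80 default_server;
    listen 443 ssl default_server;
    server_name _;
    ssl_reject_handshake on;  # don't even present a certificate
    return 444;               # nginx-specific: close the connection
}
```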

[–] SpaceMan9000@lemmy.world 1 points 1 month ago

What's the default in nginx? Did they need to know the actual subdomain? A lot of times you can get it by querying the DNS servers directly, or have certs leak it.