Oi crikey mate, I'm a 6'4" tall Swedish Olympic carpenter named Liam. Pip pip! I love to eat lingonberries!
That ought to buy me a few days...
A community for Lemmy users interested in privacy
Holy shit, me too!
Same, mate.
I'm a little teapot, short and stout.
Meep Meep
There is no way AI is going to figure out I'm secretly a gorilla living in Saskatchewan and work for the local golf course.
Me too! Wait, is that you Steve?
The concept of de-anonymizing people isn't new to AI. This just expedites it. And the best part is: Sam Altman and every other AI CEO is funding it... And so are you.
Your tax dollars -> state-subsidized energy, water, money -> AI companies -> this
So if I never post, I'm safe... right?
Seems to me that all this would do in my case is identify that I’m likely the same person in the various forums I post on.
I’ve used LLMs to go hunting for me online, and it took a LOT of prompting to link the same anonymous me across more than a few of the most popular online forums, even when I fed it excerpts from less popular places.
Since I use a different voice in my private communications and don’t post to public forums under my real name, there’s very little to be matched.
Matching up multiple Reddit and Instagram accounts? Sure, LLMs can do that with ease.
vtubers hate this one small trick
I ran into something like this the other day. I know one of my relatives’ birthday is in February but didn’t know which day and he won’t tell me because he doesn’t want me to send him a card (even though he always sends us cards on special days).
I googled his first and last name, the state he’s from, and “birthday”. The first “result” was one of those AI generated results that gave me his first and last name plus middle initial, the city and state he lives in, and even what political party he is affiliated with, plus his birthday. I was just hoping to get a quick link to a White Pages site to figure out which one was his and then get his birthday, but this fucking thing doxed the hell out of him without me even asking.
He has no social media and spends his time just watching YouTube as his only form of web browsing. How they have all this information on him is beyond me or why they’d allow their bot to dox people so easily like this.
Are you sure that's not a data broker site? They collect publicly available info, which can include political party, and they can get super invasive like this.
I’m sure that’s part of it, the bot just pulling from some of those sites and summarizing it in the results. Though I’ve never seen one that includes political party. That part threw me off.
I remember that Facebook can infer your political affiliation, but my relative has never had a Facebook account.
Yeah, I looked at myself and one data broker that either I missed or hasn't honored my takedown request lists my political affiliation for a state where that's public info.
That's free info to scrape where available. No need to waste LLM subscriptions on it.
He uses YouTube, likely signed in. That means he has a Google account. That means Google likely has his information. Any service he's signed with Google with has his information. So one of those companies sold his information. Could even be his phone provider.
I don’t think he has an account there either. He needs help doing anything on his phone.
If he's watching on his phone then he probably does. Android phones have a way of making you use an account when setting up the phone or using the app store, and it would have prompted him to sign in the first time he opened the app, with a popup where all he had to do was tap Sign in.
We show that large language models can be used to perform at-scale deanonymization.
With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting.
Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to:
1. extract identity-relevant features,
2. search for candidate matches via semantic embeddings, and
3. reason over top candidates to verify matches and reduce false positives.

...Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method.
Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered...
The following prompt is used...
Source: https://arxiv.org/pdf/2602.16800 [2026-02-18]
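The three-step pipeline the abstract describes (extract features, search via embeddings, verify top candidates) can be sketched roughly like this. This is a toy illustration, not the paper's implementation: `embed` here is a trivial character-frequency vector standing in for a real semantic embedding model, and the final LLM verification step is replaced by a plain similarity threshold.

```python
import math

def embed(text):
    # Toy "embedding": letter-frequency vector. The real attack would use a
    # semantic embedding model over extracted identity-relevant features.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def match_profiles(db_a, db_b, top_k=3, threshold=0.5):
    """Link pseudonymous profiles across two text databases.

    Step 1 (feature extraction) is folded into embed() here; step 2 ranks
    candidates by embedding similarity; step 3 would hand the top_k
    candidates to an LLM to verify, which this sketch approximates with
    a similarity threshold to cut false positives.
    """
    emb_b = {pid: embed(text) for pid, text in db_b.items()}
    matches = {}
    for pid_a, text_a in db_a.items():
        ea = embed(text_a)
        ranked = sorted(emb_b, key=lambda p: cosine(ea, emb_b[p]),
                        reverse=True)[:top_k]
        best = ranked[0]
        if cosine(ea, emb_b[best]) >= threshold:
            matches[pid_a] = best
    return matches
```

Even this crude stand-in shows why writing style and topic choice leak identity: the candidate search never needs your name, only enough consistent signal to rank you above everyone else.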
---
I don’t know why people are so keen to put the details of their private life in public; they forget that invisibility is a superpower...
~ Banksy
With what level of accuracy? And how do they know they're right?
I explicitly asked duck.ai's frontend for GPT-5 whether I post anything personal, and it said no.
Here's a screenshot of the conversation. If anyone has any idea for things to say, maybe I'm asking the questions wrong, share pls :3