this post was submitted on 26 Apr 2026
199 points (97.2% liked)

Technology

[–] partofthevoice@lemmy.zip 4 points 1 week ago* (last edited 1 week ago)

They aren’t using LLMs to do the spying.

LLMs function because we now have a technology that can operate in a space of extremely high mathematical abstraction. Consider for a moment what you already know about LLMs: they’re trained on massive amounts of text, yet fundamentally they operate by predicting the next token (roughly, the next word) in a sequence.
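
To make “predicting the next token” concrete, here’s a toy sketch. This is not how real LLMs work internally (they use neural networks, not frequency tables), but the training objective is the same idea: learn from text which token tends to follow which, then emit the most plausible continuation.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """For each token, count which token follows it and how often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = counts.get(token)
    return followers.most_common(1)[0][0] if followers else None

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the mouse",
]
model = train_bigram(corpus)
print(predict_next(model, "sat"))  # prints "on"
```

A real LLM replaces the lookup table with a network that generalizes to sequences it has never seen, but the output is still “the most plausible next token,” not “the truth.”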

An LLM is what you get when you use this method of information processing on natural language.

What if you instead train it to fingerprint user identities based on web behavior? In that case it doesn’t even output language; it becomes a different tool built on the same fundamental information-processing methodology.
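
As a miniature illustration of what behavioral fingerprinting could look like (a toy sketch with made-up event logs; a real system would use learned embeddings over far richer signals, not hand-counted pairs): represent each user as a frequency vector over consecutive browsing events, then match an anonymous session to the closest known profile.

```python
import math
from collections import Counter

def fingerprint(events):
    """Frequency vector over consecutive event pairs (bigrams)."""
    return Counter(zip(events, events[1:]))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical event logs: page categories visited, in order.
known_users = {
    "user_a": ["news", "forum", "news", "video", "forum", "news"],
    "user_b": ["shop", "shop", "video", "shop", "video", "shop"],
}
anonymous_session = ["news", "forum", "news", "forum", "news"]

profiles = {u: fingerprint(ev) for u, ev in known_users.items()}
target = fingerprint(anonymous_session)
best = max(profiles, key=lambda u: cosine(profiles[u], target))
print(best)  # prints "user_a"
```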

What if you train a system to automate semantic analysis, which is much simpler than an LLM? Give it categories like “leftist activist” and see what kind of lists it can compile after processing the likes, shares, replies, and views of every Reddit user that has ever existed. What if you then cross-associate users by writing style, so it can roughly match your old Reddit account with your new Lemmy one, or maybe even your really old Facebook with your old Reddit? What if they further augment that with ISP data that helps really drive these points home?
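
A minimal sketch of the list-building idea, with hypothetical categories and a crude keyword tally standing in for real semantic analysis (a production system would classify embeddings of whole posting histories, not count words):

```python
from collections import defaultdict

# Hypothetical category keywords; a real system would use a trained
# classifier, not a keyword list.
CATEGORIES = {
    "union_organizer": {"strike", "union", "picket"},
    "crypto_trader": {"bitcoin", "hodl", "altcoin"},
}

def build_lists(comments_by_user):
    """Tally category keyword hits per user; return membership lists."""
    lists = defaultdict(list)
    for user, comments in comments_by_user.items():
        words = " ".join(comments).lower().split()
        for category, keywords in CATEGORIES.items():
            hits = sum(1 for w in words if w in keywords)
            if hits >= 2:  # arbitrary threshold for this sketch
                lists[category].append(user)
    return dict(lists)

comments = {
    "alice": ["the strike starts monday", "join the union"],
    "bob": ["bitcoin is down", "still gonna hodl"],
    "carol": ["nice weather today"],
}
print(build_lists(comments))
# prints {'union_organizer': ['alice'], 'crypto_trader': ['bob']}
```

The unsettling part isn’t the sophistication of any one step; it’s that each step is cheap enough to run over everyone at once.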

What if they no longer need tens of thousands of analysts to do this kind of thing for every single American citizen? Something previously dismissed as intractable, not worth considering outside conspiracy circles, might now require only a large enough data center. Surely it doesn’t require a data center with a ballroom on top, but that’s more architectural than anything else.

Edit: let me be clearer about something. LLMs don’t predict the truth; LLMs predict the next token. That said, they do a really damn good job. Hallucinations are a problem of aligning that good job with our expectation of truth, which is a different issue. So when you consider the effectiveness of this “spying technology,” do so by comparing it to an LLM’s ability to “sound right,” not to “be right.”