Why AI writing is so generic, boring, and dangerous: Semantic ablation (go.theregister.com)

submitted 2 days ago by happybadger@hexbear.net to c/technology@hexbear.net

cross-posted from: https://ibbit.at/post/178862

Just as the community adopted the term "hallucination" to describe additive errors, we must now codify its far more insidious counterpart: semantic ablation.

Semantic ablation is the algorithmic erosion of high-entropy information. Technically, it is not a "bug" but a structural byproduct of greedy decoding and RLHF (reinforcement learning from human feedback).

During "refinement," the model gravitates toward the center of the Gaussian distribution, discarding "tail" data – the rare, precise, and complex tokens – to maximize statistical probability. Developers have exacerbated this through aggressive "safety" and "helpfulness" tuning, which deliberately penalizes unconventional linguistic friction. It is a silent, unauthorized amputation of intent, where the pursuit of low-perplexity output results in the total destruction of unique signal.
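A toy sketch of the decoding half of that claim, assuming a made-up next-token distribution (the tokens and probabilities below are hypothetical, not taken from any real model). It only shows that greedy decoding can never emit a "tail" token, while plain sampling sometimes does; it says nothing about RLHF.

```python
import random

# Hypothetical next-token distribution; tokens and probabilities are invented.
NEXT_TOKEN_PROBS = {
    "good": 0.55,
    "solid": 0.30,
    "robust": 0.10,
    "sinewy": 0.04,
    "ciccia": 0.01,
}


def greedy_pick(probs):
    """Greedy decoding: always return the single most probable token."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng):
    """Plain sampling: draw a token in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs behave the same
greedy_outputs = {greedy_pick(NEXT_TOKEN_PROBS) for _ in range(1000)}
sampled_outputs = {sample_pick(NEXT_TOKEN_PROBS, rng) for _ in range(1000)}

print(greedy_outputs)        # only the top token ever appears
print(len(sampled_outputs))  # sampling also reaches lower-probability tokens
```

Under greedy decoding the 1% token is unreachable no matter how many times you ask, which is the "discarding the tail" effect in miniature.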

When an author uses AI for "polishing" a draft, they are not seeing improvement; they are witnessing semantic ablation. The AI identifies high-entropy clusters – the precise points where unique insights and "blood" reside – and systematically replaces them with the most probable, generic token sequences. What began as a jagged, precise Romanesque structure of stone is eroded into a polished, Baroque plastic shell: it looks "clean" to the casual eye, but its structural integrity – its "ciccia" – has been ablated to favor a hollow, frictionless aesthetic.

We can measure semantic ablation through entropy decay. By running a text through successive AI "refinement" loops, the vocabulary diversity (type-token ratio) collapses. The process performs a systematic lobotomy across three distinct stages:

Stage 1: Metaphoric cleansing. The AI identifies unconventional metaphors or visceral imagery as "noise" because they deviate from the training set's mean. It replaces them with dead, safe clichés, stripping the text of its emotional and sensory "friction."

Stage 2: Lexical flattening. Domain-specific jargon and high-precision technical terms are sacrificed for "accessibility." The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym, effectively diluting the semantic density and specific gravity of the argument.

Stage 3: Structural collapse. The logical flow – originally built on complex, non-linear reasoning – is forced into a predictable, low-perplexity template. Subtext and nuance are ablated to ensure the output satisfies a "standardized" readability score, leaving behind a syntactically perfect but intellectually void shell.
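The type-token ratio mentioned before the three stages is easy to compute yourself. A minimal sketch, assuming a naive regex tokenizer and two invented sample sentences (neither comes from the article); it also computes word-level Shannon entropy as one possible reading of "entropy decay". Note that type-token ratio is sensitive to text length, so compare texts of similar size.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Distinct words divided by total words; lower means more repetition."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def word_entropy_bits(text):
    """Shannon entropy of the word frequency distribution, in bits per word."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical before/after texts, invented for illustration only.
draft = "the jagged stone arch groans under its own stubborn weight"
polished = (
    "the building is good and the building is strong "
    "and the building is nice"
)

for label, text in (("draft", draft), ("polished", polished)):
    print(label, type_token_ratio(text), round(word_entropy_bits(text), 2))
```

To test the article's claim you would run the same two functions on each round of an actual "refinement" loop and watch whether the numbers fall.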

The result is a "JPEG of thought" – visually coherent but stripped of its original data density through semantic ablation.
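To put a rough number on the lost "data density" in the Stage 2 swap: if you read "1-of-10,000" and "1-of-100" as token probabilities (that reading is an assumption, the article doesn't define it), the information carried by each word is its surprisal, `-log2(p)`.

```python
import math


def surprisal_bits(probability):
    """Information content of an event with the given probability, in bits."""
    return -math.log2(probability)


rare = surprisal_bits(1 / 10_000)   # the precise, rare term
common = surprisal_bits(1 / 100)    # the generic synonym

print(f"{rare:.2f} bits vs {common:.2f} bits")  # 13.29 bits vs 6.64 bits
```

Under that reading, the substitution halves the information carried by that one word.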

If "hallucination" describes the AI seeing what isn't there, semantic ablation describes the AI destroying what is. We are witnessing a civilizational "race to the middle," where the complexity of human thought is sacrificed on the altar of algorithmic smoothness. By accepting these ablated outputs, we are not just simplifying communication; we are building a world on a hollowed-out syntax that has suffered semantic ablation. If we don't start naming the rot, we will soon forget what substance even looks like.

[–] miz@hexbear.net 7 points 1 day ago* (last edited 1 day ago) (1 children)

what's a go-to line of questioning that makes it shit the bed

[–] KuroXppi@hexbear.net 2 points 6 hours ago* (last edited 6 hours ago) (1 children)

I watched this series with a guy asking LLMs to count to 100:

https://www.youtube.com/watch?v=5ZlzcjnFKvw

If it can fail at something so obvious, why would anyone trust it with anything they don't understand, where the mistakes will definitely be there but they can't see them?

It's like if someone lied straight to your face about stealing ten dollars, and then you trusted them to do your taxes.

(Note: even when it does manage to count (non-sequentially) to 100, it still fails because it repeats some numbers. On a surface level someone may look at the output, see 100 in the final place, and assume it was correct throughout. They'll pat themselves on the back and say 'good on me for verifying' while the error is carried forward. So even when it's ostensibly right it can still be wrong. I'm sure you know this, but this is how I'll break it down next time someone asks me to use an LLM to do maths.)
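Here's roughly what the difference between "checked the last number" and "actually verified it" looks like. A quick sketch, assuming the LLM output has already been parsed into a list of ints; `check_count` and the sample output are made up for illustration, not taken from the video.

```python
from collections import Counter


def check_count(numbers, upper=100):
    """Compare a claimed count from 1 to `upper` against the real sequence."""
    expected = list(range(1, upper + 1))
    counts = Counter(numbers)
    return {
        # The lazy check: does it end on the right number?
        "ends_at_upper": bool(numbers) and numbers[-1] == upper,
        # The real check: is every number present, once, in order?
        "exact_sequence": list(numbers) == expected,
        "repeated": sorted(n for n, c in counts.items() if c > 1),
        "missing": sorted(set(expected) - set(numbers)),
    }


# Hypothetical LLM output: 42 appears twice, 43 is skipped, yet it ends at 100.
llm_output = list(range(1, 43)) + [42] + list(range(44, 101))

report = check_count(llm_output)
print(report)
```

The fake output even has exactly 100 entries, so checking the length and the final number both pass while the sequence is still wrong.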

[–] HexReplyBot@hexbear.net 2 points 6 hours ago* (last edited 6 hours ago)

I found a YouTube link in your comment. Here are links to the same video on alternative frontends that protect your privacy: