Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.
Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.
The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).
Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this. Also, hope you had a wonderful Valentine's Day!)
Tante.cc writes about Cory using a 'Drunk Uncle' style argument to defend his LLM usage (and go after the left using strawmen).
(To counter one of Cory's arguments: if disliking LLMs were just about the people who run them, the people against them would have stayed in sneerclub.)
That was a good read.
Cory Doc wrote:
Equating what LLMs do, and what goes into LLM web scraping, with "a search engine" is messed up. The article he links about scraping is mostly about how badly copyright works, and how analysing trade-secret-walled data can be beneficial to both consumers and science but occasionally bad for citizen privacy. You'll recognize that as mostly irrelevant to the concerns people tend to have about LLM training data providers DDoSing the fuck out of everything.
Cory also provides this anecdote:
what the actual shit
I was a bit alarmed by this: a client brought in that Colombia data for their dissertation last month and did not mention it. I looked up the paper, https://www.arxiv.org/abs/2509.04523, and what they /actually/ did was use GPT 4o-mini only for feature extraction, then stack those features into a random forest in a supervised setting to dedupe. This is very different from what he described. And the GPT features weren't even the most important ones: the RF preferred cosine similarity of articles, a decidedly not-large approach...
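For anyone who hasn't met it, here's a rough sketch of what a "cosine similarity of articles" feature can look like. To be clear, this is not the paper's code: the bag-of-words representation and the names `tokenize` and `cosine_similarity` are my own assumptions for illustration, and the authors may well have used a different text representation. In a supervised dedupe setup, a score like this would just be one column among the per-pair features handed to the classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(article_a, article_b):
    # Bag-of-words term counts for each article (an assumed representation).
    counts_a = Counter(tokenize(article_a))
    counts_b = Counter(tokenize(article_b))

    # Dot product over the terms the two articles share.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)

    # Product of the two vector lengths; zero if either article has no tokens.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    norm = norm_a * norm_b

    return dot / norm if norm else 0.0
```

No model calls, no GPUs, just word counts and a square root, which is rather the point.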
That he went from that all the way to "it's mostly OK when Sam Altman steals all your data, misrepresents it and then steals all your traffic" is... bad.
At any rate, it's definitely good to know that the war crime forensics data project isn't quite the unintentional shambles Cory makes it out to be.