this post was submitted on 22 Dec 2025

TechTakes


Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many "esoteric" right-wing freaks, but there's no appropriate sneer-space for them. I'm talking redscare-ish, reality-challenged "culture critics" who write about everything but understand nothing. I'm talking about reply-guys who make the same 6 tweets about the same 3 subjects. They're inescapable at this point, yet I don't see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this. Merry Christmas, happy Hanukkah, and happy holidays in general!)

[–] lagrangeinterpolator@awful.systems 12 points 1 week ago (15 children)

AI researchers are rapidly embracing AI reviews, with the new Stanford Agentic Reviewer. Surely nothing could possibly go wrong!

Here's the "tech overview" for their website.

Our agentic reviewer provides rapid feedback to researchers on their work to help them to rapidly iterate and improve their research.

The inspiration for this project was a conversation that one of us had with a student (not from Stanford) that had their research paper rejected 6 times over 3 years. They got a round of feedback roughly every 6 months from the peer review process, and this commentary formed the basis for their next round of revisions. The 6 month iteration cycle was painfully slow, and the noisy reviews — which were more focused on judging a paper's worth than providing constructive feedback — gave only a weak signal for where to go next.

How is it that whenever people try to argue for the magical benefits of AI on a task, it always comes down to "well actually, humans suck at the task too! Look, humans make mistakes!" That seems to be the only way they can justify the fact that AI sucks. At least it spews garbage fast!

(Also, this is a little mean, but if someone's paper got rejected 6 times in a row, perhaps it's time to throw in the towel, accept that the project was never that good in the first place, and try better ideas. Not every idea works out, especially in research.)

When modified to output a 1-10 score by training to mimic ICLR 2025 reviews (which are public), we found that the Spearman correlation (higher is better) between one human reviewer and another is 0.41, whereas the correlation between AI and one human reviewer is 0.42. This suggests the agentic reviewer is approaching human-level performance.

Actually, all my concerns are now completely gone. They found that one number is bigger than another number, so I take back all of my counterarguments. I now have full faith that this is going to work out.
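(For anyone wondering what that 0.41-vs-0.42 number actually measures: Spearman correlation just compares the *rank orderings* two reviewers assign to the same papers. Here's a minimal pure-Python sketch — the scores below are made up for illustration, not anything from the Stanford data.)

```python
# Spearman rank correlation between two reviewers' 1-10 scores for the
# same set of papers. Scores are invented for illustration only.

def ranks(xs):
    # Assign 1-based ranks, averaging ranks over ties.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    # Spearman = Pearson correlation of the two rank vectors.
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

reviewer_a = [6, 3, 8, 5, 7, 2, 9, 4]  # hypothetical human scores
reviewer_b = [5, 4, 7, 6, 8, 3, 7, 2]  # hypothetical second reviewer
print(round(spearman(reviewer_a, reviewer_b), 2))
```

A correlation of 0.41 between two *humans* means review scores are already quite noisy, which is exactly why "the AI matches one human reviewer" is a much weaker claim than it sounds.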

Reviews are AI generated, and may contain errors.

We had built this for researchers seeking feedback on their work. If you are a reviewer for a conference, we discourage using this in any way that violates the policies of that conference.

Of course, we need the mandatory disclaimers that will definitely be enforced. No reviewer will ever be a lazy bum and use this AI for their actual conference reviews.

[–] blakestacey@awful.systems 8 points 1 week ago (1 children)

the noisy reviews — which were more focused on judging a paper’s worth than providing constructive feedback

dafuq?

[–] lagrangeinterpolator@awful.systems 7 points 1 week ago* (last edited 1 week ago)

Yeah, it's not like reviewers can just write "This paper is utter trash. Score: 2" unless ML is somehow an even worse field than I previously thought.

They referenced someone who had a paper get rejected from conferences six times, which to me is an indication that their idea just isn't that good. I don't mean this as a personal attack; everyone has bad ideas. It's just that at some point you have to cut your losses with a bad idea and instead use your time to develop better ones.

So I am suspicious that when they say "constructive feedback", they don't mean "how do I make this idea good" but instead "what are the magic words that will get my paper accepted into a conference". ML has become a cutthroat publish-or-perish field, after all. It certainly won't help that LLMs are effectively trained to glaze the user at all times.
