this post was submitted on 28 Aug 2025
Technology
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
This seems like it pretty much sums things up from my experience.
We're encouraged (cough, required, cough) to use LLMs at work. So I tried.
There are things they can do. Sometimes. But you know what they can't do? Be liable for a fuck up.
When I ask a coworker a question, if they confidently answer wrong, they fucked up, not me. When I ask an LLM? The LLM isn't liable, it's me for not verifying it. If I'm verifying anyway, why am I using the LLM?
They fuck up often enough that I can't put my credibility on the line over speedy slop. People at work consider me to be a good programmer (don't ask me how, I guess the bar is low lol). Imagine if my code was just whatever an LLM shat out. It'd be the same exact quality as all of my other coworkers who use whatever their LLM shat out. No difference in quality.
And we would all be liable when the LLMs fucked up. We would learn something. We would, not the LLM. And the LLM will make the same exact fuck up the next time.
I'm gonna take this comment, blow it up to poster size, and put it in my office, right in front of my webcam so I can watch my boss squint trying to read it.
Validating output should be much easier than generating it yourself. P≠NP.
This is especially true in contexts where the LLM provides citations. If the AI is good, then all you need to do is check the citations. (Most AI tools are shit, though; avoid any that can't provide good, accurate citations when applicable.)
Consider that all scientific papers go through peer review, and any decent-sized org will have regular code reviews as well.
From the perspective of a senior software engineer, validating code that could very well be ruinously bad is nothing new. Validation and testing are required whether it was written by an LLM or some dude who spent two weeks at a coding "boot camp".
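To make the P≠NP remark concrete, here's a rough sketch using subset sum, a textbook NP-complete problem. The function names and the sample numbers are made up for illustration. Keep in mind that P≠NP is a widely believed conjecture, not a proven result, and the sketch only shows the check-versus-search gap for this one problem; it doesn't say anything about how hard real code is to review.

```python
from itertools import combinations


def verify_subset_sum(numbers, target, candidate):
    """Check a proposed answer: one pass over the candidate, no search."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False  # candidate uses a value that isn't available
        pool.remove(x)    # each input value may be used only once
    return sum(candidate) == target


def solve_subset_sum(numbers, target):
    """Find an answer by brute force: up to 2**len(numbers) subsets."""
    for size in range(len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


numbers = [3, 34, 4, 12, 5, 2]  # hypothetical input
answer = solve_subset_sum(numbers, 9)            # expensive: searches subsets
print(verify_subset_sum(numbers, 9, answer))     # cheap: checks one candidate
```

Checking a candidate takes time roughly proportional to its size, while the brute-force search can double with every extra input number.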
This is very much not true in some domains, like software development. Code is much harder to read than it is to write, so verifying the output of a coding AI usually takes more time (or at least more cognitive effort) than if you'd just written the code yourself.
If the AI is writing ALL the code for an entire application it would be a problem, but as an assistant to a programmer, if it spits out a single line or even a small function, you can read it over very quickly to validate it before moving on to the next component.
This isn't how we're being asked to use it. People are doing demos about how Cursor or whatever did the bootstrapping and entire POC for them. And we already know there's nothing more permanent than a POC.
This is exactly how most developers are being asked to use it; it's literally how most of the IDE integrations work.
[citation needed]
At work, we constantly get emails, demos, etc. about how they're using AI to generate everything from UI designs (v0) to starter projects and how they manage these huge prompts and reference docs for their agents.
Copilot's line-by-line suggestions are also being pushed, but they care more about the "agentic" stuff.
I watch coworkers regularly ask it to "add X route to the API" or "make a simple UI that calls Y API". They are asking it to do their work.
I have to review these PRs. They come in at an incredible rate, and almost always conflict with each other. I can't review them fast enough to still do my work.
Also, we get AI-generated code reviews at work. I have to talk to a chatbot to get help from HR. Some search bars have been replaced with chatbots. It's everywhere and I'm getting sick of it.
I just want real information from informed people. I want to review code that a human did their best to produce. I want to be able to help people improve their skills, not just their prompts.
I'm getting to the point where I'm going to start calling people out if their chatbot/agent/LLM/whatever produces slop. I'm going to give them ownership of it. It's their output, not the AI's.
Edit: I should add that it's a big company (100k+ employees)
Yeah, that's true for a subset of code. But for others, the hardest parts happen in the brain, not in the files. Writing readable code is very very important, especially when you are working with larger teams. Lots of people cut corners here and elsewhere in coding, though. Including, like, every startup I've ever seen.
There's a lot of gruntwork in coding, and LLMs are very good at the gruntwork. But coding is also an art and a science and they're not good at that at high levels (same with visual art and "real" science; think of the code equivalent of seven deformed fingers).
I don't mean to hand-wave the problems away. I know that people are going to push the limits far beyond reason, and I know it's going to lead to monumental fuckups. I know that because it's been true for my entire career.