Off My Chest
RULES:
1. The "good" part of our community means we are pro-empathy and anti-harassment. However, we don't intend to make this a "safe space" where everyone has to be a saint. Sh*t happens, and life is messy. That's why we get things off our chests.
2. Bigotry is not allowed. That includes racism, sexism, ableism, homophobia, transphobia, xenophobia, and religiophobia. (If you want to vent about religion, that's fine; but religion is not inherently evil.)
3. Frustrated, venting, or angry posts are still welcome.
4. Posts and comments that bait, threaten, or incite harassment are not allowed.
5. If anyone offers mental, medical, or professional advice here, please remember to take it with a grain of salt. Seek out real professionals if needed.
6. Please put NSFW behind NSFW tags.
I think the only way is through. I used to be quite bullish on ChatGPT (because when it works well, it works really well). It was only after it cost me (time, $$$) that I started treating AI adversarially. That's my default position now.
The issue with LLMs isn't that they lie badly (hallucinate), it's that they lie beautifully and then ask if they can help you with something else. They are memetically hazardous unless constrained.
Funnily enough, one of the benchmarks for LLMs is the bullshit-bench. It shows that some SOTA (state-of-the-art) models are being tuned away from confabulation. Time will tell.
https://petergpt.github.io/bullshit-benchmark/viewer/index.html
As I alluded to in another comment in this thread, the worst I've personally seen were procedures developed that would have had people entering areas that were not just hazardous but incompatible with human life, and performing maintenance on fully energized industrial systems without safety constraints in place. Both cases would have caused fatalities if someone had blindly followed the checklist as written. An internal review caught these mistakes, but they should never have made it that far.
The people designing the procedure checklists missed them possibly because, as you said, AI lies beautifully, but I think it was also because many people seem inclined to trust it over their own judgement and knowledge. These were supervisors with years of direct experience; the red flags should have been instantly obvious. If they'd written it out by hand, the proper order of events would have been almost muscle memory. What made them so careless?
They claimed they just used AI to format and grammar check their work, and I don't have logs to prove or disprove that. But this is more than just a hallucination; it's a lack of reasoning similar to the car wash problem, but with much more severe consequences. TBH I'm not sure even adding specific knowledge of our equipment and facilities would fix it, let alone just a reduction in hallucinations.
On top of that, I've seen a long, long time trend of people who just will not take the time to read and understand the sum total of information needed to safely and correctly perform our work. It's a lot, but we do complicated and dangerous things. They've replaced knowing things with Googling them or searching through documents to find a possibly out-of-context quote. Failed safety and regulatory compliance inspections are far more common because people just don't know what they need to know, despite having all that information at their fingertips. Nothing seems to be processed or retained; it's just sort of gawked at and repeated.
They aren't dumb. I work with them. I know them. It's not just stupidity and it's not just hallucinations. Our tools are using us, and it should always be the other way around. A tool that can't be used, in both the philosophical and literal sense, should be discarded.
I'm not trusting AI anytime soon, and I remain suspicious of everyone until they prove themselves to actually understand what's going on.
I'm willing to reconsider things as technology improves, but I wouldn't bet my 401k on LLMs being worth a shit anytime before I retire.
That sounds less like an AI problem and more like a process failure. Or a people being people problem.
In any safety-critical environment a procedure shouldn't be trusted just because it looks polished - it should be verified step by step by someone who understands the system.
If a checklist could send someone into an environment incompatible with human life and nobody noticed until review, the safety culture already had a hole in it.
AI might widen that hole, but it didn’t create it.
Tools that produce fluent output absolutely require skepticism, but that’s not unique to AI.
Aviation didn't discard autopilot because pilots could become complacent - they built procedures around it.
The same principle probably applies here.
You're right that we can't rule out complacency and human error. And we have internal reviews precisely to account for complacency. Again, I'm intimately familiar with both the safety culture and the people involved; this is an unusual and recent development. But I suppose asking you to take my word for it might strain credulity. That is what it is.
I'd be inclined to agree with you more if it weren't for how widespread the smaller issues are. The general trend, among the old and young, is less actual knowledge of the job and more reliance on quick access to information that often isn't applied properly in context. It existed before AI, and has gotten worse with its introduction. Something about instant access to information seems to harm retention and application of that info. It's a pretty obvious trend for me, as part of my job is to ensure information is retained and applied properly.
Those procedures built around autopilot, along with other issues of flying more complicated modern aircraft, were dealt with by controlling how information flowed, how it was communicated, and the weight of authority it was given, often with human processes like Crew Resource Management. As I've said, the presentation of information absolutely changes how people understand and apply it. CRM helps because it prompts people to present information to each other in a way that facilitates better decision making and delegation in a crisis.
But autopilot has always been beneficial; its value was obvious right from the start. It reduces pilot fatigue on long-haul flights and helps keep air traffic in the right place. Pilot complacency was never really a worry, but malfunctions were.
In the end, it's not that it can't be done. We could adjust our processes to include LLMs simply because people think they're neat. It's just that there's no compelling evidence that it's better for distributing information, developing procedures, or teaching people how not to die.
Oh, I have no reason to doubt your word and lived experience.
And I don’t disagree with your broader point that presentation of information affects how people reason with it.
If you say "aviation learned that the hard way and built things like CRM around it", that seems plausible.
But that cuts both ways. If the problem is how information is presented and weighted, that suggests the solution is designing the workflow around the tool rather than assuming the tool itself is inherently harmful.
Most modern safety systems already assume humans won't retain everything and rely on structured procedures and cross-checks instead.
The real failure case you described wasn't that AI existed; it was that obviously impossible steps made it far enough down the pipeline that review had to catch them.
I completely agree that instant access to information can degrade understanding if it replaces thinking.
But that trend started long before LLMs. The question isn’t whether the tool exists, it’s whether we build processes that force people to reason about the output instead of accepting it because it looks polished.
Right. We can't fully blame the existence or even the use of AI. But given the way AI is often used, and the way my armchair studies of human nature tell me it will continue to be used, I think it will lead to more events like this. The trend of easy access and low retention did indeed start before LLMs, but they don't seem to be a remedy for it from what I can tell. At best they're neutral, and I'd argue they make it worse.
We could build processes to account for the failings of LLMs and the failings in how we use them (and frankly we probably will need to, because I doubt AI will be abandoned given the sheer volume of cash that has been dumped into it). Or we could look at existing methods, those we understand and have learned to work effectively with, and reapply them as needed.
My bet is that LLMs and genAI will exacerbate the trend of being info rich and knowledge poor, and the processes we have to create in order to safely and effectively apply it are going to be more costly than any efficiency we get out of adopting it. I could be wrong, but I'd bet you a six-pack of whatever you drink that I'm not. Collectable in five years, if Lemmy hasn't been replaced by LLMmy by then. I'll even ship international if need be. =)
You sure you want to take that bet? I keep getting accused of being half clanker already
https://codeberg.org/BobbyLLM/llama-conductor/src/branch/main/meme-test.md
(see link at end, but also the article itself may be salient)
Yup. I'll take the bet.
After all, your expectation of the impact of AI is arguably the better outcome for humanity, isn't it? I'm expecting a sharp increase in horrific industrial accidents and the slow but steady regression of human intellect until we're all mindless drones from sector 7-C. =P
That's a good bet to lose.
Besides, actually paying out on oddball, five year old bets is the kind of thing that made the pre-social media, pre-AI internet great, and I miss that.
Oof. So little faith in your fellow man :) What are we - clankering towards Bethlehem?
I'd like to believe that as a species, we're more Peter Parker ("brilliant but lazy") than Joker ("some men just want to watch the world burn")....but I'll never discount the wonder that is human apathy.
OTOH...things are broadly better in 2026 than 1906...but also worse in some ways...hmm
Wait, which side am I betting for again? And is SMOD (Sweet meteor of death) in the running?
Guilty as charged. I often wonder what effect dealing with quality control and safety has on my mentality. Much like first responders see a lot of people at their worst, I see a lot of them at their dumbest and laziest.
I think we're still at a net gain over where we were in 1906, but that's subjective. Most of us live longer and more comfortable lives, but that could change if we're not careful, and I don't think we're being particularly careful in this decade. I'm a bit pessimistic, but I don't see it as a bad thing. Back on aviation, the old saying is that it takes an optimist to invent the airplane, and it takes a pessimist to invent the parachute.
I'd rather keep meteors out of it. Some of the planet is quite pretty and whatever species takes over for us might appreciate the view.