this post was submitted on 15 Aug 2025
214 points (96.9% liked)

Technology

top 50 comments
[–] humanspiral@lemmy.ca 1 points 36 minutes ago

I've tested 8 LLMs on coding, using the J language and asking all of them to generate a chess "mate in x" solver.

Even the bad models were good at organizing code, had some understanding of chess, and were good at understanding the ideas in the prompts. The bad models were bad mostly on logic: not understanding indexing/amend on a table, not understanding proper function calling, or proper decomposition of arguments in J. Bad models included Copilot and OpenAI's 120B open-source model. Kimi K2 was OK. Sonnet 4 was the best. I've mostly used Qwen 3 245 for better free accessibility than Sonnet 4, and for the fact that it has a giant context that makes it think harder (slower) and better the more it's used on a problem. Qwen 3 did a good job writing a fairly lengthy chess position scoring function, then separating it into quick and medium versions, incorporating self-written library code, and recommending enhancements.
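
To give a sense of the task (an illustrative sketch, not the J code from my test): a mate-in-x solver is just a short forced-mate search. In Python, using the python-chess library, the core of it looks something like this:

```python
# Illustrative only: a tiny brute-force "mate in n" search.
# pip install python-chess
import chess

def mate_in(board: chess.Board, n: int) -> chess.Move | None:
    """Return a move that forces checkmate in at most n moves for the side to play, else None."""
    for move in list(board.legal_moves):
        board.push(move)
        if board.is_checkmate():
            board.pop()
            return move
        if n > 1 and not board.is_game_over():
            # The move only "works" if every opponent reply still runs into a mate in n-1.
            if all(_still_mates(board, reply, n - 1) for reply in list(board.legal_moves)):
                board.pop()
                return move
        board.pop()
    return None

def _still_mates(board: chess.Board, reply: chess.Move, n: int) -> bool:
    board.push(reply)
    found = mate_in(board, n) is not None
    board.pop()
    return found

# Back-rank mate in one: white plays Ra8#.
print(mate_in(chess.Board("6k1/5ppp/8/8/8/8/8/R6K w - - 0 1"), 1))
```

Real solvers prune far more aggressively, but this captures the kind of program the prompts asked for.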

There is a lot to get used to in working with LLMs, but the right ones can genuinely help with the code-writing process. That is, there exist code outputs which, even when wrong, provide a faster path to the objective than if that output did not exist. No matter how bad the code output, you are almost never dumber for having received it, unless perhaps you don't understand the language well enough to know it's bad.

[–] TankovayaDiviziya@lemmy.world 2 points 3 hours ago

I don't work in IT, but I do know you need creativity to work in the industry, something the current generation of LLMs/AI doesn't possess.

Linguists also dismiss LLMs in a similar vein, because LLMs can't grasp context. It's always funny to try being sarcastic or ironic with an LLM.

Soft skills and culture are what the current iteration of LLMs lacks. I do think there is still huge potential for AI development in the decades to come, but I want this AI bubble to burst as an "in your face" to these companies.

[–] TuffNutzes@lemmy.world 78 points 1 day ago (3 children)

The LLM worship has to stop.

It's like saying a hammer can build a house. No, it can't.

It's useful for pounding in nails and automating a lot of repetitive, boring tasks, but it's not going to build the house for you - architect it, plan it, validate it.

It's similar to the whole 3D printing hype. You can 3D print a house! No you can't.

You can 3D print a wall, maybe a window.

Then you have a skilled craftsman put it all together for you, ensure fit and finish, and essentially build the final product.

[–] frog_brawler@lemmy.world 2 points 2 hours ago (1 children)

You’re making a great analogy with the 3D printing of a house.

However, if we consider the 3D-printed house scenario, that skilled craftsman is now able to do things on his own that he would have needed a team for in the past. Most, if not all, of the less skilled members of that team are not getting any experience in the craft at that point. They’re no longer necessary when one skilled person can do things on their own.

What happens when the skilled and highly experienced craftsmen who use AI as a supplemental tool (and subsequently earn all the work) eventually retire, and there have been no juniors or mid-levels for a while? No one is really going to be qualified without having had exposure to the trade for several years.

[–] TuffNutzes@lemmy.world 1 points 34 minutes ago

Absolutely. This is a huge problem, and I've read about it from a number of sources. It will have a huge impact on engineering and information work.

Interestingly enough, a similar shortage occurred in the trades when information work was up and coming and the trades were shunned as a career path by many. Now we don't have enough plumbers and electricians. Tradespeople are now finding their skills in high demand and charging very high rates.

[–] dreadbeef@lemmy.dbzer0.com -1 points 3 hours ago* (last edited 3 hours ago) (3 children)

3D-printed concrete houses exist. Why can't you 3D print a house? Not the best metaphor lol

[–] surewhynotlem@lemmy.world 1 points 14 minutes ago

You don't like glass windows? Air conditioning? A door?

[–] toddestan@lemmy.world 3 points 1 hour ago* (last edited 1 hour ago) (1 children)

You can certainly 3D print a building, but can you really 3D print a house? Can it 3D print doors and windows that can open and close and be locked? Can it 3D print the plumbing and wiring and have it be safe and functional? Can it 3D print the foundation? What about bathroom fixtures, kitchen cabinets, and things like carpet?

It's actually not a bad metaphor. You can use a 3D printer to help with building a house, and to 3D print some of the fixtures and bits and pieces that go into the house. Using a 3D printer would automate a fair amount of the manual labor that goes into building a house today (at least how it is done in the US). But you're still going to need people who know what they are doing to put it all together and turn the building into a functional home. We're still a fair ways away from just being able to 3D print a house, just like we're a fair ways away from having an LLM write a large, complex piece of software.

[–] TuffNutzes@lemmy.world 1 points 33 minutes ago

Exactly this.

[–] Nalivai@lemmy.world 2 points 1 hour ago

No, they aren't. With enough setup and very specialized, expensive equipment, you can pour shitty concrete walls that will be way more expensive and worse than if you did it normally. That will give you 20% of the house, at best, and 20% of a not-very-good house at that.

[–] natecox@programming.dev 8 points 1 day ago (1 children)

I hate the simulated intelligence nonsense at least as much as you, but you should probably know about this if you’re saying you can’t 3d print a house: https://youtu.be/vL2KoMNzGTo

[–] TuffNutzes@lemmy.world 28 points 23 hours ago (6 children)

Yeah, I've seen that before, and it's basically what I'm talking about. Again, that's not "3D printing a house" as the hype would lead one to believe. It's extruding cement to build the walls around very carefully placed framing, heavily managed and coordinated by people, and finished with plumbing, electrical, etc.

It's cool that they can bring this huge piece of equipment to extrude cement to form some kind of wall. It's a neat proof of concept. I personally wouldn't want to live in a house that looked anything like that or was constructed that way. Would you?

[–] scarabic@lemmy.world -2 points 10 hours ago* (last edited 10 hours ago)

it's basically what I'm talking about

Well, a minute ago you were saying that AI worship is akin to saying

a hammer can build a house

Now you’re saying that a hammer is basically the same thing as a machine that can create a building frame unattended? Come on. You have a point to be made here but you’re leaning on the stick a bit too hard.

[–] black_flag@lemmy.dbzer0.com 104 points 1 day ago (1 children)

I think it's going to require a change in how models are built and optimized. Software engineering requires models that can do more than just generate code.

You mean to tell me that language models aren't intelligent? But that would mean all these people cramming LLMs in places where intelligence is needed are wasting their time?? Who knew?

Me.

[–] eager_eagle@lemmy.world 34 points 1 day ago (1 children)

I have a solution for that, I just need a small loan of a billion dollars and 5 years. #trustmebro

[–] black_flag@lemmy.dbzer0.com 12 points 1 day ago

Only one billion?? What a deal! Where's my checkbook!?

[–] isaaclyman@lemmy.world 22 points 1 day ago (12 children)

Clearly LLMs are useful to software engineers.

Citation needed. I don’t use one. If my coworkers do, they’re very quiet about it. More than half the posts I see promoting them, even as “just a tool,” are from people with obvious conflicts of interest. What’s “clear” to me is that the Overton window has been dragged kicking and screaming to the extreme end of the scale by five years of constant press releases masquerading as news and billions of dollars of market speculation.

I’m not going to delegate the easiest part of my job to something that’s undeniably worse at it. I’m not going to pass up opportunities to understand a system better in hopes of getting 30-minute tasks done in 10. And I’m definitely not going to pay for the privilege.

[–] frog_brawler@lemmy.world 2 points 2 hours ago* (last edited 2 hours ago)

I’m not a “software engineer” but a lot of people that don’t work within tech would probably call me one.

I’m in Cloud Engineering, but came from the sys/network admin and ops side of things rather than starting off in dev or anything like that.

Up until about 5 years ago, I really only knew PowerShell and a little bit of bash. I’ve gotten up to speed on a lot of things but never officially learned Python, JS, Go, or any other real development language that would be useful to me. I’ve spent way more time focusing on getting good with IaC, and probably more on the SRE side of things.

In my particular situation, LLMs are incredibly useful. It’s fair to say that I use them daily now. I’ve had them convert bash scripts to Python for me very quickly. I don’t know Python, but now that I’m able to look at a Python script next to my bash, I’m picking up on stuff a lot faster. I’m using Lambda way more often as a result.

Also, there’s a lot of mundane filling out forms shit that I delegate to an LLM. I don’t want to spend my time filling out a form that I know no one is actually going to read. F it, I’ll have the AI write a report for an AI. It’s dumb as shit, but that’s the world today.

[–] Phegan@lemmy.world 2 points 3 hours ago

I've only found two effective uses for them. Every time I tried them otherwise, they fell flat and took me longer than it would have taken to write the code myself.

The first was a greenfield personal project where I let code quality wane since I was the only person maintaining it, and wanted to test LLMs. The other was to write highly repetitive data tests where the model can simply type faster than me.

For anything that requires writing code that needs to be maintained by multiple people, or for systems older than 2 years, it has fallen completely flat. In cases like that I spend so much time telling the LLM it is doing it wrong that it would have taken me less time to write the code in the first place. In 95% of cases, I am still faster than an LLM at solving a problem and writing the code.

[–] skisnow@lemmy.ca 5 points 9 hours ago

I've found them useful, sometimes, but nothing like a fraction of what the hype would suggest.

They're not adequate replacements for code reviewers, but getting an AI code review does let me occasionally fix a couple of blunders before I waste another human's time with them.

I've also had the occasional bit of luck with "why am I getting this error" questions, where it saved me 10 minutes of digging through the code myself.

"Create some test data and a smoke test for this feature" is another good timesaver for what would normally be very tedious drudge work.

What I have given up on is "implement a feature that does X" questions, because it invariably creates more work than it saves. Companies selling "type in your app idea and it'll write the code" solutions are snake-oil salesmen.

[–] jj4211@lemmy.world 2 points 14 hours ago

I have been using it a bit, still can't decide if it is useful or not though... It can occasionally suggest a blatantly obvious couple of lines of code here and there, but along the way I get inundated with annoying suggestions that are useless and I haven't gotten used to ignoring them.

I mostly work in a niche area the LLMs seem broadly clueless about, and prompt-driven code is almost always useless except when dealing with super-boilerplate usage of a common library.

I do know some people that deal with amazingly mundane and common functions and they are amazed that it can pretty much do their jobs, but they never really impressed me before anyway and I wondered how they had a job...

[–] Feyd@programming.dev 12 points 21 hours ago

I don't use one, and my coworkers that do use them are very loud about it, and worse at their jobs than they were a year ago.

[–] frezik@lemmy.blahaj.zone 32 points 1 day ago (10 children)

To those who have played around with LLM code generation more than me, how are they at debugging?

I'm thinking of Kernighan's Law: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." If vibe coding reduces the complexity of writing code by 10x, but debugging remains just as difficult as before, then Kernighan's Law needs to be updated to say debugging is 20x as hard as vibe coding. Vibe coders have no hope of bridging that gap.

[–] frog_brawler@lemmy.world 1 points 2 hours ago

How are they at debugging? In a silo, they’re shit.

I’ve been using one LLM to debug the other this past week for a personal project, and it can be a bit tedious sometimes, but it eventually does a decent enough job. I’m pretty much vibe coding things that are a bit outside my immediate knowledge and skill set, but I know how they’re supposed to work. For example, I’ve got some Python scripts using Rekognition to scan photos for porn or other explicit stuff before they get sent to an S3 bucket. After that happens, there’s now a dashboard that gives me results on how many images were scanned and then marked as either acceptable or flagged as inappropriate. Past a threshold of too many inappropriate images being sent in, it’ll shadowban the sender from submitting any more dick pics.
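
Stripped way down, the moderation gate looks roughly like this (a simplified sketch, not my exact scripts; the bucket name and threshold are placeholders, and the dashboard and shadowban bookkeeping aren’t shown):

```python
# Simplified sketch of the pre-upload moderation gate.
# Assumes AWS credentials are configured; bucket name and threshold are placeholders.
import boto3

rekognition = boto3.client("rekognition")
s3 = boto3.client("s3")

def is_acceptable(image_bytes: bytes, min_confidence: float = 80.0) -> bool:
    """True if Rekognition finds no moderation labels (nudity, violence, etc.) above the threshold."""
    resp = rekognition.detect_moderation_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    return not resp["ModerationLabels"]

def handle_upload(image_bytes: bytes, key: str, bucket: str = "example-photo-bucket") -> str:
    """Store acceptable images in S3; everything else gets flagged for the shadowban counter."""
    if is_acceptable(image_bytes):
        s3.put_object(Bucket=bucket, Key=key, Body=image_bytes)
        return "accepted"
    return "flagged"
```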

For someone who’s never taken a coding course, I’m relatively happy with the results I’m getting so far. Granted, this may be small potatoes for someone with an actual development background, but as someone who’s been working adjacent to those folks for several years, I’m happy with the output.

[–] very_well_lost@lemmy.world 16 points 1 day ago* (last edited 1 day ago) (3 children)

The company I work for has recently mandated that we must start using AI tools in our workflow and is tracking our usage, so I've been experimenting with it a lot lately.

In my experience, it's worse than useless when it comes to debugging code. The class of errors that it can solve is generally simple stuff like typos and syntax errors — the sort of thing that a human would solve in 30 seconds by looking at a stack trace. The much more important class of problem, errors in the business logic, it really really sucks at solving.

For those problems, it very confidently identifies the wrong answer about 95% of the time. And if you're a dev who's desperate enough to ask AI for help debugging something, you probably don't know what's wrong either, so it won't be immediately clear if the AI just gave you garbage or if its suggestion has any real merit. So you go check and manually confirm that the LLM is full of shit, which costs you time... then you go back to the LLM with more context and ask it to try again. Its second suggestion will sound even more confident than the first ("Aha! I see the real cause of the issue now!"), but it will still be nonsense. You waste more time ruling out the second suggestion, then go back to the AI to scold it for being wrong again.

Rinse and repeat this cycle enough times until your manager is happy you've hit the desired usage metrics, then go open your debugging tool of choice and do the actual work.

[–] HarkMahlberg@kbin.earth 9 points 23 hours ago (1 children)

we must start using AI tools in our workflow and is tracking our usage

Reads to me as "Please help us justify the very expensive license we just purchased and all the talented engineers we just laid off."

I know the pain. Leadership's desperation is so thick you can smell it. They got FOMO'd, now they're humiliated, so they start lashing out.

[–] frog_brawler@lemmy.world 2 points 2 hours ago* (last edited 2 hours ago)

Funny enough, the AI shift is really just covering for the over-hiring mistakes of 2021. They can’t admit they fucked up by hiring too many people during Covid, so they’re using AI as the scapegoat. We all know it’s not able to actually replace people yet, but that’s happening anyway.

There won’t be any immediate ramifications, we’ll start to see that in probably 12-18 months or so. It’s just another form of kicking the can down the road.

[–] HubertManne@piefed.social 9 points 1 day ago (1 children)

Maybe it's just me, but I find typos the most difficult because my brain can easily see them as correct, so the whole code looks correct. It's like the way you can take the vowels out of sentences and people can still immediately read them.

[–] ganryuu@lemmy.ca 1 points 3 hours ago

That's probably why they talked about looking at a stack trace: you'll see immediately that you made a typo in a variable's name or a language keyword when compiling or executing.

[–] trublu@lemmy.dbzer0.com 7 points 23 hours ago

As it seems to be the case in all of these situations, AI fails hard at tasks when compared to tools specifically designed for that task. I use Ruff in all my Python projects because it formats my code and finds (and often fixes) the kind of low complexity/high probability problems that are likely to pop up as a result of human imperfection. It does it with great accuracy, incredible speed, using very little computing resources, and provides levels of safety in automating fixes. I can run it as an automation step when someone proposes code changes, adding all of 3 or 4 seconds to the runtime. I can run it on my local machine to instantly resolve my ID10T errors. If AI can't solve these problems as quickly, and if it can't solve anything more complicated reliably, I don't understand why it would be a tool I would use.

[–] Ledivin@lemmy.world 24 points 1 day ago (1 children)

They're not good at debugging. The article is pretty spot on, IMO - they're great at doing the work, but you are still the brain. You're still deciding what to do, and maybe 50% of the time how to do it; you're just not executing at the lowest level anymore. Similar for debugging: it's not an exercise at the lowest level, and it needs you to run it.

[–] Pechente@feddit.org 14 points 1 day ago (1 children)

Definitely not good. Sometimes they can solve issues but you gotta point them in the direction of the issue. Other times they write hacky workarounds that do the job for the moment but crash catastrophically with the next major dependency update.

[–] HarkMahlberg@kbin.earth 12 points 1 day ago (13 children)

I saw an LLM override the casting operator in C#. An evangelist would say "genius! what a novel solution!" I said "nobody at this company is going to know what this code is doing 6 months from now."

It didn't even solve our problem.
