this post was submitted on 14 May 2026
236 points (95.7% liked)
I started using LLM tools recently after taking a new job where a lot of people use them. I've discovered they're actually fairly helpful, not only for explanations but in two other respects.
All that said, I'm honestly pretty impressed by how well it works. I've mostly been using Claude, and damn, it's honestly pretty competent. I had it build me a helper Python GUI program to test some stuff (I'm not a UI/high-level engineer like that; I'm an FPGA engineer), and it did a decent job. It definitely needed a good amount of massaging and guidance. Still, I can see the appeal, and I think it's a slippery slope; I need to make sure I stay disciplined about not letting it do everything.
One trap is trusting it as a way to accommodate unreasonable schedule pressure.
Sure, this thing looks like it works; hell, it probably does work. But do you really want to launch a "probably works" product? If your management does, consider shopping around for a raise or promotion under different management. It's never easy to move, but if you're moving on your own terms, you can often make the effort worth your while.
Another note: I find LLMs to be wickedly detail-oriented code reviewers. They'll point out the tiniest discrepancies and edge cases, and what they (Claude, at least) report is usually real. That doesn't mean they find everything wrong on the first pass, but once you've addressed everything from the first pass, you can make a second pass, a third, and so on, each time with a different focus: documentation complete? implementation functions as intended? technical debt? test coverage? security issues? maintainability? documentation in sync with implementation? If you address all the findings after each review cycle (and addressing a finding can mean clarifying a requirement so the reviewer relaxes about unimportant aspects), the findings eventually slow down to only ridiculously unimportant things.
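The multi-pass review loop described above can be sketched roughly like this. Note that `ask_llm` is a hypothetical stand-in for whatever model call your tooling actually provides, and the focus list is just the examples from this comment:

```python
# Sketch of the multi-pass review idea: run focused review passes,
# address findings between passes, stop when a full round comes back clean.
# ask_llm is a hypothetical placeholder, not a real API.

REVIEW_FOCUSES = [
    "documentation completeness",
    "implementation matches intent",
    "technical debt",
    "test coverage",
    "security issues",
    "maintainability",
    "documentation in sync with implementation",
]

def ask_llm(prompt: str) -> list[str]:
    """Placeholder: a real model call would return a list of findings."""
    return []  # stubbed out so the sketch runs standalone

def review_until_quiet(diff: str, max_rounds: int = 5) -> int:
    """Run review rounds until one produces no findings; return rounds used."""
    rounds = 0
    for _ in range(max_rounds):
        rounds += 1
        findings: list[str] = []
        for focus in REVIEW_FOCUSES:
            findings += ask_llm(f"Review this change for {focus}:\n{diff}")
        if not findings:
            break  # findings dried up, stop iterating
        # in practice you would address the findings and update the diff here
    return rounds

print(review_until_quiet("example diff"))
```

With the stubbed `ask_llm` this terminates after one round; with a real model the interesting part is how quickly the findings taper off.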
A thing I found quite amusing about the AI agents I've toyed with is that they include a step where they review their own changelist, usually switching to a different "persona" when they write it so that they're not seeing it as "their own" code. It's funny reading the critiques and compliments they give the "other agent" whose changes they're checking.
I haven't seen this feature yet, but it might be a good enhancement for the harness to literally use a different model for the code review than the one that wrote the code in the first place. If Claude wrote the code, have GPT do the review, and vice versa. I wouldn't be surprised if the feature exists and I just haven't spotted it yet, though; things change fast.
I use Cursor for work (Claude Code at home), and Cursor lets you select your model. I've dabbled a bit with using GPT to review Claude's code, but haven't found it dramatically better than just prompting Claude to "wear the reviewer hat now."
Yeah, I wouldn't use a framework that didn't let you select the base model. I'm just thinking about having it automatically switch to a different one during the review phase. It's not as popular a coding agent these days, but I like using Google's Antigravity, and it can be told to run the sequence "plan -> write documentation -> implement the plan -> run unit tests -> do a code review" automatically, without needing a prompt at each step. That's where it would be nice to have it switch automatically for the review.
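The per-phase switch described above amounts to a tiny routing rule. This is a sketch only; the model names and `pick_model` helper are hypothetical, not any real agent's API:

```python
# Sketch: route each pipeline phase to a model, using a different
# model lineage for the code-review phase than for authoring.
# Phase names are from the comment above; model names are placeholders.

PHASES = [
    "plan",
    "write documentation",
    "implement the plan",
    "run unit tests",
    "do a code review",
]

def pick_model(phase: str, author: str = "claude", reviewer: str = "gpt") -> str:
    """Use the reviewer model only for the review phase."""
    return reviewer if phase == "do a code review" else author

def run_pipeline() -> list[tuple[str, str]]:
    """Return (phase, model) pairs showing which model handles each step."""
    return [(phase, pick_model(phase)) for phase in PHASES]

for phase, model in run_pipeline():
    print(f"{phase}: {model}")
```

The point is just that the routing decision is trivial once the harness exposes a hook at each phase boundary; the hard part is the harness supporting it at all.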
"Wear the reviewer hat now" does seem to work quite well with the same model, but if more models from different lineages are available it just seems like the right thing to do to switch to another one.