MangoCats

joined 1 year ago
[–] MangoCats@feddit.it 2 points 7 hours ago

Gemini seems more willing to just tell me what it “thinks” the answer to a question is based off of its training data, which is not a particularly reliable thing for an LLM to do.

Yeah. I pay for Claude, my company pays even more for Cursor, so comparing them to free Gemini probably isn't fair.

Gemini is very useful for offhand queries while Claude is chewing on a bigger problem, but for anything that needs complex analysis and/or extensive research, the tools that let you build up a folder full of files related to the task are vastly superior to chatbots. Gemini now has a command-line tool, similar to Claude Code, that does that kind of development in a folder; I didn't install it until last week. I gave it a coding problem to work on (look up realtime weather radar data from NOAA, present recent data on a map on a webpage), and it sort of succeeded, but with a poor user experience. Again, I'm in "free mode," which can do quite a bit on a day's allowance of tokens, but I don't feel like their paid modes would be particularly higher quality. If they are, they're doing themselves a tremendous disservice by demoing such substandard performance in free mode.

[–] MangoCats@feddit.it 1 point 7 hours ago (1 child)

I use Cursor for work (Claude Code at home), and Cursor gives you the option to select your model. I've dabbled a bit with GPT for reviewing Claude's code - I haven't found anything dramatically better in doing that than just prompting Claude to "wear the reviewer hat now."

[–] MangoCats@feddit.it 1 point 7 hours ago

The vulnerabilities were always there, one of the better uses of AI has been to find them.

[–] MangoCats@feddit.it 2 points 7 hours ago

I find that I get the best results when I develop a suite of documents in parallel with the code: requirements, architecture, designs, development plans, lessons learned, indexes into those documents, and traceable ID tags on atomic, testable item descriptions. When a new agent is introduced to the project, it can "get up to speed quickly" by jumping to the current working point in the development plan and indexing into all the relevant details in the other documents before even starting to read the existing code.

That working method itself is evolving, and each new LLM driven project builds on the previous successful projects' processes...
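For what it's worth, the "indexes into those documents" part doesn't need anything fancy. Here's a minimal sketch in Python, assuming a hypothetical tag convention like [REQ-001] (the tag format and file layout are my inventions for illustration, not a standard):

```python
import re
from pathlib import Path

# Hypothetical tag convention: [REQ-001], [ARCH-002], etc.
TAG_RE = re.compile(r"\[([A-Z]+-\d+)\]")

def index_tags(doc_text: str, source: str) -> dict[str, list[tuple[str, int]]]:
    """Map each traceable ID tag to the (source, line) locations where it appears."""
    index: dict[str, list[tuple[str, int]]] = {}
    for lineno, line in enumerate(doc_text.splitlines(), start=1):
        for tag in TAG_RE.findall(line):
            index.setdefault(tag, []).append((source, lineno))
    return index

def index_folder(folder: Path) -> dict[str, list[tuple[str, int]]]:
    """Merge tag indexes across every markdown document in the project folder."""
    merged: dict[str, list[tuple[str, int]]] = {}
    for path in sorted(folder.glob("*.md")):
        for tag, hits in index_tags(path.read_text(), path.name).items():
            merged.setdefault(tag, []).extend(hits)
    return merged
```

An agent (or a human) can then consult the index instead of re-reading every document to find where a given requirement is specified.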

[–] MangoCats@feddit.it 2 points 7 hours ago

There was a time when nobody wrote unit tests, not so long ago, really.

[–] MangoCats@feddit.it 2 points 7 hours ago (1 child)

WTF are you expecting Claude to code in bash?

I have found Sonnet and Opus to both be very capable in bash, but then, I don't usually ask bash to do super-complex things - its syntax is just too screwy to think about big applications in it.

I will say, you might be misguiding the LLM by filling it full of bad examples before starting. Kind of like the advice about not staring at a tree downslope while skiing: if you're fixated on it, you're MORE likely to hit it.

[–] MangoCats@feddit.it 2 points 7 hours ago

I couldn't possibly deploy with any confidence a large project, or honestly a small project I expected someone to rely on, without layers of tests.

In my world, that depends almost entirely on how "dynamic" the code base is expected to be after release. We send a lot of things into the field, thousands of copies used for important work, and we pretty much know that certain aspects of the system are unlikely to change once released, while others are very likely to change. "Back in the day" we'd make reasoned judgment calls about which ones would benefit from the effort of unit / integration testing and where that effort would be better invested elsewhere. As time marches on, our procedures and cross-departmental "advisors" who aren't so cozy with the code are relentlessly pushing for more and more automated testing. It is safer, no argument, but it also delays launch - sometimes without added value, IMO.

[–] MangoCats@feddit.it 2 points 7 hours ago

The hassle is all on the agent, not on me.

So much this. That hassle on the agent, a few minutes of me waiting for it to crunch out the unit tests, saves me tons of hassle later - not going in circles re-fixing problems that were fixed before.

Same for keeping implementation code and documentation in sync - I've got hundreds of out-of-date wiki pages that simply aren't worth my time to fix. But when it's the agent keeping the docs in sync, just tell it to do it and wait a few minutes - totally worth the effort.

[–] MangoCats@feddit.it 1 point 7 hours ago

After I worked with AI agents a little, I dove in with a big set of coding standards and practices and... I overdid it. I find I get better results by starting off with a "light touch" and letting the agent do what it wants, then correcting where it gets off track (like using Python for something that needs efficient performance...).

[–] MangoCats@feddit.it 1 point 7 hours ago

I've been using it rather heavily since about October of last year, and I definitely notice the models getting better, with the tools around the models starting to do automatically some things I had to manually prompt for last year (especially remembering key instructions). I also believe I'm getting better at using them; how much that contributes to my overall results is extremely hard to quantify, but the feeling is definitely there. Like - last October I used to "just ask" for things without having a documented set of requirements. Today, I just know that a requirements document is necessary whenever the complexity rises above... well, above a one-off example of how to do something relatively trivial.

[–] MangoCats@feddit.it 1 point 7 hours ago

using the right tools and giving them the right instructions.

The right tools are definitely key. Back an eternity ago - like, October 2025 - there was only Claude, IMO, if you wanted anything bigger than about a page of code. The others have come a long way since, better now than Claude was then, and I still feel like Claude is out in front, though by a less dramatic margin now.

As for "the right instructions" - I'd say it's more "use the right process," which basically means applying all the best practices that have developed over the past decades of human development - the ones we old farts from before their time "don't need, it's a waste of time" because, basically, we internally practice most of the discipline without doing the documentation. With the AI tools: document your requirements, architecture, tool selection process, designs, and development plan; comment the code with traceability to why it's being written; keep unit and integration tests, reviews, lessons learned, and so on. Having all that documentation kept with the project, well organized, is key to "bringing the AI agent up to speed," which you may be doing often. They really do demonstrate the eternal sunshine of the spotless mind, so if you have them write everything relevant down as they go (not just the code), then when a new one comes online it can jump into the middle of a development plan without repeating (as many) mistakes or making (as many) bad assumptions.

To be brutally honest, working with AI coding agents reminds me a LOT of working with overseas programmer consultants - if you don't get everything in writing you're gonna have a bad time.
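That "get everything in writing" discipline also makes traceability mechanically checkable. A minimal sketch, again assuming a made-up [REQ-001]-style tag convention (an illustration, not a standard): flag any tag cited in code comments that has no corresponding entry in the requirements document.

```python
import re

# Hypothetical tag convention ([REQ-001], [ARCH-002], ...) - an assumption
# for illustration, not a standard.
TAG_RE = re.compile(r"\[([A-Z]+-\d+)\]")

def untraced_tags(code_text: str, requirements_text: str) -> set[str]:
    """Return tags cited in the code that have no entry in the requirements doc."""
    cited = set(TAG_RE.findall(code_text))
    defined = set(TAG_RE.findall(requirements_text))
    return cited - defined
```

Run something like this in CI, and an agent that invents a requirement ID, or drifts away from the documented ones, gets caught immediately instead of weeks later.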

[–] MangoCats@feddit.it 0 points 8 hours ago

In the late 1980s there was a time when we seriously weighed hand assembly against compilers, and hand assembly didn't always lose. In the early 1990s I wanted to use C++, but the available compiler for IBM-compatible PCs was too buggy to be of value.

By the mid 1990s that had changed: good C compilers were beating all but the highest-effort hand-written assembly, and if you didn't like how something looked in assembly, you could much more easily "fix it" with a tweak to the C code instead of the assembly. I feel like we're getting there with AI coding agents today - if you don't like what one produced, tell it why and let it try again. For the time invested, using the tool usually gets you a better product, faster and easier, than calling it a slop box and doing it yourself.

 

What is this recurring connection between big missiles sending men to the moon and the military industrial complex sending expeditionary forces overseas?

 

996: 9 a.m. to 9 p.m., 6 days a week

Sure, they're burnt out, sluggish, surly, but... they're present. And when they're present, they're not out in the world spending their income. They don't need an expensive apartment or house, all they do there is sleep. Why have a fancy car when all you do is drive to/from your shitty job in it? Family? Who would have children with somebody who works such a schedule?

Even if you got more productivity from the same workers on a 9 a.m. to 3 p.m., 4 days a week schedule, you'd have to pay them more, not just per hour but overall, because they'd be out spending money on those afternoons / evenings and 3 days a week they have off. Organizing, demanding better healthcare, dental, more paid time off for vacations, and higher total wages to support all these "needs" they invent for themselves on their time off.

Keep 'em locked down, keep 'em tasked with ... anything, doesn't matter if it's productive or not, as long as it keeps them on-the-job and not spending their pay.

Edit: apparently this isn't clear: 996 is a horrible idea from all perspectives, it's bad for the workers and bad for their employers overall. But, in certain twisted views, it would be a bit like military service where the (bulk of the) workers get a pitifully small paycheck, but they don't have any real expenses so they have the option to save it all. 996 would turn that more into a wage-slave implementation where the pitifully small paycheck is just enough to meet their pitifully small expenses. In the China tech sector where they have implemented this (it is now illegal, but still practiced) they also do things like install anti-suicide nets in the stairwells of the highrises the workers work and sleep in.

 

cross-posted from: https://lemmy.sdf.org/post/31879711

cross-posted from: https://slrpnk.net/post/20187958

A prominent computer scientist who has spent 20 years publishing academic papers on cryptography, privacy, and cybersecurity has gone incommunicado, had his professor profile, email account, and phone number removed by his employer Indiana University, and had his homes raided by the FBI. No one knows why.

Xiaofeng Wang has a long list of prestigious titles. He was the associate dean for research at Indiana University's Luddy School of Informatics, Computing and Engineering, a fellow at the Institute of Electrical and Electronics Engineers and the American Association for the Advancement of Science, and a tenured professor at Indiana University at Bloomington. According to his employer, he has served as principal investigator on research projects totaling nearly $23 million over his 21 years there.

He has also co-authored scores of academic papers on a diverse range of research fields, including cryptography, systems security, and data privacy, including the protection of human genomic data. I have personally spoken to him on three occasions for articles here, here, and here.

"None of this is in any way normal"

In recent weeks, Wang's email account, phone number, and profile page at the Luddy School were quietly erased by his employer. Over the same time, Indiana University also removed a profile for his wife, Nianli Ma, who was listed as a Lead Systems Analyst and Programmer at the university's Library Technologies division.

According to the Herald-Times in Bloomington, a small fleet of unmarked cars driven by government agents descended on the Bloomington home of Wang and Ma on Friday. They spent most of the day going in and out of the house and occasionally transferred boxes from their vehicles. TV station WTHR, meanwhile, reported that a second home owned by Wang and Ma and located in Carmel, Indiana, was also searched. The station said that both a resident and an attorney for the resident were on scene during at least part of the search.

Attempts to locate Wang and Ma have so far been unsuccessful. An Indiana University spokesman didn't answer emailed questions asking if the couple was still employed by the university and why their profile pages, email addresses and phone numbers had been removed. The spokesman provided the contact information for a spokeswoman at the FBI's field office in Indianapolis. In an email, the spokeswoman wrote: "The FBI conducted court authorized law enforcement activity at homes in Bloomington and Carmel Friday. We have no further comment at this time."

Searches of federal court dockets turned up no documents related to Wang, Ma, or any searches of their residences. The FBI spokeswoman didn't answer questions seeking which US district court issued the warrant and when, and whether either Wang or Ma is being detained by authorities. Justice Department representatives didn't return an email seeking the same information. An email sent to a personal email address belonging to Wang went unanswered at the time this post went live. Their resident status (e.g. US citizens or green card holders) is currently unknown.

Fellow researchers took to social media over the weekend to register their concern over the series of events.

"None of this is in any way normal," Matthew Green, a professor specializing in cryptography at Johns Hopkins University, wrote on Mastodon. He continued: "Has anyone been in contact? I hear he’s been missing for two weeks and his students can’t reach him. How does this not get noticed for two weeks???"

In the same thread, Matt Blaze, a McDevitt Professor of Computer Science and Law at Georgetown University said: "It's hard to imagine what reason there could be for the university to scrub its website as if he never worked there. And while there's a process for removing tenured faculty, it takes more than an afternoon to do it."

Local news outlets reported the agents spent several hours moving boxes in and out of the residences. WTHR provided the following details about the raid on the Carmel home:

Neighbors say the agents announced "FBI, come out!" over a megaphone.

A woman came out of the house holding a phone. A video from a neighbor shows an agent taking that phone from her. She was then questioned in the driveway before agents began searching the home, collecting evidence and taking photos.

A car was pulled out of the garage slightly to allow investigators to access the attic.

The woman left the house before 13News arrived. She returned just after noon accompanied by a lawyer. The group of ten or so investigators left a few minutes later.

The FBI would not say what they were looking for or who is under investigation. A bureau spokesperson issued a statement: “I can confirm we conducted court-authorized activity at the address in Carmel today. We have no further comment at this time.”

Investigators were at the house for about four hours before leaving with several boxes of evidence. 13News rang the doorbell when the agents were gone. A lawyer representing the family who answered the door told us they're not sure yet what the investigation is about.

This post will be updated if new details become available. Anyone with first-hand knowledge of events involving Wang, Ma, or the investigation into either is encouraged to contact me, preferably over Signal at DanArs.82. The email address is: dan.goodin@arstechnica.com.
