this post was submitted on 23 Nov 2025
37 points (82.5% liked)

Technology

In tweaking its chatbot to appeal to more people, OpenAI made it riskier for some of them. Now the company has made its chatbot safer. Will that undermine its quest for growth?

top 6 comments
[–] Deestan@lemmy.world 41 points 4 days ago (3 children)

This reads like OpenAI's fanfic of what happened, retconning decisions they didn't make, things they didn't (couldn't!) do, and thoughts that never occurred to them. All of it implying that becoming infinitely better is not only possible, but right there for the taking.

For the one in April, engineers created many new versions of GPT-4o — all with slightly different recipes to make it better at science, coding and fuzzier traits, like intuition.

Citation needed.

OpenAI did not already have this test. An OpenAI competitor, Anthropic, the maker of Claude, had developed an evaluation for sycophancy

This reality does not exist: Claude is trying to lick my ass clean every time I ask it a simple question, and while sycophantic language can be toned down, coming up with a believable, positive-sounding answer to whatever the user brings is the foundational core of LLMs.

“We wanted to make sure the changes we shipped were endorsed by mental health experts,” Mr. Heidecke said.

As soon as they found experts willing to say something other than "don't make a chatbot". They now have a sycophantically motivated system with an ever-growing list of sticky notes on its desk - "if sleep deprivation then alarm", "if suicide then alarm", "if ending life then alarm", "if stop living then alarm" - hoping they have enough of them to catch the most obvious attempts.

The same M.I.T. lab that did the earlier study with OpenAI also found that the new model was significantly improved during conversations mimicking mental health crises.

The study was basically rigged: it used 18 known, identified crisis chat logs from ChatGPT - meaning exactly the set of cases OpenAI had already hard-coded "plz alarm" for - plus thousands of "simulated mental health crises" generated by FUCKING LLMs, meaning they only tested whether ChatGPT can identify mental health problems in texts where it had written down its own understanding of what a mental health crisis looks like. For fuck's sake, of course it did perfectly at guessing its own card.

TL;DR: bullshit damage control.

[–] Aatube@lemmy.dbzer0.com 1 points 5 hours ago* (last edited 5 hours ago)

Citation needed.

This is a New York Times article. By default, the New York Times itself is the citation, just as with any other mainstream outlet. And even then, this specific article does attribute it:

To understand how this happened, The New York Times interviewed more than 40 current and former OpenAI employees — executives, safety engineers, researchers. Some of these people spoke with the company’s approval, and have been working to make ChatGPT safer. Others spoke on the condition of anonymity because they feared losing their jobs.

Claude is trying to lick my ass clean every time I ask it a simple question

The article only said they made a test, not that their models weren't failing it - and failing it happens to be what the linked paper reports. This isn't new: LLMs also consistently failed a certain intelligence test devised around that same time period until ~2024.

As soon as they found experts who were willing to say something else than "don't make a chatbot".

That's 55%: https://humanfactors.jmir.org/2025/1/e71065

[–] panda_abyss@lemmy.ca 5 points 4 days ago

The hard-coding here is basically fine-tuning.

They generate a set of example cases and pair each prompt with good and bad responses, then update the model weights until it does well on those cases.

So they only do this with cases they’ve seen, and they can’t really say how well it does with cases they haven’t.

Having these in their fine-tuning dataset will juice the results, but hopefully it also means the model actually identifies these issues correctly.
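
A minimal sketch of the preference-pair training idea described above - not OpenAI's actual pipeline, just an illustration. The model, loss, and data here (`ToyScorer`, `preference_loss`, random token ids) are toy stand-ins:

```python
import torch
import torch.nn as nn

# Toy scorer: maps token ids for (prompt + response) to a scalar score.
VOCAB, DIM = 1000, 64

class ToyScorer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(VOCAB, DIM)   # bag-of-tokens embedding
        self.head = nn.Linear(DIM, 1)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        return self.head(self.embed(token_ids)).squeeze(-1)

def preference_loss(good_scores, bad_scores):
    # Pairwise logistic loss: push the "good" response above the "bad" one.
    return -torch.nn.functional.logsigmoid(good_scores - bad_scores).mean()

model = ToyScorer()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Hypothetical curated cases: token ids for (prompt + good) and (prompt + bad).
good_batch = torch.randint(0, VOCAB, (8, 32))
bad_batch = torch.randint(0, VOCAB, (8, 32))

for step in range(100):
    loss = preference_loss(model(good_batch), model(bad_batch))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The model only ever gets pushed toward the curated pairs, which is the point above: it tells you nothing about cases outside that set.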

The other thing is that a lot of the raw data in these systems is generated by cheap workers in third-world countries who will not have a good appreciation for mental health.

[–] iopq@lemmy.world -5 points 3 days ago (1 children)

There are things chatbots are useful for, like writing short scripts to automate some tasks. I had ChatGPT write most of a Haskell script that enables the tproxy globally and writes to a .env file so the other services know the IP of the proxy and restart on change.

I also wrote a script to change the IP of my proxy and update the DNS record. The tproxy software does its initial lookup against the authoritative DNS server, to avoid having to wait for the TTL to expire.

Doing this by hand was annoying and error-prone.
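
A rough sketch of that second script, in Python rather than the Haskell the comment mentions, under assumed placeholders: the .env path, the record name `proxy.example.com`, the nameserver `ns1.example.com`, and the TSIG key file are all made up, and `nsupdate` (from BIND) is just one of several ways to push the dynamic DNS update:

```python
import subprocess
import sys
from pathlib import Path

ENV_FILE = Path(".env")                 # placeholder path read by the other services
RECORD = "proxy.example.com"            # placeholder record name
AUTH_SERVER = "ns1.example.com"         # placeholder authoritative nameserver
TSIG_KEY = "/etc/bind/ddns.key"         # placeholder TSIG key file for nsupdate
TTL = 60

def write_env(new_ip: str) -> None:
    """Rewrite PROXY_IP in the .env file so other services pick up the new address."""
    lines = ENV_FILE.read_text().splitlines() if ENV_FILE.exists() else []
    lines = [l for l in lines if not l.startswith("PROXY_IP=")]
    lines.append(f"PROXY_IP={new_ip}")
    ENV_FILE.write_text("\n".join(lines) + "\n")

def update_dns(new_ip: str) -> None:
    """Replace the A record on the authoritative server via nsupdate."""
    commands = "\n".join([
        f"server {AUTH_SERVER}",
        f"update delete {RECORD} A",
        f"update add {RECORD} {TTL} A {new_ip}",
        "send",
    ]) + "\n"
    subprocess.run(["nsupdate", "-k", TSIG_KEY], input=commands, text=True, check=True)

if __name__ == "__main__":
    new_ip = sys.argv[1]    # new proxy IP passed on the command line
    write_env(new_ip)
    update_dns(new_ip)
```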

[–] Deestan@lemmy.world 4 points 3 days ago

Sir, this is a Wendy's

[–] Darkcoffee@sh.itjust.works 7 points 4 days ago

Ran to the bank?