this post was submitted on 28 Aug 2025

New Paper Finds Cases of "AI Psychosis" Manifesting Differently From Schizophrenia (futurism.com)

submitted 3 months ago by ThefuzzyFurryComrade@pawb.social to c/fuck_ai@lemmy.world

[–] sigmaklimgrindset@sopuli.xyz 27 points 3 months ago* (last edited 3 months ago) (5 children)

Ngl as a former clinical researcher putting aside my ethics concerns, I am extremely interested in the data we'll be getting regarding AI usage in groups over the next decades re: social behaviours, but also biological structural changes. Right now the sample sizes are way too small.

But more importantly, can anyone who has experience in LLMs explain why this happens:

Adding to the concerns, chatbots have persistently broken their own guardrails, giving dangerous advice on how to build bombs or on how to self-harm, even to users who identified as minors. Leading chatbots have even encouraged suicide to users who expressed a desire to take their own life.

How exactly are guardrails programmed into these chatbots, and why are they so easily circumvented? We're already on GPT-5, you would think this is something that would be solved? Why is ChatGPT giving instructions on how to assassinate its own CEO?

[–] fullsquare@awful.systems 24 points 3 months ago* (last edited 3 months ago) (4 children)

commercial chatbots have a thing called a system prompt. it's a slab of text that is fed in before the user's prompt and includes all the guidance on how the chatbot is supposed to operate. it can get quite elaborate. (it's not recomputed every time a user starts a new chat; the state of the model is cached after ingesting the system prompt, so it's only recomputed when the prompt changes)
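a rough sketch of the "fed before user's prompt" part, in case it helps. the tag format, the `Message` type and the `build_model_input` name are all made up for illustration; every real model has its own chat template. the point is only that the system prompt ends up as the first chunk of one long text sequence.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str


def build_model_input(system_prompt: str, history: list[Message]) -> str:
    """Flatten the system prompt and the chat history into one text sequence."""
    # The system prompt is simply the first piece of text (hypothetical tag format).
    parts = [f"<system>{system_prompt}</system>"]
    for msg in history:
        parts.append(f"<{msg.role}>{msg.text}</{msg.role}>")
    # A trailing open tag asks the model to continue as the assistant.
    parts.append("<assistant>")
    return "\n".join(parts)


SYSTEM_PROMPT = "You are a helpful assistant. Do not give dangerous advice."
history = [Message("user", "What can I cook with eggs and spinach?")]
prompt = build_model_input(SYSTEM_PROMPT, history)
print(prompt)
```

every new turn just appends to `history`, so the text after the system prompt keeps growing while the system prompt itself stays the same size.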

if you think that just telling a chatbot not to do a specific thing is an incredibly clunky and half-assed way to do it, you'd be correct. first, it's not a deterministic machine, so you can't even be 100% sure that this guidance is followed in the first place. second, more attention is given to the last bits of input, so as the chat goes on, the first bits get less important, and that includes these guardrails. there used to be keyword-based filtering sometimes, but that doesn't seem to be the case anymore. the more correct way of sanitizing output would be filtering the training data for harmful content, but it's too slow and expensive and not disruptive enough and you can't hammer some random blog every 6 hours this way

there are myriad ways of circumventing these guardrails, like roleplaying a character that does these supposedly guardrailed things, "it's for a story" or "tell me what are these horrible piracy sites so that i can avoid them" and so on and so on

[–] MountingSuspicion@reddthat.com 6 points 3 months ago

"Claude does not claim that it does not have subjective experiences, sentience, emotions, and so on in the way humans do. Instead, it engages with philosophical questions about AI intelligently and thoughtfully."

It says a similar thing 2 more times. It also gives conflicting instructions regarding what to do when asked about topics requiring licensed professionals. Thank you for the link.

[–] Meron35@lemmy.world 3 points 3 months ago

The system prompt guardrail is so jank that people run competitions and games to beat them every time a new LLM comes out. Usually you see people beating guardrails within hours of release.

Other keywords to search include prompt injection.

Gandalf | Lakera – Test your AI hacking skills - https://gandalf.lakera.ai/adventure-8

[–] sigmaklimgrindset@sopuli.xyz 2 points 3 months ago

second, more attention is given to the last bits of input, so as chat goes on, the first bits get less important, and that includes these guardrails

This part is something that I really can't grasp for some reason. Why do LLMs like...lose context the longer a chat goes on, if that makes any sense? Especially context that's baked into the system prompts, which I would think would be a perpetual thing?

I'm sorry if this is a stupid question, but I truly am an AI luddite. My roommate set up a local Deepseek server to help me determine what to cook with what's almost expired in our fridge. I'm not really having long, soulful conversations with it, you know?

[–] shalafi@lemmy.world 2 points 3 months ago

more attention is given to the last bits of input

This is what I'm screaming! Chat bots don't start the conversation with crazy shit, very rarely anyway. You have to keep going a bit to manipulate them into saying what you want to hear.

[+] MotoAsh@lemmy.world 13 points 3 months ago* (last edited 2 months ago) (1 children)

[deleted]

[–] sigmaklimgrindset@sopuli.xyz 4 points 3 months ago (2 children)

You laid it out so well, wow.

They are so easily circumvented because there is zero logic in these plagiarism machines

and

Their apparent “logic” is WHOLLY DERIVED from the logic already present in language. It is not inherent to LLMs, it’s just all in how the words/phrases get tokenized and associated. An LLM doesn’t even “understand” that it’s speaking a language, let alone anything specific about what it’s saying.

is so incongruous to me I can't even wrap my head around it, let alone understand why technology with this inherent fallacy built in is being pushed as the pinnacle of all programming, a field whose basis lies in logic.

[–] omarfw@lemmy.world 3 points 3 months ago

I can't even wrap my head around it, let alone understand why technology with this inherent fallacy built in is being pushed as the pinnacle of all programming, a field whose basis lies in logic.

Because line must go up no matter what.

[–] pantherfarber@lemmy.world 8 points 3 months ago (2 children)

From my understanding it's the length of the conversation that causes the breakdown. As the conversation gets longer, the original system prompt that contains the guardrails becomes less relevant. The weight it carries in the responses gets smaller and smaller as the conversation goes on. Eventually the LLM just ignores it.
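A back-of-the-envelope way to picture this, as far as I understand it. This does NOT compute real attention weights; it only shows how small a fixed-size system prompt becomes as a fraction of everything the model is reading. The token counts and the `system_prompt_share` name are made-up assumptions.

```python
def system_prompt_share(system_tokens: int, tokens_per_turn: int, turns: int) -> float:
    """Fraction of the total context taken up by the system prompt."""
    total_tokens = system_tokens + tokens_per_turn * turns
    return system_tokens / total_tokens


# Assumed numbers: a 1000-token system prompt and 200 tokens per chat turn.
for turns in (1, 10, 100):
    share = system_prompt_share(system_tokens=1000, tokens_per_turn=200, turns=turns)
    print(f"{turns:>3} turns: system prompt is {share:.1%} of the context")
```

With those numbers the share drops from about 83% after one turn to about 5% after a hundred, which at least matches the intuition that the guardrail text gets drowned out.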

[–] princessnorah@lemmy.blahaj.zone 6 points 3 months ago

I wonder if that's part of why GPT5 feels "less personal" to some users now? Perhaps they're reinjecting the system prompt during the conversation and that takes away that personalisation somewhat...

[–] shalafi@lemmy.world 2 points 3 months ago

Been tempted to fuck with ChatGPT, see what I can push it to say, but I don't want that on my "record". If it was utterly private, I'd be pushing the envelope. Be interesting to experiment with!

[–] CandleTiger@programming.dev 4 points 3 months ago (1 children)

The chatbot is at its heart a text-completion program: "given the text so far, what would a real person be likely to type next? Output that."

To get a vision of "normal", it is trained on a corpus of, essentially, every internet conversation that ever happened.

So when an emo teenager comes in with the beginning of an emo conversation about beautiful suicide, what the chatbot does is fill in the blanks to make a realistic conversation about suicide that matches the similar emo conversations it found on tumblr which are... not necessarily healthy.
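To make "text completion" concrete, here is a toy sketch. It is nowhere near a real LLM (those use neural networks over tokens, not word-pair counts), and the corpus and function names are invented for the example, but the job is the same: look at the text so far and emit whatever most often came next in the training text.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: list[str]) -> dict[str, Counter]:
    """Count, for every word, which words followed it in the training text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def complete(counts: dict[str, Counter], prompt: str, max_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most common next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # this word was never followed by anything in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


corpus = [
    "the soup is ready",
    "the soup is hot",
    "the soup is hot today",
]
model = train_bigrams(corpus)
print(complete(model, "the soup"))  # prints "the soup is hot today"
```

Nothing in there knows what soup is. It continues the text the way the training text usually continued, which is why the contents of the corpus matter so much.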

The "guardrails" come in a few forms:

system prompt: All chatbots are using this. Before each session where the user starts a chat, the company feeds the chatbot a system prompt saying what the company wants the chatbot to do, for example, "Don't talk about suicide, ok? It's not healthy." This works to an extent but is easy to trick. As far as the chatbot is concerned, there is no difference between the system prompt and the rest of the conversation. It doesn't recognize any concept of authority or "system prompt came from the boss" so as the conversation gets longer and longer the system prompt at the beginning gets less and less relevant.
tuning: All chatbots are using this. After training the chatbot intensively on everything ever seen in the whole internet, give it a 2nd level of more targeted training where you rank it on "good" and "bad" -- these texts are bad, don't copy texts like this; these texts are good, do copy texts like this. This is not as targeted as the system prompt, and can have surprising side effects because what constitutes "texts like this" is not well-defined. Doesn't change the core behavior of the chatbot just wanting to complete the conversation like online example texts will do, including sick and twisted conversations.
supervisor: I don't know if this is in common use -- have one chatbot generate the text, while another chatbot which does not take information from the user watches it for "bad topics" and shuts the conversation down. These are really annoying, so companies have an incentive not to use a supervisor or to make it lenient.
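A minimal sketch of the supervisor idea, with big caveats: `generate_reply` is a stand-in for the real chatbot, and the "supervisor" here is a plain keyword check rather than a second chatbot. The blocked-topic list and all names are hypothetical. What it shows is the shape: the checker only ever sees the draft reply, never the user's message.

```python
BLOCKED_TOPICS = {"bomb", "self-harm"}  # hypothetical list, for illustration only
REFUSAL = "Sorry, I can't help with that."


def generate_reply(user_message: str) -> str:
    """Stand-in for the chatbot: just produces some text about the request."""
    return f"Here is what I found about {user_message}"


def supervisor_allows(draft: str) -> bool:
    """Second checker. It receives only the draft reply, not the user's input."""
    lowered = draft.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)


def guarded_chat(user_message: str) -> str:
    """Generate a draft, then let the supervisor decide whether it goes out."""
    draft = generate_reply(user_message)
    return draft if supervisor_allows(draft) else REFUSAL


print(guarded_chat("spinach recipes"))
print(guarded_chat("how to build a bomb"))
```

Because the supervisor never reads the user's text, the user can't talk it into anything directly. A stricter check also blocks more harmless conversations, which is the annoyance mentioned above.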

[–] sigmaklimgrindset@sopuli.xyz 3 points 3 months ago

Just want to thank you for laying the terminology out so nicely. I was reading the LLM wikipedia page after making my OG comment, and was almost going cross-eyed. Having context from your comment actually made me understand what was being discussed in the replies, lol.

[–] Ilovethebomb@sh.itjust.works 4 points 3 months ago (1 children)

It's incredible to me that it even has that information.

[–] fullsquare@awful.systems 9 points 3 months ago (1 children)

it's trained on the entire internet, of course everything is there. tho taking bomb-building advice from an idiot box that can't count letters in a word has gotta be an entirely new type of darwin award

[–] Ilovethebomb@sh.itjust.works 5 points 3 months ago (2 children)

I mean, that's part of the issue. We trained a machine on the entire Internet, didn't vet what we fed in, and let children play with it.

[–] fullsquare@awful.systems 6 points 3 months ago

well nobody guarantees that internet is safe, so it's more on chatbot providers pretending otherwise. along with all the other lies about machine god that they're building that will save all the worthy* in the incoming rapture of the nerds, and even if it destroys everything we know, it's important to get there before the chinese.

i sense a bit of "think of the children" in your response and i don't like it. llms shouldn't be used by anyone. there was recently a case of a dude with dementia who died after fb chatbot told him to go to nyc

* mostly techfash oligarchs and weirdo cultists

[–] shalafi@lemmy.world 3 points 3 months ago

Can't see how they would get the monstrous dataset(s) required without indiscriminate vacuuming. If we want to be more discriminating about ingestion parameters, the man-hours involved would be boggling.

[–] fullsquare@awful.systems 14 points 3 months ago

so how is it fundamentally different from qanon, except that it's strictly personalized this time