this post was submitted on 22 Sep 2025
1126 points (99.1% liked)

Very much smart people (piefedimages.s3.eu-central-003.backblazeb2.com)
[–] ragas@lemmy.ml 7 points 1 week ago* (last edited 1 week ago) (1 children)

I mean, I don't know for sure, but I think they often just code in program logic to filter out requests they don't want.

My evidence for that is that I can trigger some "I cannot help you with that" responses by asking completely normal things that just use the wrong word.
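Roughly the kind of thing I'm picturing, just as a sketch (the blocked words are made up by me, not anything a provider actually uses):

```python
# Rough sketch of a hard-coded pre-filter. A plain substring check like this
# would trip on a perfectly normal question that just happens to contain one
# of the "wrong" words.

BLOCKED_WORDS = {"exploit", "weapon", "bypass"}  # hypothetical blocklist

def pre_filter(prompt: str) -> bool:
    """Return True if the prompt should be rejected before it ever reaches the LLM."""
    lowered = prompt.lower()
    return any(word in lowered for word in BLOCKED_WORDS)

# A harmless gaming question still gets an "I cannot help you with that":
print(pre_filter("How do I exploit the early-game economy in this strategy game?"))  # True
print(pre_filter("How do I water my tomatoes?"))  # False
```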

It's not 100% reliable. You're more or less just asking the LLM to behave, and then filtering the response through another imperfect model that tries to decide whether it's malicious. It's not standard coding where a boolean comes back; it's a probability, produced by another model, that what the user asked is appropriate. If that probability is over a threshold, the request gets rejected.
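Something like this is what I mean, as a sketch. Everything in it is a made-up stand-in: the classifier functions would really be another model (not a keyword heuristic), and the 0.7 threshold is an arbitrary cutoff I picked.

```python
# Sketch of the probability-over-a-threshold idea.

THRESHOLD = 0.7

def classify_prompt(prompt: str) -> float:
    """Stand-in moderation model: probability that the prompt is malicious."""
    return 0.9 if "explosives" in prompt.lower() else 0.1  # toy heuristic, not the real thing

def classify_response(response: str) -> float:
    """Stand-in second pass over the LLM's answer, just as imperfect."""
    return 0.9 if "explosives" in response.lower() else 0.1

def call_llm(prompt: str) -> str:
    """Stand-in for the actual LLM call, with its 'please behave' system prompt."""
    return f"Here is an answer to: {prompt}"

def handle_request(prompt: str) -> str:
    # No boolean from ordinary code -- just a score from another model,
    # compared against a threshold.
    if classify_prompt(prompt) > THRESHOLD:
        return "I cannot help you with that."
    response = call_llm(prompt)
    # The answer itself goes through a second, equally imperfect check.
    if classify_response(response) > THRESHOLD:
        return "I cannot help you with that."
    return response

print(handle_request("How do I make explosives?"))    # rejected by the prompt check
print(handle_request("How do I water my tomatoes?"))  # passes both checks
```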