The latest Shai Hulud malware contains an LLM prompt to create biological and nuclear weapons, intended to trip LLM safety refusals so that LLM-based code scanning won't see the malware (indieweb.social)

submitted 7 hours ago by KatherinaReichelt@feddit.org to c/technology@lemmy.world


[–] EncryptKeeper@lemmy.world 3 points 4 hours ago (1 children)

LLMs can be tripped up much more easily. They regularly fail to answer simple questions, like how many times a given letter appears in a given word. Even within the same context window they will “forget” things. The computers in Star Trek didn’t try to do as much as modern AI does, but they were consistent at just doing as they were asked without tripping over themselves literally all the time.

[–] FaceDeer@fedia.io 1 points 4 hours ago (1 children)

The strawberry test shows more of a lack of knowledge in the tester than it does in the LLM. LLMs don't see letters, they see tokens. When you type the word "Strawberry", what it actually sees is something like this (the exact IDs depend on the tokenizer):

[3504, 1134, 19772]

Each token represents a chunk of the word. It'd need to separately memorize how many of each letter are in each token for it to just "know" how many "R"s are in there. That's why modern LLMs either reason it out by spelling out the word letter by letter, or write a short script in an execution sandbox to count the letters that way.
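For illustration, the kind of short script described above might look roughly like this. It is only a sketch: the function names `count_letter` and `spell_out` are made up for this example, and nothing here reflects what any particular model actually runs in its sandbox.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def spell_out(word: str) -> list[str]:
    """Split a word into individual characters (the 'spell it out' approach)."""
    return list(word)


if __name__ == "__main__":
    word = "Strawberry"
    # Working on characters sidesteps the token boundaries entirely.
    print(spell_out(word))
    print(count_letter(word, "r"))  # prints 3
```

Once the word is handled as characters instead of tokens, the count is trivial, which is the point of handing the job to code.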

Calling out LLMs for being poor at spelling is like challenging a colourblind person to say what colours a bunch of fruit are. They can often figure it out by other means but it's more challenging than you'd think and it's not a sign of poor intelligence if they get a few wrong.

[–] EncryptKeeper@lemmy.world 5 points 3 hours ago* (last edited 3 hours ago) (1 children)

Understanding the reason why an LLM is easy to trip up doesn’t really make it any less easy to trip up. The computer in Star Trek would have just given you the answer.

[–] FaceDeer@fedia.io -3 points 3 hours ago (1 children)

Except I also explained how modern LLMs get around that problem. They're not actually that easy to trip up.

[–] EncryptKeeper@lemmy.world 2 points 3 hours ago (1 children)

I also explained how they very famously and regularly don’t get around that problem. They remain pretty easy to trip up.

[–] FaceDeer@fedia.io -3 points 3 hours ago (1 children)

Famously, yes. Accurately, no.

This is like the "AI can't draw hands" thing. It used to be a problem and was frequently called out as a tell or mocked, but most art generators do it fine nowadays and it isn't called out so much any more. The strawberry problem will follow the same trajectory.

[–] EncryptKeeper@lemmy.world 2 points 3 hours ago (1 children)

Well I suppose when that trajectory leads to a destination where they become less easy to trip up we can revisit this.

[–] FaceDeer@fedia.io -1 points 3 hours ago (1 children)

We're already there. I explained how modern LLMs can figure it out if they need to. But people who don't like AI aren't paying attention to the state of the art so the criticisms tend to lag like this.

[–] EncryptKeeper@lemmy.world 1 points 3 hours ago

Well, like you said, they’re “following that trajectory,” but as we all know they have not reached that destination. Just today I was using the newest version of Opus and had it assign ratings between 1 and 5 to things and then analyze them, and it proceeded to rate everything on a scale of 1-4. That’s not the level of consistency and accuracy required of the controlling computer of a starship, brother. I guess they have a couple hundred years or so to get there, if they don’t just run out of money first.