Anthropic Mythos shaping up as nothingburger (www.theregister.com)

submitted 2 weeks ago by HaraldvonBlauzahn@feddit.org to c/technology@lemmy.world


cross-posted from: https://feddit.org/post/28915273

[...]

That marketing may have outstripped reality. Early reports from Mythos preview users including AWS and Mozilla indicate that while the model is very good and very fast at finding vulnerabilities, and requires less hands-on guidance from security engineers - making it a welcome time-saver for the human teams - it has yet to eclipse human security researchers.

"So far we've found no category or complexity of vulnerability that humans can find that this model can't," Mozilla CTO Bobby Holley said, after revealing that Mythos found 271 vulnerabilities in Firefox 150. Then he added: "We also haven't seen any bugs that couldn't have been found by an elite human researcher." In other words, it's like adding an automated security researcher to your team. Not a zero-day machine that's too dangerous for the world.


MangoCats@feddit.it 7 points 2 weeks ago

In other words, it’s like adding an automated security researcher to your team. Not a zero-day machine that’s too dangerous for the world.

Missing the point? Hiring an elite human researcher isn't easy or cheap. It's beyond the means of the vast majority of people out there. A $20/month Claude Pro subscription? Not so much.

The question for me: how much better is Mythos than Opus 4.6 or 4.7, or Sonnet for that matter? Those models, and similar ones from other companies, are already being effectively leveraged by threat actors. If Mythos reduces the time × money cost of finding a new zero-day by a factor of 10 vs Opus 4.7, that's concerning. If it's a factor of 1.1 - meh... The world is going to have to learn how to deal with these things sooner rather than later, and that means the "white hats" are going to need better funding than the "black hats", along with cooperation to close the gaps they find, or the "black hats" are going to get a lot more annoying than they already are.

Nalivai@lemmy.world 2 points 1 week ago

People for some reason assume that you can pay $20 for a bot and it will do something. You need a person with a lot of experience to get something useful from this bot, and every time we actually measure, the result is that your experienced person will be quicker and better not using it at all and doing the same work themselves.
The corporate solution is to hire an inexperienced person to wrangle the bots, but that's a sure way to introduce bugs, not fix them.

MangoCats@feddit.it 1 point 1 week ago

You need a person with a lot of experience to get something useful from this bot,

Not entirely true. You get a lot more useful things from the bots when they are driven by people with a lot of experience. The problem that's coming now is a magnified version of the "skript kiddiez" from the early Google days, when inexperienced people could just find exploits on the web and copy-paste them. Today, the LLMs can actually find vulns and develop exploits for people who don't have any knowledge of the languages the exploits are written in.

every time we actually measure, the result is that your experienced person will be quicker and better not using it at all and doing the same work themselves.

From my perspective, your data is out of date. I've been tracking the "usefulness" of frontier models in accelerating development speed for experienced people over the past 2 years. Two years ago, total waste of time. One year ago - equivocal, sometimes it accelerates an implementation, sometimes not. Six months ago, it was clearly helping more than hurting in most cases, and it has only continued to improve since then.

Knowing what you are doing helps. Trusting that the LLM will help, helps - if you set out to show it's a waste of time, a waste of time it will be. Lately, treating the LLM like a consultant, just hired and likely to disappear any day, helps. Take the time to run all the formal processes, develop the requirements documentation, tests, etc. Yes, that "slows things down", but not in the long run across realistic project life cycles - even with humans doing the work. Also along those lines: keep designs modular, with modules of reasonable complexity - monolithic monster blocks of logic don't maintain well for people either. LLM implementations start falling apart when their effective context windows get exceeded (and, in truth, people do too).
