Technology

A tech news sub for communists

founded 4 years ago

A 200-Person Chinese Team Just Embarrassed Every $500 Billion AI Lab On Earth (curiousmodels.substack.com)

submitted 2 months ago by yogthos@lemmygrad.ml to c/technology@lemmygrad.ml

CriticalResist8@lemmygrad.ml 5 points 2 months ago

I would have said the same thing as the author if I'd written that, counting 'beating' as being far cheaper while delivering comparable results, and doing it under sanctions. DeepSeek makes a lot of sense for hobby projects because of the price, though I'm also hearing about professional devs ditching Claude for V4-pro. Before DeepSeek, there was no reasonable solution for agentic work at home, and you were stuck debugging in the web interface.

Mind you, speaking of benchmarks, I have no idea what these things are actually supposed to represent lol. I found V4 good at recall and memory, but when talking to it (doing research, asking clarifying questions, etc., as opposed to just having it code), I found its overall output pretty diminished, like an old GPT-3.5 "you're so right, and here's why you are". You can gloss over it, but they had found a great mix by late 3.2 imo.