OpenAI is spending $500 billion on data centers. Google has entire campuses of supercomputers. Meta hired every genius on the planet. And then a hedge fund guy from Hangzhou with 200 kids fresh out of university just casually dropped a model that beats them all — and then open sourced it with the full recipe.
Let that sink in. Not “almost as good.” Not “competitive.” Beats them — on math olympiads, on coding, on long-context retrieval — while using a fraction of the compute. And then they uploaded it to Hugging Face for free. For. Free.
i dunno where the claim that it beats all the other models comes from, when their own stats show it doing worse across the board against the other big models — but they are close
I would have said the same thing as the author if I'd written that, counting "beating" as being vastly cheaper while delivering comparable results, and doing it under sanctions. Deepseek makes a lot of sense for hobby projects because of the price, though I'm hearing about professional devs ditching Claude for V4-pro. Before Deepseek, there was no reasonable option for agentic work at home, and you were stuck debugging in the web interface.
Mind you, speaking of benchmarks, I have no idea what these things are actually supposed to represent lol. I found V4 good at recall and memory, but when talking to it (doing research, asking clarifying questions, etc., as opposed to just having it code), I found its overall output pretty diminished — like an old GPT 3.5 "you're so right - and here's why you are". You can gloss over it, but they had found a great mix by late 3.2 imo.
The model is Deepseek V4, available here.
I thought that claim wasn't quite right. Maybe the author was just cherry-picking certain stats or benchmarks.