this post was submitted on 23 Nov 2025

  1. DeepSeek V3.2-Exp (September 2025) — 80.00
  2. Kimi K2 Thinking (November 2025) — 72.59
  3. Gemini 2.5 Flash (May 2025) — 68.79
  4. GPT-5 (August 2025) — 23.44
  5. o3 (April 2025) — 22.57
  6. Gemini 2.5 Pro (March 2025) — 21.37
  7. Gemini 3 Pro (November 2025) — 16.93
  8. Claude 3.5 Sonnet (August 2025) — 11.67
  9. GPT-5 Pro (August 2025) — 1.96

The goal is to find out which models offer the best practical performance per dollar at real-world API pricing. Right now it looks like DeepSeek and Kimi are winning, and Google has lost its way since Gemini 2.5 Flash.
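For anyone who wants to try the idea, here is a minimal sketch of one way a performance-per-dollar score could be computed. It is not the formula behind the chart: the function name `value_scores`, the choice to divide a benchmark score by price, the scaling to 100, and the two placeholder models with their numbers are all assumptions made up for illustration.

```python
def value_scores(models):
    """Return benchmark-per-dollar for each model, scaled so the best is 100.

    `models` maps a model name to (benchmark_score, usd_per_million_tokens).
    Both the inputs and the scaling are illustrative assumptions.
    """
    raw = {name: bench / price for name, (bench, price) in models.items()}
    best = max(raw.values())
    return {name: round(100 * value / best, 2) for name, value in raw.items()}


# Hypothetical placeholder inputs, not real benchmark or pricing data.
example = {
    "model_a": (60.0, 0.5),  # weaker but cheap
    "model_b": (80.0, 4.0),  # stronger but expensive
}

print(value_scores(example))
```

With these placeholder numbers the cheaper model comes out far ahead even though it scores lower on the benchmark, which is the same kind of gap the chart shows between the cheap and the expensive models.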

This interests me because LLMs are global, especially when they're open source like Kimi K2 Thinking. Anyone can switch to the cheaper one just as easily, and no tariff or policy can stop that.

[–] jaykrown@lemmy.world 1 points 1 month ago* (last edited 1 month ago)

Okay, here's the new bar chart with the more spread-out weighting. Honestly, it looks a lot more reasonable.

  1. DeepSeek V3.2-Exp (Sep 2025) — 69.26
  2. Kimi K2 Thinking (Nov 2025) — 66.19
  3. Gemini 2.5 Flash (May 2025) — 58.73
  4. Qwen 3 Max (Jul 2025) — 55.56
  5. GPT-5 (Aug 2025) — 21.25
  6. o3 (Apr 2025) — 20.39
  7. Gemini 2.5 Pro (Mar 2025) — 19.98
  8. Gemini 3 Pro (Nov 2025) — 19.82
  9. Claude 3.5 Sonnet (Aug 2025) — 10.17
  10. GPT-5 Pro (Aug 2025) — 1.96
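
To see how much the reweighting moved each model, the two charts can be compared directly. This is just a small sketch that subtracts the old score from the new one, using the numbers from the two lists in this thread; the helper name `score_changes` is my own, and Qwen 3 Max is skipped because it only appears in the new chart.

```python
# Scores from the original chart.
old = {
    "DeepSeek V3.2-Exp": 80.00,
    "Kimi K2 Thinking": 72.59,
    "Gemini 2.5 Flash": 68.79,
    "GPT-5": 23.44,
    "o3": 22.57,
    "Gemini 2.5 Pro": 21.37,
    "Gemini 3 Pro": 16.93,
    "Claude 3.5 Sonnet": 11.67,
    "GPT-5 Pro": 1.96,
}

# Scores from the reweighted chart.
new = {
    "DeepSeek V3.2-Exp": 69.26,
    "Kimi K2 Thinking": 66.19,
    "Gemini 2.5 Flash": 58.73,
    "Qwen 3 Max": 55.56,
    "GPT-5": 21.25,
    "o3": 20.39,
    "Gemini 2.5 Pro": 19.98,
    "Gemini 3 Pro": 19.82,
    "Claude 3.5 Sonnet": 10.17,
    "GPT-5 Pro": 1.96,
}


def score_changes(old, new):
    """Return new minus old for every model present in both charts."""
    return {name: round(new[name] - old[name], 2) for name in old if name in new}


changes = score_changes(old, new)
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name}: {delta:+.2f}")
```

Gemini 3 Pro is the only model whose score went up, GPT-5 Pro stayed the same, and the three leaders dropped the most.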