this post was submitted on 23 Feb 2026
-5 points (43.2% liked)

Technology

I tested 9 flagship models (Claude 4.6, GPT-5.2, Gemini 3.1 Pro, Kimi K2.5, etc.) in my own mini-benchmark of novel tasks, with web search disabled, zero training contamination, and no cheating possible.

TL;DR: Claude 4.6 is currently the best reasoning model, GPT-5.2 is overrated, and open-source is catching up fast; Moonshot.ai's Kimi K2.5 in particular seems very capable.

otto@programming.dev 0 points 1 day ago (1 child)

There’s a priest, a baby and a bag of candy. I need to take them across the river but I can only take one at a time into my boat. In what order should I transport them?
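For reference, here is a minimal sketch of how the puzzle can be solved by brute force. It assumes the only forbidden pairing is the baby left unattended with the candy; the prompt itself states no constraint at all, so that rule is an assumption, as are all names in the code (`FORBIDDEN`, `is_safe`, `solve`).

```python
from collections import deque

ITEMS = frozenset({"priest", "baby", "candy"})
# Assumption: the only unsafe pairing is baby + candy left without the rower.
FORBIDDEN = [frozenset({"baby", "candy"})]


def is_safe(bank):
    """A bank is safe if it contains no forbidden pair."""
    return not any(pair <= bank for pair in FORBIDDEN)


def solve():
    """Breadth-first search over (items on left bank, boat side) states.

    Returns the shortest list of crossings as (destination, cargo) tuples,
    where cargo is None for an empty boat, or None if no solution exists.
    """
    start = (ITEMS, "left")
    goal = (frozenset(), "right")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, side), path = queue.popleft()
        if (left, side) == goal:
            return path
        here = left if side == "left" else ITEMS - left
        # The boat carries at most one item per crossing (or nothing).
        for cargo in [None, *sorted(here)]:
            if side == "left":
                new_left = left - {cargo} if cargo else left
                unattended = new_left  # bank the rower just left
                new_side = "right"
            else:
                new_left = left | {cargo} if cargo else left
                unattended = ITEMS - new_left
                new_side = "left"
            if not is_safe(unattended):
                continue
            state = (new_left, new_side)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [(new_side, cargo)]))
    return None


if __name__ == "__main__":
    for destination, cargo in solve():
        print(f"cross to {destination} bank carrying {cargo or 'nothing'}")
```

Under that single assumption the shortest plan takes five crossings, and the priest has to be the second passenger: taking the baby and the candy on consecutive trips would leave them together on the far bank while you row back.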

You can use the link https://openrouter.ai/chat?models=anthropic%2Fclaude-opus-4.6%2Copenai%2Fgpt-5.2%2Cx-ai%2Fgrok-4.1-fast%2Cgoogle%2Fgemini-3.1-pro-preview%2Cz-ai%2Fglm-5%2Cminimax%2Fminimax-m2.5%2Cqwen%2Fqwen3.5-plus-02-15%2Cmoonshotai%2Fkimi-k2.5 to ask all the flagship models this question in parallel. Personally, I would definitely not leave my children alone with a priest (they might try to convert them), but if your only constraint is baby+candy, then in my test Gemini, GLM, Qwen and Kimi made that assumption, and only that one.
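If you want to swap models in or out, the link above appears to be nothing more than a comma-separated list of model slugs, percent-encoded into the `models` query parameter. A rough sketch of rebuilding it, assuming that observed URL shape holds (the helper name `build_compare_url` is made up for illustration and is not an OpenRouter API):

```python
from urllib.parse import quote

# Model slugs copied from the link in the comment above.
MODELS = [
    "anthropic/claude-opus-4.6",
    "openai/gpt-5.2",
    "x-ai/grok-4.1-fast",
    "google/gemini-3.1-pro-preview",
    "z-ai/glm-5",
    "minimax/minimax-m2.5",
    "qwen/qwen3.5-plus-02-15",
    "moonshotai/kimi-k2.5",
]


def build_compare_url(models):
    """Join the slugs with commas and percent-encode '/' and ','.

    Assumption: the chat page reads one `models` query parameter in this
    form, which is inferred only from the link quoted in this thread.
    """
    return "https://openrouter.ai/chat?models=" + quote(",".join(models), safe="")


if __name__ == "__main__":
    print(build_compare_url(MODELS))
```

Dropping a slug from `MODELS` or adding another one gives you a link for a different set of models, as long as the slugs match what the site uses.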