[–] fox@hexbear.net 6 points 1 day ago (last edited 1 day ago)

You'd think, but efficiency gains are erased by LLMs having bigger context windows and self-referencing "thinking" or "agent" modes that massively extend token burn. There's public data out there showing that training costs are an enormous fixed cost, but inference costs quickly catch up to and then exceed them.
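
For a sense of scale, here's a back-of-the-envelope sketch in Python. Every number in it is a made-up assumption picked for illustration, not real provider data, but it shows how a one-time training spend gets overtaken by per-token inference spend once thinking/agent modes multiply the tokens per query:

```python
# Hypothetical figures only: how fast does inference spend overtake
# a fixed training cost when "thinking" modes multiply token output?

TRAINING_COST = 100e6        # assumed one-time training spend, USD
COST_PER_MTOK = 5.0          # assumed inference cost per 1M tokens, USD
TOKENS_PER_QUERY = 2_000     # assumed tokens for a plain chat answer
THINKING_MULTIPLIER = 10     # assumed token blowup from reasoning/agent mode
QUERIES_PER_DAY = 50e6       # assumed daily query volume

daily_tokens = QUERIES_PER_DAY * TOKENS_PER_QUERY * THINKING_MULTIPLIER
daily_inference_cost = daily_tokens / 1e6 * COST_PER_MTOK
days_to_exceed_training = TRAINING_COST / daily_inference_cost

print(f"daily inference spend: ${daily_inference_cost:,.0f}")
print(f"days until inference exceeds training cost: {days_to_exceed_training:.0f}")
```

With these toy numbers the daily inference bill is $5M, so the (one-time) training cost is matched in about three weeks, and the multiplier from thinking mode is doing most of the damage.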

A model that's token-efficient is a model that's pretty useless, and a model that's usable for anything is so inefficient as to have massively negative profit margins. If there were even one model out there that was cost-effective for the number of tokens burned, the provider would never shut up about it to buyers.

Wow, really? I guess context windows have been going up, but I didn't realise they were so ruinously expensive. Where can I read more about this?