I don't care how rough the estimate is, LLMs are using insane amounts of power, and the message I'm getting here is that the newest incarnation uses even more.
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
Kind of? Inefficient coding is definitely a part of it. But a large part is also just the iterative nature of how these algorithms operate. We might be able to improve that via code optimization a little bit. But without radically changing how these engines operate, it won't make a big difference.
The scope of the data being used and trained on is probably a bigger issue. Which is why there's been a push by some to move from LLMs to SLMs. We don't need the model to be cluttered with information on geology, ancient history, cooking, software development, sports trivia, etc if it's only going to be used for looking up stuff on music and musicians.
But either way, there's a big 'diminishing returns' factor to this right now that isn't being appreciated. Typical human nature: give me that tiny boost in performance regardless of the cost, because I don't have to deal with it. It's the same short-sighted shit that got us into this looming environmental crisis.
Coordinated SLM governors that can redirect queries to the appropriate SLM seem like a good solution.
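For what it's worth, here's a minimal sketch of what such a governor could look like. Everything in it is made up for illustration: the domain names, the keyword lists, and the `route` function are assumptions, and a real governor would more likely use a small classifier than keyword matching.

```python
import re

# Hypothetical domains and keywords, purely for illustration.
SLM_KEYWORDS = {
    "music": {"album", "band", "song", "musician"},
    "cooking": {"recipe", "bake", "oven", "ingredient"},
}


def route(query: str, fallback: str = "general") -> str:
    """Pick the specialised model whose keywords best match the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best, best_hits = fallback, 0
    for domain, keywords in SLM_KEYWORDS.items():
        hits = len(words & keywords)  # number of matching keywords
        if hits > best_hits:
            best, best_hits = domain, hits
    return best  # falls back to a general model when nothing matches


print(route("Which band recorded this album?"))  # music
```

The point is only that the governor itself can be tiny compared to the models it dispatches to.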
And water usage, which will also increase as fires increase and people have trouble getting access to clean water.
https://techhq.com/news/ai-water-footprint-suggests-that-large-language-models-are-thirsty/
It would only take one regulation to fix that:
Datacenters that use liquid cooling must use closed loop systems.
The reason they don't, and why they set up in the desert, is because water is incredibly cheap and the energy to cool a closed loop system is expensive. So they use evaporative open loop systems.
Unfortunately, I wonder whether it’s more expensive to set up a closed loop system or to buy lawmakers who will vote against bills requiring one. It’s a tale as old as time.
I have an extreme dislike for OpenAI, Altman, and people like him, but the reasoning behind this article is just stuff some guy has pulled from his backside. There are no facts here, it's just "I believe XYZ" with nothing to back it up.
We don't need to make up nonsense about the LLM bubble. There's plenty of valid enough criticisms as is.
By circulating a dumb figure like this, all you're doing is granting OpenAI the power to come out and say "actually, it only uses X amount of power. We're so great!", where X is a figure that on its own would seem bad, but compared to this inflated figure sounds great. Don't hand these shitty companies a marketing win.
I think AI power usage has an upside. No amount of hype can pay the light bill.
AI is either going to be the most valuable tech in history, or it's going to be a giant pile of ash that used to be VC capital.
It will not go away at this point. Too many daily users already, who use it for study, work, chatting, looking things up.
If not OpenAI, it will be another service.
Those same things were said about hundreds of other technologies that no longer exist in any meaningful sense. Current usage of a technology, which in this specific case I would argue is largely frivolous anyway, is not an accurate indicator of future usage.
Those users are not paying a sustainable price, they're using chatbots because they're kept artificially cheap to increase use rates.
Force them to pay enough to make these bots profitable and I guarantee they'll stop.
Bit of a clickbait. We can't really say it without more info.
But it's important to point out that the lab's test methodology is far from ideal.
The team measured GPT-5’s power consumption by combining two key factors: how long the model took to respond to a given request, and the estimated average power draw of the hardware running it.
What we do know is that the price went down. So this could be a strong indication the model is, in fact, more energy efficient. At least a stronger indicator than response time.
That's a terrible metric. By this measure, providers that maximize hardware (and energy) use by having a queue of requests would be seen as having more energy use.
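A rough sketch of why response time alone misleads, assuming the "response time × average power draw" method quoted above. The numbers (a 5 kW node, 10 s vs 20 s responses, 32 concurrent queries) are invented for illustration, and so is the function name.

```python
def energy_per_query_wh(response_s: float, avg_power_w: float,
                        concurrent_queries: int = 1) -> float:
    """Node power x wall-clock time, split evenly across concurrent queries."""
    if concurrent_queries < 1:
        raise ValueError("concurrent_queries must be at least 1")
    return avg_power_w * response_s / 3600 / concurrent_queries


# Hypothetical: one query alone on the node, answered in 10 s.
solo = energy_per_query_wh(10, 5000, concurrent_queries=1)

# Hypothetical: a busy provider, twice as slow but serving 32 queries at once.
batched = energy_per_query_wh(20, 5000, concurrent_queries=32)

print(f"solo: {solo:.2f} Wh, batched: {batched:.2f} Wh")
```

With these made-up figures the slower, queued setup comes out far cheaper per query, which is exactly what a time-based estimate can't see unless it knows the real batch size.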
Fucking Doc Brown could power a goddamn time machine with this many jiggawatts, fuck I hate being stuck in this timeline.
There's such a huge gap between what I read about GPT-5 online, versus the overwhelmingly disappointing results I get from it for both coding and general questions.
I'm beginning to think we're in the end stages of Dead Internet, where basically nothing you see online has any connection to reality.
People who fawn over generative AI haven't tried to use it for more than 5 seconds. I wish it could run a TTRPG for me or even just remember the details of its original prompt, but it's not even close.
And an LLM that you could run local on a flash drive will do most of what it can do.
I mean, no, not at all, but local LLMs are a less energy-reckless way to use AI.
Probably not a flash drive but you can get decent mileage out of 7b models that run on any old laptop for tasks like text generation, shortening or summarizing.
For reference, this is roughly equivalent to playing a PS5 game for 4 minutes (based on their estimate) to 10 minutes (their upper bound)
Calculation:
source https://www.ecoenergygeek.com/ps5-power-consumption/
Typical PS5 usage: 200 W
TV: 27 W - 134 W → call it 60 W
URI's estimate: 18 Wh / 260 W → 4 minutes
URI's upper bound: 48 Wh / 260 W → 10 minutes
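The arithmetic above as a small script, in case anyone wants to plug in their own numbers. The 200 W console and 60 W TV figures are the assumptions from the calculation above; the function name is just for illustration.

```python
def minutes_of_play(energy_wh: float, console_w: float = 200.0,
                    tv_w: float = 60.0) -> float:
    """Minutes a console + TV setup would run on the given amount of energy."""
    total_w = console_w + tv_w  # 260 W with the assumed figures
    return energy_wh / total_w * 60


estimate = minutes_of_play(18)  # URI's estimate
upper = minutes_of_play(48)     # URI's upper bound
print(f"{estimate:.1f} min to {upper:.1f} min")  # 4.2 min to 11.1 min
```

Strictly, the upper bound works out closer to 11 minutes than 10, but it's the same ballpark.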
How the hell are they going to sustain the expense to power that? Setting aside the environmental catastrophe that this kind of "AI" entails, they're just not very profitable.
Look at all the layoffs they've been able to implement with the mere threat that AI has taken their jobs. It's very profitable, just not in a sustainable way. But sustainability isn't the goal. Feudal state mindset in the populace is.
I don’t buy the research paper at all. Of course we have no idea what OpenAI does because they aren’t open at all, but Deepseek's published papers suggest it’s much more complex than 1 model per node… I think they recommended like a 576 GPU cluster, with a scheme to split experts.
That, and going by the really small active parameter count of gpt-oss, I bet the model is sparse as heck.
There’s no way the effective batch size is 8, it has to be waaay higher than that.
And perhaps even more importantly, the per-token cost of GPT-5's API is less than GPT-4's. That's why OpenAI was so eager to move everyone onto it, it means more profit for them.
I don’t believe API costs are tied all that closely to the actual cost to OpenAI. They seem to be selling at a loss, and they may be selling at an even greater loss to make it look like they are progressing. The second OpenAI seems like they have plateaued, their stock valuation will crash and it will be game over for them.
Of course there are comments doubting the accuracy, which by itself is valid, but they are merely doing it to defend AI. IMHO, even at a fifth of the estimates, we’re talking humongous amounts of power, all for a so-so search engine, half-arsed chatbots and dubious NSFW images mostly. And let’s not forget: the estimates may be inaccurate because they are TOO LOW. Now wouldn’t that be fun?
but they are merely doing it to defend AI.
No they're not, you can agree the research is garbage without defending AI. It literally assumes everything. GPT5 could be using eight times the power. It could be using half the power. It could be using a quadrillion times the power. Nobody knows, because they keep it secret.
Isn't this the back plot of the game, Rain World? With the slug cats and the depressed robots stuck on a decaying world when the sapient, organic species all left?
Spoilers dude.
That's alright. When they've got a generation of people who can't even hold a conversation without it, let alone do a job, that price increase will drop that energy use pretty rapidly.
This bubble needs to pop, the sooner the better.
40Wh or 18Wh which is it?
That's my old gaming PC running a game for 2min42sec-6minutes ... Roughly.
The last 6 to 12 months of open models have pretty clearly shown you can get substantially better results with the same model size, or the same results with a smaller model size. E.g. Llama 3.1 405B being basically equal to Llama 3.3 70B, or R1-0528 being substantially better than R1. The little information available about GPT-5 suggests it uses mixture of experts and dynamic routing to different models, both of which can reduce computation cost dramatically. Additionally, simplifying the model catalogue from 9ish(?) to 3, when combined with their enormous traffic, will mean higher utilization of batch runs. Fuller batches run more efficiently on a per-query basis.
Basically they can't know for sure.