this post was submitted on 05 May 2026
53 points (96.5% liked)
technology
On the road to fully automated luxury gay space communism.
Spreading Linux propaganda since 2020
- Ways to run Microsoft/Adobe and more on Linux
- The Ultimate FOSS Guide For Android
- Great libre software on Windows
- Hey you, the lib still using Chrome. Read this post!
Rules:
- 1. Obviously abide by the sitewide code of conduct. Bigotry will be met with an immediate ban
- 2. This community is about technology. Offtopic is permitted as long as it is kept in the comment sections
- 3. Although this is not /c/libre, FOSS related posting is tolerated, and even welcome in the case of effort posts
- 4. We believe technology should be liberating. As such, avoid promoting proprietary and/or bourgeois technology
- 5. Explanatory posts to correct the potential mistakes a comrade made in a post of their own are allowed, as long as they remain respectful
- 6. No crypto (Bitcoin, NFT, etc.) speculation, unless it is purely informative and not too cringe
- 7. Absolutely no tech bro shit. If you have a good opinion of Silicon Valley billionaires please manifest yourself so we can ban you.
founded 5 years ago
you are viewing a single comment's thread
It's strange to me to call out the climate costs, especially since this would have far less climate impact than calling Google's servers to use the full-sized models. LLMs aren't magically worse for the environment; it's the hardware the full-sized models run on that is incredibly power hungry. A 4 GB LLM would probably use less power than a modern video game on a person's computer, and run for less time.
I was thinking the same thing. The article claims 30,000 to 60,000 tons of CO2e emissions from sending 4 GB of data to hundreds of millions of phones. For reference, estimates for the US AI industry's emissions are between 30 and 80 million tons per year, and global emissions total tens of billions of tons. How often this gets repeated for new versions is unclear.
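For a sense of scale, here's a back-of-envelope version of that distribution estimate. Every input below is an illustrative assumption, not a figure from the article:

```python
# Back-of-envelope CO2e estimate for pushing a 4 GB model to many phones.
# All inputs are illustrative assumptions, not figures from the article.

model_size_gb = 4
phones = 300e6          # "hundreds of millions" of devices
kwh_per_gb = 0.03       # network energy intensity; contested and varies widely
kg_co2_per_kwh = 0.4    # rough global grid-average carbon intensity

energy_kwh = model_size_gb * phones * kwh_per_gb
tons_co2 = energy_kwh * kg_co2_per_kwh / 1000  # kg -> metric tons

print(f"{tons_co2:,.0f} tons CO2e")
```

With these (debatable) inputs it lands around 14,000 tons, the same order of magnitude as the article's figure; doubling the network-intensity estimate puts it inside the 30,000-60,000 range.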
As for inference, Chrome won't even use it for its biggest use case: everything done via the search bar and "AI mode" is still sent to high-parameter models on Google's servers, likely because user data is Alphabet's cash cow. The local model is only used in very niche cases.
Not magically, just technically. Within a couple of years we made computers use significantly more energy for no good reason. This shit being shoved into everything is incredibly unpopular.
I'm not sure I understand you; it sounds like you think running a small LLM locally on your computer will suddenly make it use something like 10x more power. That's not how it works. It's the servers running the full-sized models that use that much power, as each one has tens of thousands of processors running at once. And local LLMs do have uses, especially for accessibility. I use a local LLM for my Home Assistant instance so I can use voice commands, which is very helpful as a disabled person.
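For what it's worth, a minimal sketch of the kind of request a voice pipeline hands to a locally hosted model. The endpoint URL and model name are assumptions (e.g. a llama.cpp or Ollama server exposing an OpenAI-compatible chat API on the same machine), and nothing leaves the LAN:

```python
import json

# Sketch: shaping a voice-command request for a locally hosted model behind an
# OpenAI-compatible chat endpoint. Endpoint and model name are assumptions.
LOCAL_ENDPOINT = "http://127.0.0.1:11434/v1/chat/completions"  # assumed default

def build_request(transcribed_speech: str) -> str:
    """Return the JSON body a voice pipeline would POST to the local model."""
    return json.dumps({
        "model": "llama3.2:3b",  # any small local model; name is an assumption
        "messages": [
            {"role": "system",
             "content": "Turn voice commands into home-automation intents."},
            {"role": "user", "content": transcribed_speech},
        ],
    })

body = build_request("turn off the living room lights")
print(body)
```

The point of the OpenAI-compatible shape is that the same assistant code works whether the model is a 4 GB local one or a remote giant; only the endpoint changes.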
You seem to have no idea how good modern computers are at idling
What does that have to do with anything?
I think their point is that regular web browsing will use less power than web browsing with local LLM calls. Your PC running an LLM is likely gonna hit its TDP limits, while browsing will be a fraction of that. Yes, it's less power than a trillion-parameter model uses, but I think their point is it's vastly more than your non-LLM standard browsing would be.
Debatable for a 4 GB model, depending on the hardware. It's also (most likely) not constantly running, so while yes, it will use more power than not having it, whether that's a significant change in the long run depends on many factors.
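Rough numbers bear that out. A quick sketch comparing the extra energy of short on-device inference bursts against a browsing baseline; all wattages and durations are made-up illustrative values:

```python
# Rough comparison: energy of occasional local-LLM bursts vs. steady browsing.
# All wattages and durations are illustrative assumptions.

browse_watts = 40          # laptop package power while browsing
llm_watts = 120            # near-TDP draw during local inference
inference_seconds = 5      # one short on-device completion
inferences_per_hour = 10

# Extra Wh per hour from inference bursts, above the browsing baseline
extra_wh_per_hour = ((llm_watts - browse_watts)
                     * inference_seconds * inferences_per_hour / 3600)
baseline_wh_per_hour = browse_watts * 1.0  # Wh over one hour of browsing

print(f"extra from LLM: {extra_wh_per_hour:.2f} Wh "
      f"vs baseline {baseline_wh_per_hour:.0f} Wh")
```

Under these assumptions the bursts add roughly 1 Wh on top of a ~40 Wh browsing hour, a few percent; heavier use, bigger models, or weaker hardware changes the picture.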
It's an easy angle to win over the unaware general public by hitting points like climate and water usage
Dishonest, and a little silly to people who understand the math, but it is an angle.