submitted 2 months ago by git@hexbear.net to c/technology@hexbear.net
[-] KnilAdlez@hexbear.net 3 points 2 months ago

Exactly! PCs today are powerful enough to run them in decent time without acceleration too; having it would just be more efficient, ultimately saving time and energy. I would be interested in seeing how much processing power is wasted calculating what are effectively edge cases in a model's real workload. What percentage of GPT-4 queries could not be answered accurately by GPT-3 or a local LLaMA model? I'm willing to bet it's less than 10%. Terawatt-hours and hundreds of gallons of water to run a model that, for 90% of users, could be run locally.

this post was submitted on 30 Jul 2024
32 points (100.0% liked)
